Amir Adler, Ph.D. Thesis Seminar
Thursday, 26.6.2014, 11:30
Subspace clustering is the unsupervised learning problem of clustering a collection
of data samples drawn from a union of subspaces, according to their spanning subspaces.
State-of-the-art algorithms employ a self-expressive data model in order to construct
a graph of relations between data samples, and partition the graph using spectral clustering.
These algorithms provide excellent performance for small to medium data collections, however,
their polynomial complexity (in the number of data samples) prevents them from scaling up
to large data collections with possibly millions of data samples. In this talk I will present
linear complexity approaches to solve this problem that are based on sparse representations.
The core idea of the proposed approaches is to model the relations between data samples to
dictionary atoms (basis elements). The number of the atoms is fixed regardless of the number
of data samples, thus, enabling to reduce significantly the overall complexity. Subspace
clustering is obtained by two methods: the first employs a probability mixture model followed
by a maximum-likelihood decision rule; and the second utilizes a bipartite graph modeling
followed by bipartite graph clustering.