Subspace clustering is the unsupervised learning problem of clustering a collection 
of data samples drawn from a union of subspaces, according to their spanning subspaces. 
State-of-the-art algorithms employ a self-expressive data model in order to construct 
a graph of relations between data samples, and partition the graph using spectral clustering.
These algorithms provide excellent performance for small to medium data collections, however, 
their polynomial complexity (in the number of data samples) prevents them from scaling up 
to large data collections with possibly millions of data samples. In this talk I will present 
linear complexity approaches to solve this problem that are based on sparse representations. 
The core idea of the proposed approaches is to model the relations between data samples to 
dictionary atoms (basis elements). The number of the atoms is fixed regardless of the number 
of data samples, thus, enabling to reduce significantly the overall complexity. Subspace 
clustering is obtained by two methods: the first employs a probability mixture model followed 
by a maximum-likelihood decision rule; and the second utilizes a bipartite graph modeling 
followed by bipartite graph clustering.