Abstract:
Machine learning algorithms are often used to extract rules and
structure from large datasets, and have been successfully used in
fields such as machine vision, natural language processing and
computational biology. With the growing availability of text and
images on the web, and high-throughput experiments in biology,
learning algorithms are becoming a key tool for organizing, searching,
and interpreting this data.
Learning algorithms are typically required to work with very large
datasets of high dimensional signals. Thus scalability of algorithms
is a key issue that must be addressed. In this talk I will show how
the concept of convex duality can be used to design simple and
scalable algorithms, and to help understand convergence of previously
existing ones.
I will first address the problem of probabilistic inference in
multivariate distributions, and show how convex duality results in
simple convergent message passing algorithms for this
problem. Specifically, I will show that approximate inference may be
cast as a geometric program, where coordinate descent yields message
passing updates. These results resolve a long standing problem
regarding the convergence of message passing algorithms for
approximate inference.
I will next address the problem of learning classifiers from labelled
data, and present an exponentiated gradient algorithm that is also
derived using convex duality. The algorithm can be shown to converge,
and its convergence rate can be analyzed. I will conclude by showing
an application of the algorithm to a large scale natural language
parsing
task, and demonstrate that it converges significantly faster than
previous state of the art algorithms.