Time+Place: Sunday 27/05/2007 10:30 Room 337-8 Taub Bld.
Title: Testing Independence Properties of Distributions
Speaker: Ronitt Rubinfeld http://theory.lcs.mit.edu/~ronitt
Affiliation: MIT
Host: Amir Shpilka

Abstract:

Suppose you are studying the occurrence of a disease and need 
to uncover any salient statistical properties that might hold. 
For example, it would be important to know if the probability 
of contracting the disease decreases with distance of your house 
from a given nuclear plant, whether the distribution on zipcodes 
of patients is close to the distribution for another disease, or 
whether a person's likelihood of contracting it is correlated 
with their profession.  Of course, you wish to notice such trends 
given as few samples as possible.

We survey several works regarding the complexity of testing 
various global properties of distributions, when given access to only 
a few samples from the distributions.  We will focus on properties 
that give some indication on the independence of an underlying
joint distribution.  We will describe algorithms whose sample 
complexities are *sublinear* in the size of the support for many such 
testing problems.  In contrast, classical techniques, such as the 
Chi-squared test or the straightforward use of Chernoff bounds, 
have sample complexities that are at least linear in the size 
of the support of the underlying discrete probability distributions.