Ever since the work of Cook and Levin, satisfiability has been
recognized as a central problem in computational complexity. It is
widely believed to be intractable, and yet till recently even a
linear-time logarithmic-space algorithm was not ruled out.
The last few years have seen considerable progress in time-space lower
bounds for satisfiability and closely related problems on various
models of computation. This talk will describe the state-of-the-art on
deterministic, randomized, and quantum machines, survey the underlying
ideas, and present some of the arguments involved.
Consider the ``plain'' multiparty communication complexity model,
where k players $P_1,\dots,P_k$ holding inputs
$x_1,\dots,x_k\in\bit^n$ (respectively) communicate in order to
compute the value $f(x_1,\dots,x_k)$. The main lower bound
technique for such communication problems is that of
partition arguments: partition the $k$-players into two disjoint
sets of players and find a lower bound for the induced two-party
communication complexity problem. In this paper, we study the
power of the partition arguments method. Our two main results are
very different in nature:
For {\em randomized} communication complexity we show that
partition arguments may be exponentially far from the true
communication complexity. That is, we prove that there exists a
$3$-argument function $f$ whose communication complexity is
$\Omega(n)$ but partition arguments can only yield an $\Omega(\log
n)$
lower bound. The same holds for nondeterministic communication
complexity.
For the case of {\em deterministic} communication
complexity, we prove that finding significant gaps, between the
true communication complexity and the best lower bound that can be
obtained via partition arguments, implies progress on
(a
generalized version of) the ``log rank conjecture'' of
communication complexity. This explains why partition arguments
seem to yield good lower bounds in this case.
Joint work with Eyal Kushilevitz
A function $f$ is $\delta$-hard for some class of circuits if every
circuit in the class errs on a $\delta$-fraction of the inputs. For
various classes of constant depth small size circuits there are explicit
functions that are $\delta$-hard for ``small'' values of $\delta$.
For stronger circuit classes there are many ``hardness amplification
theorems'' that show how to transform any $\delta$-hard function $f$
into a ``harder'' function $g$ that is $(1/2-\epsilon)$-hard (for very
small $\epsilon$). These theorems are proved using black-box reductions
that transform a circuit that computes $g$ correctly on a $1/2+\eps$
fraction of the inputs into a circuit that computes $f$ correctly on a
$1-\delta$-fraction of the inputs.
We show that such reductions cannot have low complexity:
1. Any such reduction ``can be used'' to compute the majority function
on $1/\epsilon$ bits.
2. Any such reduction must make $\Omega(\log(1/\delta)/\eps2)$ oracle
queries.
By the Natural Proofs paradigm of Razborov and Rudich (coupled with a
cryptographic construction of Naor and Reingold) ``current techniques''
cannot prove lower bounds against circuit classes that can compute the
majority function. Loosely speaking this means that: ``current
techniques can only amplify hardness that we don't have''.
The main difficulty in the proofs is that reductions that come up in
proofs of hardness amplification theorems are allowed to be
``nonuniform'' and we develop techniques to bound such reductions.
The talk will be self contained and I will give a brief survey of the area.
Joint work with Emanuele Viola.
Joint work with Tali Kaufman.
We study the model of approximation and calculation of multivariate
low degree
polynomials over finite fields, by lower degree
polynomials.
We prove that the models are in fact equivalent for constant
degree polynomials -
if a constant degree polynomial can be
non-trivially approximated by a
lower degree polynomial, then it can be exactly
computed by
a constant number of lower degree polynomials.
An important goal in algorithms and communication protocols is
getting
the deterministic time complexity, respectively communication complexity,
closer to the best known randomized complexity.
In this work, we investigate
the case where instead of one input, the algorithm \ protocol is given multiple
\emph{independent} samples from an \emph{arbitrary unknown} distribution.
We show that in this case a strong derandomization result can be obtained by a
simple argument. Our method involved extracting randomness from
`same-source'-distributions - distributions that consist
of multiple independent
samples from the same source - that have very low min-entropy.
Such extractors
enable us to get new non-trivial \emph{implicit probe schemes} \cite{FiaNao93}.
Using a slightly more complicated argument we get similar results for product
distributions consisting of several different distributions.
Joint work with Ariel Gabizon
We consider private data analysis in the setting in which a trusted and trustworthy curator,
having obtained a large data set containing private information, releases to the public a
``sanitization'' of the data set that simultaneously protects the privacy of the individual
contributors of data and offers utility to the data analyst. We focus on the case where the
process is non-interactive; once the sanitization has been released the original data and the
curator play no further role.
Blum et al. [STOC '08] showed a remarkable result: for any collection of counting predicate queries,
the exponential mechanism of McSherry and Talwar [FOCS '07] can be used to (inefficiently) generate
a synthetic data set that maintains approximately correct fractional counts for all of the queries,
while ensuring a strong privacy guarantee.
In this work we investigate the computational complexity of such non-interactive privacy mechanisms,
mapping the boundary between feasibility and infeasibility. We show:
1. For any polynomial-size query set C and polynomial-size data universe, an efficient algorithm
for generating a privacy-preserving synthetic data-set that maintains approximate fractional counts,
even when the size of the original dataset is sub-polynomial in |C|.
2. When either the query set or the data universe are super-polynomial, assuming one-way functions exist,
there is no efficient method for releasing a privacy-preserving synthetic data-set that maintains
approximate fractional counts. This also indicates that it is unlikely that the exponential mecahnism
can, in general, be implemented efficiently.
3. Turning to the potentially easier problem of privately generating an arbitrary data structure
(not necessarily synthetic data) from which it is possible to approximate counts, there is a tight
connection between hardness of sanitization and the existence of traitor tracing schemes.
Joint work with Cynthia Dwork, Moni naor, Omer Reingold and Salil Vadhan.
Computing the Fourier transform is a basic building block used in numerous applications.
For data intensive applications, even the O(N log N) running time of the Fast Fourier Transform (FFT)
algorithm may be too slow, and sub-linear running time is necessary. Clearly, outputting the entire
Fourier transform in sub-linear time is infeasible, nevertheless, in many applications it suffices
to find only the \tau-significant Fourier transform coefficients, that is, the Fourier coefficients
whose magnitude is at least \tau-fraction (say, 1%) of the energy (ie, the sum of squared Fourier coefficients).
We call algorithms achieving the latter SFT algorithms.
In this work we present a deterministic algorithm that finds the \tau-significant Fourier coefficients
of functions f over any finite abelian group G in time polynomial in log|G|, 1/\tau and L_1(f) (for L_1(f)
denoting the sum of absolute values of the Fourier coefficients of f). Our algorithm is robust to random noise.
Our algorithm is the first deterministic and efficient (ie, polynomial in log|G|) SFT algorithm to handle
functions over any finite abelian groups, as well as the first such algorithm to handle functions over Z_N
that are neither compressible nor Fourier-sparse. Our analysis is the first to show robustness to noise in
the context of deterministic SFT algorithms.
Using our SFT algorithm we obtain (1) deterministic (universal and explicit) algorithms for sparse Fourier
approximation, compressed sensing and sketching; (2) an algorithm solving the Hidden Number Problem with
advice, with cryptographic bit security implications; and (3) an efficient decoding algorithm in the random
noise model for polynomial rate variants of Homomorphism codes and any other concentrated & recoverable codes.
The Gowers uniformity norms U_k measure a certain kind of psuedo randomness.
For example, a function f on a finite (large) dimensional vector space V over a
finite field F has small U_2 norm if and only if all its Fourier coefficients are
small - i.e it has no significant correlation with linear phase functions.
The content of the inverse conjecture for the Gowers norms is that a similar relation
exists between the U_k norms and polynomials of degree smaller than k; namely a function f
has small U_k norm if and only if it has no significant correlation with a polynomial phase
function of degree smaller than k. This conjecture plays an important role in counting
solutions to systems of linear equations in subsets of vector spaces, and in polynomiality testing.
I will describe a proof of this conjecture in the case where char(F) is greater equal k,
using tools from ergodic theory. In the low characteristic case, we show that if f has small
U_k norm then it has no significant correlation with a polynomial phase function of bounded degree c(k).
I will define all relevant notions. Joint works with V. Bergelson and T. Tao.
We consider one-round games between a classical verifier and two provers who share entanglement.
We show that when the constraints enforced by the verifier are `unique' constraints (i.e., permutations),
the value of the game can be well approximated by a semidefinite program. Essentially the only
algorithm known previously was for the special case of binary answers, as follows from the work
of Tsirelson in 1980. Among other things, our result implies that the variant of the unique games
conjecture where we allow the provers to share entanglement is false. Our proof is based on a novel
`quantum rounding technique', showing how to take a solution to an SDP and transform it to a strategy
for entangled provers.
NOTE: This talk requires absolutely no prior knowledge of quantum computation.
Joint work with Julia Kempe and Ben Toner
We show a connection between the semi-definite relaxation of unique games and their behavior under parallel repetition.
Specifically, denoting by val(G) the value of a two-prover unique game G, and by sdpval(G) the value of a natural
semi-definite program to approximate val(G), we prove that for every large enough $\ell$, if sdpval(G) is at least
$1-\delta$, then the value of the $\ell$-times parallel repeated version of G is at least $sdpval(G)^{s \ell}$
where $s=O(\log(k/\delta))$ for general unique games and $s=O(1)$ for XOR games (i.e., k=2). By a result of
Feige and Lovasz (STOC '92), our bounds are tight up to a factor of $O(\log(1/\delta))$ in the exponent.
For games where there is a significant gap between the quantities val(G) and sdpval(G), our result implies
that $val(G^{\ell})$ may be much larger than $\val(G)^{\ell}$, giving a counterexample to the strong parallel
repetition conjecture. In a recent breakthrough, Raz has showed such an example using the max-cut game on odd cycles.
Our results are based on a generalization of his techniques.
Joint work with Boaz Barak, Moritz Hardt, Anup Rao, Oded Regev and David Steurer.
We consider the following question: given a two-argument boolean
function $f$, represented as an $N\times N$ binary matrix, how
hard is to determine the (deterministic) communication complexity
of $f$?
We address two aspects of this question. On the computational side,
we prove that, under appropriate cryptographic assumptions
(such as the intractability of factoring), the deterministic
communication complexity of $f$ is hard to approximate to within
some constant. Under stronger (yet arguably reasonable) assumptions,
we obtains even stronger hardness results that match the best
known approximation.
On the analytic side, we present a family of functions for
which determining the communication complexity (or even obtaining
non-trivial lower bounds on it) imply proving circuit lower bounds
for some corresponding problems. Such connections between circuit
complexity and communication complexity were known before
(Karchmer-Wigderson 1988) only in the more involved context of
relations (search problems) and not in the context of functions
(decision problems). This result, in particular, may explain the
difficulty of analyzing the communication complexity of certain
functions such as the ``clique vs. independent-set'' family of
functions, introduced by Yannakakis (1988).
Joint work with Eyal Kushilevitz
The direct product code encodes the truth table of a function f:U-->R
by a function f^k:U^k -->R^k, which lists the value of f in all k-tuples
of inputs. We study its (approximate) proximity testing to this code in
the "list decoding" regime.
We give a 3-query tester for distance 1-exp(-k) (which is impossible
with 2 queries). We also give a 2-query tester for distance 1-1/poly(k)
for a derandomized version of this code (of polynomial rate).
We then use the same techniques to reduce soundness error in 2-query PCPs,
which gives incomparable decay to the Parallel Repetition Theorem.
Joint work with Valentine Kabanets and Russell Impagliazzo.
Locally Decodable Codes (LDC) allow one to decode any particular
symbol of the input message by making a constant number of queries
to a codeword, even if a constant fraction of the codeword is
damaged. In a recent work \cite{Yekhanin08} Yekhanin constructs a
$3$-query LDC with sub-exponential length of size
$\exp(\exp(O(\frac{\log n}{\log\log n})))$. However, this
construction requires a conjecture that there are infinitely many
Mersenne primes. In this paper we give the first unconditional constant
query LDC construction with subexponantial codeword length. In addition
our construction reduces codeword length. We give construction of
\(3\)-query LDC with codeword length $\exp(\exp(O(\sqrt{\log n \log \log n })))$.
Our construction also could be extended to higher number of queries.
In a recent breakthrough, Moshkovitz and Raz [MR] proved that NP can be
verified using two queries, sub-constant error, and nearly linear
proof length. The main technical component of their result is a new
composition theorem for certain specific 2-query PCPs. All earlier
composition theorems suffered from incurring an additional query per
composition.
We abstract their proof and prove a generic 2-query PCP composition
theorem with low error. More formally, we define a certain (natural)
type of PCP verifier and show how to compose such verifiers without
incurring an extra query. Our composition works regardless of the way
the verifiers themselves are constructed.
This composition theorem is then used to give a simpler
and modular proof of the results of [MR] mentioned above. This is done
by an iterative application of the composition theorem, using known
components (namely, the "folklore" manifold vs. point construction).
Our construction suffers from the same bottleneck as [MR] with respect
to the error probability, in that, it is inverse polynomial (not
exponential) in the alphabet size
[Joint work with Irit Dinur]
The widely held belief that BQP strictly contains BPP raises fundamental
questions: Upcoming generations of quantum computers might already be
too large to be simulated classically. Is it possible to experimentally
test that these systems perform as they should, if we cannot efficiently
compute predictions for their behavior? Vazirani has asked [Vaz07]: If
computing predictions for Quantum Mechanics requires exponential resources,
is Quantum Mechanics a falsifiable theory? In cryptographic settings, an
untrusted future company wants to sell a quantum computer or perform a
delegated quantumcomputation. Can the customer be convinced of correctness
without the ability to compare results to predictions?
To provide answers to these questions, we define Quantum Prover Interactive
Proofs (QPIP). Whereas in standard Interactive Proofs [GMR85] the prover is
computationally unbounded, here our prover is in BQP, representing a quantum
computer. The verifier models our current computational capabilities: it is a
BPP machine, with access to few qubits. Our main theorem can be roughly stated as:
"Any language in BQP has a QPIP, and moreover, a fault tolerant one" (providing a
partial answer to a challenge posted in [Aar07]). We provide two proofs. The simpler
one uses a new (possibly of independent interest) quantum authentication scheme (QAS)
based on random Clifford elements. ThisQPIP however, is not fault tolerant. Our second
protocol uses polynomial codes QAS due to Ben-Or, Crepeau, Gottesman, Hassidim, and
Smith [BOCG+ 06], combined with quantum fault tolerance and secure multiparty quantum
computation techniques. A slight modification of our constructions makes the protocol
"blind": the quantum computation and input remain unknown to the prover.
In this work we study the list-decoding size of Reed-Muller codes.
Given a received word and a distance parameter, we are interested in
bounding the size of the list of Reed-Muller codewords that are within
that distance from the received word. We provide asymptotic bounds for
the list-decoding size of Reed-Muller codes that apply to all distances.
Previous results of Gopalan, Klivans and Zuckerman apply to distances
only up to the minimum distance of the code.
Additionally, we study the weight distribution of Reed-Muller codes.
We provide bounds for the weight distribution of Reed-Muller codes that
apply to all distances. Previous results by Azumi, Kasami and Tokura apply
to distances only up to 2.5 times the minimum distance of the code.
Joint work with Tali Kaufman.
An affine disperser over a field F for sources of dimension d is a function
f: F^n -> F that is nonconstant (i.e., takes more than one value) on inputs
coming from a d-dimensional affine subspace of F^n.
Affine dispersers have been considered in the context of deterministic extraction
of randomness from structured sources of imperfect randomness, and in particular,
generalize the notion of dispersers for bit-fixing sources. Previously, explicit
constructions of affine dispersers were known for every d=\Omega(n), due to
[Barak, Kindler, Shaltiel, Sudakov and Wigderson] and [Bourgain].
(The latter in fact gives stronger objects called affine extractors).
In this work we give the first explicit affine dispersers for sublinear dimension.
Specifically, our dispersers work even when d> n^{4/5}. The main novelty in our
construction lies in the method of proof, which relies on elementary properties of
subspace polynomials. In contrast, the previous works mentioned above relied on
sum-product theorems for finite fields.
Joint work with Swastik Kopparty.
In the affine rank minimization problem, the goal is to minimize the rank of a matrix subject to a set of affine constraints. This is in general an NP-hard problem. We study a convex relaxation of this problem where the rank objective function is replaced by the gamma_2 norm. The gamma_2 norm can be viewed as a weighted version of the trace norm and can be expressed as a semidefinite program. We show that, given certain conditions on the constraint matrices, if the mimimal rank solution of the original problem over n-by-n matrices is r, via the gamma_2 norm relaxation one can find a rank O(rlog(n)) matrix which approximately satisfies the constraints. Our main motivation and application comes from communication complexity. The approximation rank of a sign matrix A is the minimum rank of a real matrix which is entrywise close to A. Approximation rank is one of the strongest lower bound techniques available for randomized and quantum communication complexity, yet can be quite difficult to evaluate in practice. We show that approximation rank and the analogous approximation version of the gamma_2 norm are polynomially related, and thus are essentially equivalent for the purposes of communication complexity. This is joint work with Adi Shraibman.
Using ideas from automata theory we design a new efficient
(deterministic) identity test for the noncommutative
polynomial identity testing problem.
More precisely, given as input a noncommutative
circuit C computing a polynomial of degree d in the noncommutative ring
F{x_1,x_2,...,x_n} with at most t monomials,
where the variables x_i are noncommuting, we give a deterministic
polynomial identity test that checks if C=0 and runs in time
polynomial in d, n, size(C), and t. Prior to our work, the only
known deterministic polynomial time identity testing algorithm in the
noncommutative model was for noncommutative formulas (Raz and Shpilka 2005).
If fact we show similar ideas can be used to design a deterministic
polynomial time interpolation algorithm for such polynomials.
Extending the automata theoretic ideas, we design a new randomized
polynomial time algorithm for noncommutative circuit computing
small degree polynomial. This algorithm is based on isolation
lemma and conceptually quite
different from the existing algorithm due to Bogdanov and Wee (2005).
Using this algorithm, we show that derandomization of a particular
instance of isolation lemma yields circuit lower bound in
noncommutative model. This is similar in flavour to the
Impagliazzo-Kabanets (2003) result for commutative case.
We further show that the derandomization of a stronger version
of an instance of isolation lemma implies that an explicit multilinear
polynomial in usual commutative model does not have subexponential
size arithmetic circuit. This result is similar in flavor to the
Manindra Agrawal's (2005) result that a black-box derandomization
of polynomial identity testing for a class of arithmetic circuit via
pseudorandom generator will show similar lower bound.
Joint work with V. Arvind and S. Srinivasan.
Locally testable codes (LTCs) are error-correcting codes for which
membership, in the code, of a given word can be tested by examining
it in very few locations. Most known constructions of locally
testable codes are linear codes, and give error-correcting codes
whose duals have (superlinearly) {\em many} small weight
codewords. Examining this feature appears to be one of the promising
approaches to proving limitation results for (i.e., upper bounds on
the rate of) LTCs.
Unfortunately till now it was not even known if LTCs need to be
non-trivially redundant, i.e., need to have {\em one} linear
dependency among its low-weight codewords. In this paper we give
the first lower bound of this form, by showing that every positive
rate constant query LTC must have linearly many redundant low-weight
codewords in its dual. We actually prove the stronger claim that the
{\em actual test itself} must use $\Omega(n)$ redundant dual
codewords (beyond the minimum number of basis elements required to
characterize the code); in other words, non-redundant (in fact, low
redundancy) local testing is impossible.
Joint work with Eli Ben-Sasson, Venkatesan Guruswami, Tali Kaufman and
Madhu Sudan.
Say we place m balls sequentially into n bins, where each ball is placed by
randomly selecting d bins and placing it in the least loaded of them. What is
the load of the maximum bin when m>>n? It is well known that when d=1 the
maximum load is m/n + \tildeO(sqrt(m/n)), whereas when d>=2 the maximum load is
m/n + loglog n. Thus when more than one bin is sampled, the gap between maximum
and average does not increase with the number of balls. We investigate a larger
family of placement schemes. We focus on the case where each time a ball is
placed, with probability half we use d=1 and with probability half we use d=2.
We show that the maximum load for this process is m/n + log n. Thus,
surprisingly, even though half the balls are thrown using a one-choice scheme,
the gap between maximum load and average load does not increase with the number
of balls. Along the way we develop new proof techniques which greatly simplify
existing proofs and show general conditions under which a placement scheme
outputs a balanced allocation.
Joint work with Kunal Talwar and Yuval Peres