A q-query Locally Decodable Code (LDC) encodes an n-bit message x as an N-bit codeword C(x), such that one can probabilistically recover any bit x_i of the message by querying only q bits of the codeword C(x), even after some constant fraction of codeword bits has been corrupted. Yekhanin gives new constructions of three-query LDCs of vastly shorter length than previous constructions. Specifically, given any Mersenne prime p = 2^t - 1, he designs three-query LDCs of length N = EXP(n^{1/t}) for every n. Based on the largest known Mersenne prime, this translates to a length of less than EXP(n^{10^{-7}}), compared to EXP(n^{1/2}) in the previous constructions. It has often been conjectured that there are infinitely many Mersenne primes. Under this conjecture, his constructions yield three-query locally decodable codes of length N = EXP(n^{1/log log n}) for infinitely many n. He also obtains analogous improvements for Private Information Retrieval (PIR) schemes. Previous families of LDCs and PIR schemes were based on the properties of low-degree multivariate polynomials over finite fields. His constructions are completely different and are obtained by constructing a large number of vectors in a small-dimensional vector space whose inner products are restricted to lie in an algebraically nice set.
We show combinatorial limitations on efficient list decoding of
Reed-Solomon codes beyond the Johnson and Guruswami-Sudan bounds. In
particular, we show that for arbitrarily large fields F_N,
for any \delta \in (0,1), and K = N^\delta:
Existence: there exists a received word that agrees with a
super-polynomial number of distinct degree K polynomials on
approximately N^{\sqrt{\delta}} points each;
Explicit: there exists a polynomial time constructible
received word that agrees with a super-polynomial number of distinct
degree K polynomials, on approximately 2^{\sqrt{\log N}} \cdot K
points each.
In both cases, our results improve upon the previous state of the
art, which was about N^\delta/\delta for the existence case and
about 2N^\delta for the explicit one. Furthermore, for \delta
close to 1 our bound approaches the Guruswami-Sudan bound (which
is \sqrt{NK}) and implies limitations on extending their efficient
RS list decoding algorithm to larger decoding radius.
Our proof method is surprisingly simple. We work with polynomials
that vanish on subspaces of an extension field viewed as a vector
space over the base field. These subspace polynomials are a subclass
of linearized polynomials that were studied by Ore in the 1930s and
by coding theorists. For us their main attraction is their sparsity
and abundance of roots, virtues that recently won them pivotal roles
in probabilistically checkable proofs of proximity and sub-linear
proof verification.
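As a toy illustration of the additivity underlying the sparsity of these polynomials (my own sketch, not from the talk): for an F_2-subspace V of GF(2^m), the subspace polynomial L_V(X) = prod_{v in V} (X - v) is a linearized polynomial, hence F_2-additive. The sketch below checks this numerically over GF(8).

```python
# Sketch (illustrative, not the paper's construction): verify that the
# subspace polynomial L_V(x) = prod_{v in V} (x - v) over GF(2^3) is
# additive: L_V(x + y) = L_V(x) + L_V(y) for all x, y.
# Field: GF(8) with modulus x^3 + x + 1 (bitmask 0b1011); elements are ints 0..7.

def gf8_mul(a, b):
    """Carry-less multiplication modulo x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:          # reduce as soon as degree reaches 3
            a ^= 0b1011
    return r

def subspace_poly(x, V):
    """Evaluate L_V(x) = prod_{v in V} (x - v); in characteristic 2, '-' is XOR."""
    r = 1
    for v in V:
        r = gf8_mul(r, x ^ v)
    return r

# V = span{1, alpha} = {0, 1, 2, 3} is an F_2-subspace of GF(8)
# (as bit-vectors: closed under XOR).
V = [0, 1, 2, 3]
additive = all(
    subspace_poly(x ^ y, V) == subspace_poly(x, V) ^ subspace_poly(y, V)
    for x in range(8) for y in range(8)
)
print(additive)  # True: L_V is a linearized polynomial
```

Additivity means L_V is supported only on monomials X^{2^i}, which is exactly the sparsity exploited above.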
This paper initiates a study of randomness extraction from low
degree sources, namely from distributions generated by
multivariate polynomials of low degree over finite fields. This
naturally generalizes previous work on extraction from affine
sources (degree 1 polynomials).
We establish some basic connections between the min-entropy of
such sources and the algebraic rank of the given polynomials
(namely the largest number of algebraically independent ones). We
proceed to give explicit rank extractors, which on any set
of n polynomials with rank k deterministically output k
algebraically independent polynomials. This construction may be of
independent interest, and works also over fields of characteristic
zero. We then use the connection above to show that these rank
extractors give deterministic condensers in the usual sense of
min-entropy. Another interesting consequence is a deterministic
condenser for polynomial size arithmetic circuits over large
fields.
Joint work with Ariel Gabizon and Avi Wigderson.
We show that unless NP \subseteq RTIME(2^{poly(log n)}), for any \eps>0 there is no polynomial-time algorithm approximating the Shortest Vector Problem (SVP) on n-dimensional lattices in the l_p norm (1 \leq p < \infty) to within a factor of 2^{(\log{n})^{1-\eps}}. This improves the previous best factor of 2^{(\log{n})^{1/2-\eps}} under the same complexity assumption, due to Khot [Khot05]. Under the stronger assumption NP \not\subseteq RSUBEXP, we obtain a hardness factor of n^{c/\log\log{n}} for some c>0.
Our proof starts with SVP instances from [Khot05] that are hard to approximate to within some constant. To boost the hardness factor we simply apply the standard tensor product of lattices. The main novel part is in the analysis, where we show that the lattices of [Khot05] behave nicely under tensorization. At the heart of the analysis is a certain matrix inequality which was first used in the context of lattices by de Shalit and Parzanchevski.
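For concreteness, the tensoring step itself is elementary: the tensor product of two lattices generated by bases B1 and B2 is generated by their Kronecker product. A minimal sketch (my own illustration; the lattices of [Khot05] are of course more involved):

```python
# Sketch (illustrative): the tensor product of lattices L(B1) and L(B2) is
# generated by the Kronecker product B1 (x) B2, so an n1-dimensional and an
# n2-dimensional lattice yield an (n1*n2)-dimensional one. Repeated tensoring
# is the hardness-boosting step described above.

def kron(B1, B2):
    """Kronecker product of two matrices given as lists of rows."""
    n1, m1 = len(B1), len(B1[0])
    n2, m2 = len(B2), len(B2[0])
    return [
        [B1[i][j] * B2[k][l] for j in range(m1) for l in range(m2)]
        for i in range(n1) for k in range(n2)
    ]

B = [[2, 1], [0, 3]]       # basis of a 2-dimensional integer lattice
B_tensor = kron(B, B)      # basis of the 4-dimensional tensored lattice
print(B_tensor)
```

Shortest vectors multiply in the easy direction (a shortest vector of each factor tensors into a short vector of the product); the hard part, and the paper's novelty, is controlling the other direction for the specific lattices at hand.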
No previous knowledge will be assumed. Joint work with Oded Regev, Tel-Aviv University.
An *interactive-PCP* (say, for membership in L) is a proof that can be verified by reading only *one* of its bits, with the help of a very short (say, poly-logarithmic size) interactive-proof. While every PCP can be viewed as an interactive-PCP, we show that for membership in some languages L, there are interactive-PCPs that are significantly shorter than the known (non-interactive) PCPs for these languages.
Our main result is that the satisfiability of a constant depth Boolean formula Phi(x_1,...,x_k) of size n (over the gates NOT, OR, AND, XOR) can be proved by an interactive-PCP of size poly(k). That is, we obtain interactive-PCPs of size polynomial in the size of the *witness*. This compares to the known (non-interactive) PCPs that are of size polynomial in the size of the instance. By reductions, the result extends to many other central NP languages.
One application of our result is that the satisfiability of a constant depth formula Phi(x_1,...,x_k) of size n (as above) can be proved by an interactive zero-knowledge proof of size poly(k) (rather than poly(n)). As before, the result extends to many other central NP languages.
This is joint work with Ran Raz.
Coloring a k-colorable graph using k colors (k > 2) is a notoriously hard problem. Considering average-case analysis allows for better results. In this work we consider the uniform distribution over k-colorable graphs with a linear number of edges. We rigorously analyze the structure of the space of proper k-colorings of a random graph from this distribution, showing that basically all proper k-colorings are clustered in one cluster, and agree on all but a small, though constant, number of vertices. We also show that a polynomial-time algorithm can find a proper k-coloring of a random k-colorable graph whp, thus asserting that most such graphs are easy.
This should be contrasted with the setting of very sparse random graphs (which are k-colorable whp), where experimental results show some regime of edge density to be difficult for many coloring heuristics. One explanation for this phenomenon, backed up by partially non-rigorous analytical tools from statistical physics, is the complicated clustering of the solution space in that regime, unlike the more "regular" structure that denser graphs possess. Thus in some sense, our result rigorously supports this explanation.
Our results also apply to the analogous 3SAT setting.
Joint work with A. Coja-Oghlan and M. Krivelevich.

Locally decodable codes (LDCs) have played a central role in many recent results in theoretical computer science. The role of finite fields, and in particular, low-degree polynomials over finite fields, in the construction of these objects is well studied. However the role of group homomorphisms in the construction of such codes is not as widely studied. Here we initiate a systematic study of local decoding of codes based on group homomorphisms. We give an efficient list decoder for the class of homomorphisms from any abelian group G to a fixed abelian group H. The running time of this algorithm is bounded by a polynomial in log|G| and an agreement parameter, where the degree of the polynomial depends on H. Central to this algorithmic result is a combinatorial result bounding the number of homomorphisms that have large agreement with any function from G to H. Our results give a new generalization of the classical work of Goldreich and Levin, and give new abstractions of the list decoder of Sudan, Trevisan and Vadhan.
Based on joint work with Elena Grigorescu and Madhu Sudan.
Several results, which turned out to be quite useful in computer science, show that a Boolean valued function f over the discrete hypercube whose Fourier representation is 'mostly' of small degree, must be 'almost' completely determined by a small number of input-coordinates.
Since one often must deal with averages of Boolean functions (which are not Boolean valued), it is natural to ask if similar phenomena occur for such functions. We'll answer this question positively, although the reasons for this answer and the parameters involved are different from the Boolean case. The main tool used is a new large-deviation lower bound for low-degree functions. If time permits, we may also discuss some related hardness-of-approximation results. Joint work with Irit Dinur, Ehud Friedgut, and Ryan O'Donnell.

In the 1950s, Hannan gave an algorithm for playing repeated games. His algorithm had nice "no regret" properties and relied on the ability of the player to compute a best response to the opponent's action. In this paper, we generalize the results to the more difficult case where a player is playing a complex game for which s/he can only compute approximate best responses. In doing so, we need to use different algorithmic ideas.
We also consider applications to NP-hard "structured" multi-armed bandit problems. An example is a traveling salesman who repeatedly must choose a route to visit a set of cities. Each period, she chooses a route and then observes the total time that it took. Her goal is to have low average time. Of course, this online repeated decision-making problem is at least as hard as the standard offline (one-period) TSP problem. We show that it is not harder: given an efficient alpha-approximation for this problem (or any linear combinatorial optimization problem), we can efficiently solve the repeated online version described above, achieving average time nearly as good as alpha OPT, where OPT is the average cost of the best single path chosen with hindsight knowledge of times on all edges during all periods.
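The Hannan-style idea underlying all of this can be sketched in a few lines (my own toy version with an exact best-response oracle; the paper's contribution is precisely handling the case where the minimization below is only approximate):

```python
# Sketch (illustrative): follow-the-perturbed-leader over a small action set.
# Each round we "best-respond" to the cumulative past losses plus a one-time
# random perturbation; this yields Hannan-style no-regret guarantees.

import random

def follow_the_perturbed_leader(loss_rows, eta=1.0, seed=0):
    """Play, each round, the action minimizing perturbed cumulative loss.
    loss_rows: list of per-round loss vectors (one entry per action, in [0,1])."""
    rng = random.Random(seed)
    n = len(loss_rows[0])
    perturb = [rng.expovariate(eta) for _ in range(n)]  # one-time noise
    cum = [0.0] * n
    total = 0.0
    for losses in loss_rows:
        choice = min(range(n), key=lambda a: cum[a] - perturb[a])  # best response
        total += losses[choice]
        cum = [c + l for c, l in zip(cum, losses)]
    return total

rounds = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0]] * 10
print(follow_the_perturbed_leader(rounds))
```

In the TSP application, each "action" is an exponentially large object (a route), so the minimization must be delegated to an alpha-approximation algorithm, which is where the new ideas come in.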
A fundamental goal of computational complexity (and foundations of cryptography) is to base average-case hardness for NP on worst-case hardness of NP. There has been a long line of research trying to explain our failure in proving such reductions. The bottom line of this research is essentially that (under plausible assumptions) non-adaptive Turing reductions cannot prove such results.
We revisit the problem. Our first observation is that the above mentioned negative arguments extend to a non-standard notion of average-case complexity, in which the distribution on the inputs with respect to which we measure the average-case complexity of the function is only samplable in super-polynomial time. The significance of this result stems from the fact that in this non-standard setting, Gutfreund et al. did show a worst-case/average-case connection. In other words, their technique gives a way to bypass the impossibility arguments. I will explain the GST result in the talk. By taking a closer look at their proof we discover that the worst-case/average-case connection is proven by a reduction that "almost" falls under the category ruled out by the negative result. This gives rise to an intriguing new notion of almost black-box reductions.
We then ask whether the positive result can be extended to the standard setting, to prove some new worst-case/average-case connection. While we cannot do that unconditionally, I will show that under a mild derandomization assumption, the worst-case hardness of NP implies the average-case hardness of NTIME(f(n)) (under the uniform distribution), where f is computable in quasi-polynomial time. My talk is based on two papers, one together with Dan Gutfreund and Ronen Shaltiel and one together with Dan.

In 1984 Levin put forward a suggestion for a theory of average-case complexity. In this theory a problem, called a distributional problem, is defined as a pair consisting of a decision problem and a probability distribution over the instances. Introducing adequate notions of "easiness on the average", simple distributions and average-case preserving reductions, Levin developed a theory analogous to the theory of NP-completeness. In particular, he showed that there exists a simple distributional problem that is complete under these reductions. But since then very few distributional problems have been shown to be complete in this sense. In this talk we show a simple sufficient condition for an NP-complete decision problem to have a distributional version that is complete under these reductions (thus showing that such a problem is hard on the average under some simple distribution). Apparently all known NP-complete decision problems meet this condition. The talk will begin with a short introduction to the theory of average-case complexity.
Program checking, program self-correcting and program self-testing were pioneered by [Blum and Kannan] and [Blum, Luby and Rubinfeld] in the mid-eighties as a new way to gain confidence in software, by considering program correctness on an input-by-input basis rather than full program verification.
In this work we prove a new composition theorem in the field of program checking, which essentially shows how to build program checkers that delegate their work to the (potentially faulty) program that is being checked, thus improving the checker's efficiency. This composition theorem enables us to address several central issues in the area of program checking and to prove a multitude of results. We highlight two of our main contributions:
* We show checkers (testers and correctors) for a large class of functions, that are {\em provably} more efficient, in terms of circuit depth, than the optimal program for the function. This is done by relating (for the first time) the complexity of checking to the complexity of computing.
* Blum et al. considered a non-standard (and weaker) notion of program checkers (testers and correctors) for functions that belong to a {\em library} of several related functions. They showed checkers, testers and correctors in this notion, for a library of matrix functions containing multiplication, inversion, rank and determinant. We show how to eliminate the need for this library, and instead we show standard checkers, testers and correctors for all of the above-mentioned matrix functions.
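The checking-versus-computing gap at the heart of this line of work is nicely captured by a classical example, Freivalds' checker for matrix multiplication (a standard illustration, not this paper's construction): verifying a claimed product takes O(n^2) time per trial, far less than any known multiplication algorithm.

```python
# Sketch (the classical Freivalds check, illustrative): verify a claimed
# product C = A*B using only matrix-vector products with random 0/1 vectors.

import random

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def freivalds_check(A, B, C, trials=20):
    """Accept iff A*(B*r) == C*r for random 0/1 vectors r.
    A wrong C is caught with probability >= 1 - 2^(-trials)."""
    n = len(A)
    for _ in range(trials):
        r = [random.randint(0, 1) for _ in range(n)]
        if matvec(A, matvec(B, r)) != matvec(C, r):
            return False
    return True

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
good = [[19, 22], [43, 50]]   # the true product A*B
bad  = [[19, 22], [43, 51]]   # one corrupted entry
print(freivalds_check(A, B, good), freivalds_check(A, B, bad))
```

Note the checker never computes A*B itself; it delegates all heavy work to cheap probes of the claimed answer, in the same spirit as the delegation-based composition theorem above.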
Joint work with Shafi Goldwasser, Alex Healy, Tali Kaufman and Guy Rothblum.
We consider the polynomial noisy interpolation problem of recovering an unknown polynomial f(X) \in F_p[X] with at most w non-zero terms and of almost arbitrary degree from approximate values of f(t) at polynomially many points t \in F_p selected uniformly at random. Our method is based on some number-theoretic tools such as bounds of exponential sums and lattice reduction algorithms. In particular it is motivated by new advances in the Waring problem.
Time permitting, we also consider a CRT version of this problem, in which an unknown integer A should be recovered from approximations to the residues A mod p for randomly selected primes p.
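For intuition, the noise-free version of the CRT problem is classical: once the product of the chosen primes exceeds A, the exact residues determine A uniquely. A toy sketch (my own; the talk's setting only provides approximations to the residues):

```python
# Sketch (illustrative, noise-free case): recover an integer A from its exact
# residues modulo several primes via the Chinese Remainder Theorem.

def crt(residues, moduli):
    """Return the unique x mod prod(moduli) with x = r_i (mod m_i),
    for pairwise coprime moduli."""
    M = 1
    for m in moduli:
        M *= m
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)   # pow(., -1, m): modular inverse (Python 3.8+)
    return x % M

A = 123456789
primes = [101, 103, 107, 109, 113]          # product exceeds A
residues = [A % p for p in primes]
print(crt(residues, primes))  # recovers A exactly
```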
Joint work with Arne Winterhof (polynomial part) and Ron Steinfeld (CRT part).
A distributed consensus algorithm allows n
processes to reach a common
decision value starting from individual inputs. An algorithm in an
asynchronous shared-memory system is wait-free if a process terminates
within a finite number of its own steps regardless of scheduling.
By the FLP result, wait-free consensus is impossible. However,
consensus
becomes solvable using randomization when a process only has to terminate
with probability 1.
Randomized consensus algorithms are typically evaluated by their total step
complexity, which is the expected total number of steps taken by all
processes.
This work proves that the total step complexity of randomized consensus is
Theta(n^{2}) in an asynchronous shared memory
system using multi-writer
multi-reader registers. The bound is achieved by improving both the
lower
and the upper bounds for this problem.
The best previously known lower bound is improved by a factor of log^{2}(n).
It is achieved through restricting attention to a set of layered executions
and using an isoperimetric inequality for analyzing their behavior.
The matching algorithm decreases the expected total step complexity by a
log(n) factor, by leveraging the multi-writing capability of the shared
registers. Its correctness proof is facilitated by viewing each execution of
the algorithm as a stochastic process and applying Kolmogorov's inequality.
(This is joint work with Hagit Attiya.)
Consider a pair of correlated random variables (X,Y), each taking
values in the set {0,1}^n. Let D be the marginal distribution of
X, and D_x the conditional distribution of Y given X=x. We
consider the following communication problem between two parties,
Alice and Bob, both of whom know the joint distribution (X,Y).
Alice will be given a random input x distributed according to D.
She needs to transmit a message to Bob so Bob can generate a value
in {0,1}^n with distribution D_x. For example, Alice can send x
across, and then Bob can pick a random value from the distribution
D_x. This protocol, however, can be wasteful. For example, if X
and Y are independent, Bob could have generated his value without
any help from Alice. Let C(X:Y) be the minimum expected number of
bits that Alice must transmit. We show that if Alice and Bob share
a random string (independent of Alice's input), then
I[X:Y] <= C(X:Y) <= I[X:Y] + O(log I[X:Y]),
where I[X:Y] = H[X] + H[Y] - H[XY] is the mutual information
between the random variables X and Y.
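For concreteness, a small sketch computing I[X:Y] for explicit joint distributions (illustrative only):

```python
# Sketch (illustrative): compute I[X:Y] = H[X] + H[Y] - H[XY], the quantity
# that characterizes Alice's expected communication up to a log term.

from math import log2

def H(dist):
    """Shannon entropy of a distribution given as a list of probabilities."""
    return -sum(p * log2(p) for p in dist if p > 0)

def mutual_information(joint):
    """joint[x][y] = Pr[X=x, Y=y]."""
    px = [sum(row) for row in joint]                 # marginal of X
    py = [sum(col) for col in zip(*joint)]           # marginal of Y
    pxy = [p for row in joint for p in row]          # joint as a flat list
    return H(px) + H(py) - H(pxy)

# Perfectly correlated bits: I[X:Y] = 1, so Alice must send about one bit.
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))   # 1.0
# Independent bits: I[X:Y] = 0, and Bob can sample on his own.
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))  # 0.0
```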
Before this work, a limit version of this result was known. Our
results are for the one-shot case, and they imply the limit versions.
An important application of our result is an improved direct sum
result in communication complexity, for which the limit versions
do not seem to help.
An essential ingredient in our proofs is a rejection sampling
procedure that relates the relative entropy between two
distributions to the communication complexity of generating one
distribution from the other.
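A textbook version of the rejection-sampling ingredient can be sketched as follows (my illustration; the protocol's actual procedure uses the shared random string to coordinate the two parties):

```python
# Sketch (illustrative): generate a sample from a target distribution Q using
# proposals from P, accepting y with probability Q(y) / (M * P(y)) where
# M >= max_y Q(y)/P(y). The number of proposals consumed is what ties the
# communication cost to the relative entropy between Q and P.

import random

def rejection_sample(P, Q, rng):
    """P, Q: dicts mapping outcomes to probabilities, support(Q) within support(P)."""
    M = max(Q[y] / P[y] for y in Q)
    while True:
        y = rng.choices(list(P), weights=list(P.values()))[0]
        if y in Q and rng.random() < Q[y] / (M * P[y]):
            return y

rng = random.Random(42)
P = {"a": 0.5, "b": 0.5}     # proposal distribution (think: a marginal)
Q = {"a": 0.9, "b": 0.1}     # target distribution (think: a conditional D_x)
draws = [rejection_sample(P, Q, rng) for _ in range(10000)]
print(draws.count("a") / len(draws))  # close to 0.9
```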
[Joint Work with Rahul Jain, David McAllester and Jaikumar Radhakrishnan]
We construct an explicit polynomial f(x_1,...,x_n), with coefficients in {0,1}, such that the size of any syntactically multilinear arithmetic circuit computing f is at least Omega( n^{4/3} / log^2(n) ). The lower bound holds over any field.
We consider the question of whether adaptivity can improve the complexity of property testing algorithms in the {\em dense graphs\/} model. Goldreich and Trevisan have shown that there can be at most a quadratic gap between adaptive and non-adaptive testers in this model, but it was not known whether any gap indeed exists. In this work we reveal such a gap.
Specifically, we focus on the well studied property of {\em bipartiteness}. Bogdanov and Trevisan proved a lower bound of $\Omega(1/\eps^2)$ on the query complexity of {\em non-adaptive\/} testing algorithms for bipartiteness, where this lower bound holds for graphs with maximum degree $O(\eps n)$. Here we describe an {\em adaptive\/} testing algorithm for bipartiteness of graphs with maximum degree $O(\eps n)$ whose query complexity is $\tilde{O}(1/\eps^{3/2})$. This demonstrates that adaptive testers are stronger than non-adaptive testers in the dense graphs model.
We note that the upper bound we obtain is tight up to polylogarithmic factors, in view of the $\Omega(1/\eps^{3/2})$ lower bound of Bogdanov and Trevisan for {\em adaptive\/} testers. In addition we show that $\tilde{O}(1/\eps^{3/2})$ queries also suffice when (almost) all vertices have degree $\Omega(\sqrt \eps \cdot n)$. In this case adaptivity is not necessary.
Using cryptographic techniques, it is possible to design a fair voting
system whose correct operation can be verified by anyone, while still
retaining ballot secrecy. Such voting schemes are called "Universally
Verifiable". If, in addition, the voting scheme prevents vote buying
and coercion, we say it is "receipt-free".
We present the first universally verifiable voting scheme that can be
based on a general assumption (existence of a non-interactive commitment
scheme). Our scheme is also the first receipt-free scheme to give ``everlasting
privacy'' for votes: even a computationally unbounded party does not gain any
information about individual votes (other than what can be inferred from the final
tally).
Our voting protocols are designed to be used in a ``traditional'' setting, in
which voters cast their ballots in a private polling booth (which we model
as an untappable channel between the voter and the tallying authority).
Following in the footsteps of Chaum and Neff, our protocol ensures that
the integrity of an election cannot be compromised
even if the computers running it are all corrupt
(although ballot secrecy may be violated in this case).
We give a generic voting protocol which we prove to be secure in the
Universal Composability model, given that the underlying commitment is
universally composable.
We also propose a concrete implementation, based on the hardness of
discrete log, that is slightly more efficient.
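To illustrate the primitive being assumed, here is a toy hash-based commitment (my own sketch; note that everlasting privacy in the actual scheme requires statistically *hiding* commitments, which this hash-based toy does not provide):

```python
# Sketch (illustrative toy commitment): commit = SHA-256(randomness || vote);
# opening reveals both so anyone can verify. Binding follows from collision
# resistance; hiding here is only heuristic/computational.

import hashlib
import secrets

def commit(vote: bytes):
    r = secrets.token_bytes(32)                      # hiding randomness
    c = hashlib.sha256(r + vote).hexdigest()
    return c, r                                      # publish c, keep (r, vote)

def verify(c: str, r: bytes, vote: bytes) -> bool:
    return hashlib.sha256(r + vote).hexdigest() == c

c, r = commit(b"candidate-A")
print(verify(c, r, b"candidate-A"), verify(c, r, b"candidate-B"))  # True False
```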
This is joint work with Moni Naor.
An urn contains n-sided dice. Each has a
particular probability
distribution on its faces. Your goal is to characterize the contents
of the urn, a probability distribution on the n-simplex. To this end
you can reach in and extract a random die; you may then roll that
die a small number of times. You can repeat this experiment, with
replacement, many times.
The limit on the number of times each die may be rolled, which we
call the sampling aperture, is the key constraint in our work. We
cannot adequately measure the properties of any single die.
We give algorithms that infer accurate estimates of urns supported
on an unknown line, using constant sampling aperture and reasonable
sample size.
This is joint work with Leonard J. Schulman and Chaitanya Swamy.
Suppose you are studying the occurrence of a
disease and need
to uncover any salient statistical properties that might hold.
For example, it would be important to know if the probability
of contracting the disease decreases with distance of your house
from a given nuclear plant, whether the distribution on zipcodes
of patients is close to the distribution for another disease, or
whether a person's likelihood of contracting it is correlated
with their profession. Of course, you wish to notice such trends
given as few samples as possible.
We survey several works regarding the complexity of testing
various global properties of distributions, when given access to only
a few samples from the distributions. We will focus on properties
that give some indication on the independence of an underlying
joint distribution. We will describe algorithms whose sample
complexities are *sublinear*
in the size of the support for many such
testing problems. In contrast, classical techniques, such as the
Chi-squared test or the straightforward use of Chernoff bounds,
have sample complexities that are at least linear in the size
of the support of the underlying discrete probability distributions.
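For reference, the classical baseline mentioned above can be sketched in a few lines (illustrative): the chi-squared statistic for independence on a contingency table, whose reliable use requires counts, and hence samples, growing with the support size.

```python
# Sketch (illustrative): the chi-squared statistic for independence,
# computed from a table of observed counts. Small values are consistent
# with independence of the two attributes; large values are not.

def chi_squared_independence(table):
    """table[i][j] = observed count of (X=i, Y=j)."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n   # expected under independence
            stat += (obs - expected) ** 2 / expected
    return stat

# Counts consistent with independence give statistic 0; correlated counts do not.
print(chi_squared_independence([[25, 25], [25, 25]]))  # 0.0
print(chi_squared_independence([[45, 5], [5, 45]]))    # 64.0
```

The sublinear testers surveyed in the talk avoid estimating every cell of such a table, which is exactly how they beat the linear-in-support sample complexity of this approach.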
Are there new quantum algorithms beyond
Factoring?
In this talk I will survey a very intriguing new direction in
quantum algorithms: Additive Approximations of #P hard problems.
We get approximations to problems related to topology (the Jones
polynomial), statistical physics (the Potts model), and more (the Tutte
polynomial..). These new algorithms are not based on the
usual Fourier transform: instead, they encode the local structure
of the problem into a beautiful pictorial object known as the Temperley
Lieb algebra. Moreover, it turns out that many of these new algorithms are
complete for quantum polynomial time!
No prior knowledge will be assumed. The talk is based on several joint
works with Arad, Eban, Jones and Landau.