Theory Seminar: On Communication Complexity of Classification Problems

Shay Moran (University of California San Diego)

Tuesday, 12.9.2017, 13:30

Room 337 Taub Bld.

We study the two-party communication and sample complexities of classification problems.

We consider a variant of Yao's classical model in which the parties may transmit to each other examples from their input-samples rather than just bits. This enables (i) to investigate the notion of sample complexity — the total number of examples sent — which is a natural complexity measure in the context of learning theory, and (ii) to study infinite hypothesis classes.

We consider both search and decision problems.

In the context of search problems we derive exponential separations between agnostic-case and realizable-case learning, and between proper learning and learning that is not necessarily proper.

In the context of decision problems we study the realizability problem, where the parties goal is to decide whether their joint input sample is realizable with respect to some fixed hypothesis class H. For example, consider the problem of convex set disjointness, in which each party receives as a finite set of points in R^d as input and their goal is to decide whether the convex hulls of their inputs intersect. This is an instance of the realizability problem when H is the class of linear classifiers in R^d. We characterize the analogues of P, NP, and coNP for realizability problems and prove that P is the intersection of NP and coNP.

Our upper bounds are based on variants of AdaBoost, and our lower bounds are based on reductions to known results in communication complexity as well as on a direct analysis of the convex set disjointness problem.

We conclude the paper with some basic open problems, which relate this model with well studied notions such as epsilon approximations and sample compression schemes.

Joint work with Daniel Kane, Roi Livni, and Amir Yehudayoff

We consider a variant of Yao's classical model in which the parties may transmit to each other examples from their input-samples rather than just bits. This enables (i) to investigate the notion of sample complexity — the total number of examples sent — which is a natural complexity measure in the context of learning theory, and (ii) to study infinite hypothesis classes.

We consider both search and decision problems.

In the context of search problems we derive exponential separations between agnostic-case and realizable-case learning, and between proper learning and learning that is not necessarily proper.

In the context of decision problems we study the realizability problem, where the parties goal is to decide whether their joint input sample is realizable with respect to some fixed hypothesis class H. For example, consider the problem of convex set disjointness, in which each party receives as a finite set of points in R^d as input and their goal is to decide whether the convex hulls of their inputs intersect. This is an instance of the realizability problem when H is the class of linear classifiers in R^d. We characterize the analogues of P, NP, and coNP for realizability problems and prove that P is the intersection of NP and coNP.

Our upper bounds are based on variants of AdaBoost, and our lower bounds are based on reductions to known results in communication complexity as well as on a direct analysis of the convex set disjointness problem.

We conclude the paper with some basic open problems, which relate this model with well studied notions such as epsilon approximations and sample compression schemes.

Joint work with Daniel Kane, Roi Livni, and Amir Yehudayoff