Aviv Yehezkel, Ph.D. Thesis Seminar
Wednesday, 28.5.2014, 12:30
Cardinality estimation algorithms receive a stream of elements that may appear in arbitrary order, with possible repetitions, and return an estimate of the number of distinct elements.
Such algorithms usually seek to minimize the required storage at the price of inaccuracy in their output.
In this talk we study the weighted generalization of the cardinality estimation problem, where each item is associated with a weight and the goal is to estimate the sum of the weights of the distinct items.
We show how to generalize every cardinality estimation algorithm that relies on extreme order statistics (min/max sketches) to a weighted version.
The proposed unified scheme uses the unweighted estimator as a black box and manipulates its input using properties of the beta distribution.
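
The idea can be illustrated with a minimal Python sketch. This is not the speaker's implementation; it is written under the following assumptions. The class name MaxSketch, the register count m, and the use of a seeded pseudo-random generator to simulate m independent uniform hash functions are all hypothetical choices made for illustration. The estimator (m - 1) / sum(-ln M_j) is one standard estimator for max sketches, chosen here only as an example of a black box. The sketch relies on two facts: the maximum of n independent Uniform(0,1) values is Beta(n, 1) distributed, and if u is Uniform(0,1) then u^(1/w) is Beta(w, 1) distributed. Feeding u^(1/w) instead of u into the same registers therefore makes the unchanged estimator return an estimate of the sum of weights. Each item is assumed to carry the same positive weight every time it appears.

```python
import math
import random


class MaxSketch:
    """m registers; register j keeps the largest hash value seen under hash j."""

    def __init__(self, m=256):
        self.m = m
        self.registers = [0.0] * m

    def _hashes(self, item):
        # Assumption: simulate m independent uniform hash functions by seeding
        # a PRNG with the item, so a repeated item always maps to the same values.
        rng = random.Random(repr(item))
        return [rng.random() for _ in range(self.m)]

    def add(self, item, weight=1.0):
        # weight == 1 gives the plain (unweighted) sketch.
        # For weight w > 0, u ** (1 / w) is Beta(w, 1) distributed, so each
        # register ends up distributed as Beta(sum of distinct weights, 1).
        for j, u in enumerate(self._hashes(item)):
            v = u ** (1.0 / weight)
            if v > self.registers[j]:
                self.registers[j] = v

    def estimate(self):
        # Unweighted estimator, used unchanged in the weighted case:
        # -ln(register) is exponential with rate n, so (m - 1) / sum is
        # an unbiased estimate of n.
        if min(self.registers) <= 0.0:
            return 0.0  # nothing has been added yet
        total = sum(-math.log(r) for r in self.registers)
        return (self.m - 1) / total


if __name__ == "__main__":
    plain, weighted = MaxSketch(), MaxSketch()
    for _ in range(2):  # every item appears twice; repetitions change nothing
        for i in range(1000):
            plain.add(f"item-{i}")
            weighted.add(f"item-{i}", weight=1 + (i % 5))
    print("estimated cardinality:", plain.estimate())
    print("estimated weight sum: ", weighted.estimate())
```

Storage is m floating-point registers regardless of stream length, and the relative error of this particular estimator shrinks roughly as 1/sqrt(m), which is the storage versus accuracy trade-off mentioned above.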