Time+Place: Thursday 23/11/2006 14:30 Room 337-8 Taub Bld.
Title: Universal Types, Trees, and Simulation of Individual Sequences
Speaker: Gadiel Seroussi http://www.msri.org/people/staff/gadiel/index.html
Affiliation: Mathematical Sciences Research Institute Berkeley, California
Host: Ronny Roth

Abstract:

We define the universal type class of a sequence $x^n$, in analogy
to the notion underlying the classical method of types. Two sequences
of the same length are said to be of the same universal (LZ) type if
and only if they yield the same set of phrases in the incremental
parsing of Ziv and Lempel (1978). It is shown that for any finite order 
$k$, the variational distance between the $k$th order empirical
probability distributions of two sequences of the same universal type
vanishes as the sequence length tends to infinity. Consequently,
for any $k$, and any  $k$th order probability assignment,
the difference between the normalized logarithms of the probabilities
assigned to two sequences of the same universal type also vanishes
asymptotically. We study the size of a universal type class, and
show that its asymptotic behavior parallels that of the conventional
counterpart, with the LZ78 code length playing the role of
the empirical entropy. We also study the number of universal types
for sequences of length $n$, which is shown to be of the form
$\exp((1+o(1))\gamma n/\log n)$ for a well characterized constant
$\gamma$. The problem is equivalent to a very natural tree enumeration
question, for which an answer was not available.  We describe efficient 
algorithms for enumerating the sequences in a universal type class, and 
for drawing a sequence from the class with uniform probability.
As an application, we consider the problem of universal simulation of
individual sequences. A sequence drawn with uniform probability from
the universal type class of $x^n$ is an optimal simulation of $x^n$ in
a well defined mathematical sense. We illustrate this fact by showing
the results of applying the proposed universal simulation scheme to
some texture images.