Technical Report CS0072

Title: A Linear Algorithm for Finding Repetitions and Its Applications in Data Compression
Authors: M.Rodeh, V.R. Pratt and S. Even
Abstract: A linear algorithm for finding the longest repetition for each position in a given string is developed. It is related to Werner's string processing algorithm [l] but simpler. The algorithm is applied to the problem of evaluating the complexity of a finite string as defined by Lempel and Zlv [2]. Using Ellas' representation of the Intesers [4.5] we produee an asymptotically optimal variable to variable coding scheme for sequential data compression based on the definition of complexity. The scheme requires unbounded memory for strings whose length grows to Inftnlty. To overcome this difficulty Zlv and Lempel [3] Invented a variable to block coding scheme that Is asymptotically optimal for all ergodic sources with a given finite entropy. We Implement their scheme by combining our repetition finder with Wetner's suffix tree construction algorithm. Other related and more practical schemes are described too.
CopyrightThe above paper is copyright by the Technion, Author(s), or others. Please contact the author(s) for more information

Remark: Any link to this technical report should be to this page (, rather than to the URL of the PDF files directly. The latter URLs may change without notice.

To the list of the CS technical reports of 1976
To the main CS technical reports page

Computer science department, Technion