Technical Report CS0086

TR#:CS0086
Class:CS
Title: A Scheme for Constructing On-line Linear Time Recognition Algorithms
Authors: Rina Cohen and Micael Cimet
PDFCS0086.pdf
Abstract: A new method for constructing 'on-line' linear time scanning algorithms is introduced. The method provides linear-time solution for some text scanning problems, such as (i) locating in a given text string the leftmost occurrence of a substring belonging to a given (possibly infinite) set of patterns L, examining no symbols beyond those in the substring, and (ii) recognizing all occurrences of patterns from L within the given text string. An algorithm which solves problem (ii) above is called a continual recognizer for L. A continual recognizer is on-line if all occurrences of patterns from L within the text string are detected on-line. The pattern set L is assumed to be prefix-free, but may be a non-regular set. Specifically, the class of prefix-free sets L for which the method applies includes all strict deterministic context-free languages as well as many non-context-free languages. For each language L in this class, a 1ineartime on-line continual recqgnizer can be constructed. The method is based on a modification of Birman and Ullman's limited backtrack top-down parsing language, called GTDPL. This is a metalanguage for specifying top~own parsing algorithms with restricted backtracking capabilities. The key construction Is a new linear-time algorithm for simulating on-line fashion a 'strictly well formed' GTDPL program. The algorithm may also be utilized for systematically combining a given set of continual linear-time recognizers for languages L1,... ,Lk into a new linear-time recognizer for any language L, which is a 'regular combination ' of L1,... ,Lk, i.e. obtained from L1,... ,Lk by applying the 'regular' operators union, concatenation and star finitely many times. This is achieved by imbedding into the GTDPL 'scheme other known types of I inear-time continual language recognizers, which are treated by the GTDPL program as subroutines.
CopyrightThe above paper is copyright by the Technion, Author(s), or others. Please contact the author(s) for more information

Remark: Any link to this technical report should be to this page (http://www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-info.cgi/1977/CS/CS0086), rather than to the URL of the PDF files directly. The latter URLs may change without notice.

To the list of the CS technical reports of 1977
To the main CS technical reports page

Computer science department, Technion
admin