Technical Report PHD-2021-09

Title: Automata over Infinite Data Domains: Learnability and Applications in Program Verification and Repair
Authors: Hadar Frenkel
Supervisors: Orna Grumberg and Sarai Sheinvald
PDFCurrently accessibly only within the Technion network
Abstract: This thesis focuses on different aspects of formal verification of systems over infinite data domains. Applications for such systems can be found in communication systems, e-commerce systems, large databases, and more. A widely used approach in program verification is model-checking. The problem of model-checking is, given a system and a specification, to determine whether the system satisfies the specification (and thus the verification succeeds); or that the system violates the specification (thus verification fails) and some error is found, witnessing the violation. We focus on the automata-theoretic approach to model-checking. In this approach, both the system and the specification are modeled as finite automata, and model-checking is then reduced to reasoning about these automata. However, real-life systems often contain infinitely many different configurations, as they refer to the data in the system, which is unbounded. In this case, the model-checking of such systems does not scale well, and may even become undecidable.

In this thesis, we use finite-state automata in order to model different types of such systems. First, we consider ongoing systems over infinite data domains, with respect to temporal specifications. An example of such a system is a server that communicates with an unknown number of clients. We propose a new automaton model that is able to capture the fragment of $\exists^*$-Variable LTL, which is an extension of LTL that allows reasoning about infinite data domains. We use this model to suggest a bounded model-checking algorithm for such systems, and we characterize decidable fragments of the logic, for which we suggest a complete model-checking~process.

Next, we consider the model of communicating systems, which is most suitable to model communication and security protocols. We exploit the partition of the system into smaller components (e.g. server and clients). We also use the finite automata representation, to suggest a modular verification and repair algorithm that is based on automata learning using the $L^*$ algorithm.

Finally, we consider symbolic automata, whose alphabet is the set of predicates over some Boolean algebra. We study the $L^*$ algorithm in the context of symbolic automata, since, as we demonstrate, it is widely used in program verification. Next, we study a different learning paradigm, namely, identification is the limit. To the best of our knowledge, this is the first time that this paradigm is considered in the context of infinite data domains. We suggest an automata learning algorithm for systems with data over the natural or real numbers, and present some complexity results regarding the learnability of symbolic automata under this paradigm.

CopyrightThe above paper is copyright by the Technion, Author(s), or others. Please contact the author(s) for more information

Remark: Any link to this technical report should be to this page (, rather than to the URL of the PDF files directly. The latter URLs may change without notice.

To the list of the PHD technical reports of 2021
To the main CS technical reports page

Computer science department, Technion