Research

A Logical Treatment of Finite Automata

Abstract

This paper, authored by Nishant Rodrigues, Mircea Sebe, Xiaohong Chen, and Grigore Rosu, presents a novel logical treatment of finite automata and regular languages within the framework of matching logic. Published at TACAS 2024 (Tools and Algorithms for the Construction and Analysis of Systems), the work provides a sound and complete axiomatization of finite words that yields shallow embeddings of both regular expressions and finite automata directly as matching logic formulae. The key innovation is that the semantics of expressions and automata are precisely captured as matching logic patterns that evaluate to the corresponding language, eliminating the need for deep encodings or external semantic machinery. This approach demonstrates that matching logic is not merely a theoretical curiosity but a practical foundation for reasoning about fundamental computational structures.

Axiomatizing Finite Words in Matching Logic

The paper begins by establishing the matching logic framework for reasoning about finite words. A matching logic theory of finite words is constructed, where the alphabet symbols $a, b, \ldots$ and the empty word $\epsilon$ serve as the basic building blocks. The word structure is axiomatized using matching logic patterns, where concatenation of symbols is represented through the application construct of matching logic. The axiomatization includes axioms ensuring that words are built from a finite alphabet, that the empty word $\epsilon$ is a unit for concatenation ( $\epsilon \cdot w = w = w \cdot \epsilon$ ), and that distinct symbols produce distinct words. Crucially, the axiom of no junk ( $\forall x.\, x = \epsilon \lor \bigvee_{a \in \Sigma} \exists y.\, x = a \cdot y$ ) ensures that every element in the model is a finite word, and the axiom of no confusion ( $a \cdot w_1 = b \cdot w_2 \to a = b \wedge w_1 = w_2$ ) ensures that distinct constructors produce distinct elements. These axioms together characterize the free algebra of finite words up to isomorphism, providing the semantic foundation for all subsequent results.

Regular Expressions as Matching Logic Patterns

Regular expressions are matching logic formulae as is, requiring no encoding or translation. This shallow embedding preserves the compositional structure of regular expressions while giving them precise logical semantics.

A unique feature of the axiomatization is its treatment of regular expressions as matching logic patterns. The empty set corresponds to $\bot$ (bottom), the empty word $\epsilon$ corresponds to its matching logic symbol, and singleton letters $a$ correspond to their respective symbols. Union of regular expressions corresponds to disjunction ( $\lor$ ), concatenation of regular expressions $r_1 \cdot r_2$ corresponds to the matching logic pattern expressing the set of all words that can be split into a prefix matching $r_1$ and a suffix matching $r_2$ , and the Kleene star $r^*$ is captured using the least fixpoint operator $\mu X . (\epsilon \lor r \cdot X)$ . This direct correspondence means that no translation or compilation step is needed: a regular expression is already a well-formed matching logic formula whose semantics is exactly the language it denotes. The expressiveness of matching logic, particularly its fixpoint operators, is essential for capturing the Kleene star without resorting to external recursive definitions.

Finite Automata as Matching Logic Formulae

The paper provides an equally elegant treatment of finite automata as matching logic formulae. A finite automaton with states $q_0, q_1, \ldots, q_n$ , transitions $\delta$ , and accepting states $F$ is represented as a matching logic pattern where the computational aspects of the automaton are captured as syntactic features. Specifically, the transition function is encoded using implications and disjunctions: for each state $q$ and symbol $a$ , the transition $\delta(q, a) = q'$ becomes a matching logic axiom. The accepting condition is expressed as a disjunction over accepting states, and the language $L(q)$ of the automaton from a given state $q$ is defined as a greatest fixpoint $\nu X.\, (\mathsf{accept}(q) \vee \bigvee_{a \in \Sigma} a \cdot X[\delta(q,a)/q])$ , capturing all words accepted starting from $q$ . This structural analog representation means that the automaton's behavior is fully determined by its matching logic encoding, and reasoning about the automaton's language reduces to reasoning in matching logic.

Brzozowski's Procedure as Matching Logic Proofs

The central technical result of the paper is the demonstration that Brzozowski's procedure for checking equivalence of regular expressions and automata corresponds to proofs in matching logic. Brzozowski's algorithm computes derivatives of regular expressions with respect to input symbols: the derivative of a regular expression $r$ with respect to a symbol $a$ , written $\partial_a(r)$ , is the regular expression denoting the set $\{w \mid aw \in L(r)\}$ . The algorithm checks equivalence by iteratively computing derivatives and checking whether the resulting expressions are equivalent, terminating when no new derivative classes are found. The paper shows that each step of this derivative computation corresponds to a valid proof step in the matching logic proof system, and that the termination of Brzozowski's procedure corresponds to a complete proof of equivalence in matching logic.

$\partial_a(r_1 \cdot r_2) = \begin{cases} \partial_a(r_1) \cdot r_2 \lor \partial_a(r_2) & \text{if } \epsilon \in L(r_1) \\ \partial_a(r_1) \cdot r_2 & \text{otherwise} \end{cases}$

Soundness and Completeness

The soundness of the axiomatization is established by showing that all axioms are valid in the intended model of finite words, and that the proof rules (including modus ponens: from $\varphi$ and $\varphi \to \psi$ , derive $\psi$ ; and the Knaster-Tarski rule: from $\varphi[\psi/X] \to \psi$ , derive $\mu X.\, \varphi \to \psi$ ) preserve validity. The completeness result is more subtle and constitutes the main technical contribution: any equivalence between regular expressions or automata that holds semantically (i.e., they denote the same language) can be proved within the matching logic proof system using the finite word axiomatization. The completeness proof proceeds by showing that the derivative-based equivalence checking procedure of Brzozowski can be systematically translated into a matching logic proof. Since Brzozowski's procedure is known to be complete for regular language equivalence, this translation yields completeness of the matching logic axiomatization for reasoning about regular languages.

From Computational Procedures to Machine-Checkable Proofs

The paper proposes this correspondence between Brzozowski's procedure and matching logic proofs as a general methodology for producing machine-checkable formal proofs. The key insight is that by capturing structural analogs of computational artifacts in logic, computational procedures for deciding properties of those artifacts can be lifted to proof procedures in the logic. This methodology extends beyond regular expressions and finite automata: any computational artifact whose semantics can be captured as a matching logic pattern, and whose properties can be decided by a procedure that operates on structural features, can benefit from this approach. The resulting proofs are machine-checkable, providing strong correctness guarantees that go beyond testing or informal reasoning.

Implications for Formal Verification and Proof Certificates

The practical implications of this work are significant for formal verification and verified computing. Since matching logic serves as the logical foundation of the K framework, the results in this paper mean that properties of regular languages and finite automata can be proved within the same framework used for specifying and verifying programming languages. Furthermore, the machine-checkable proofs generated by the methodology can serve as proof certificates that can be independently verified by a trusted proof checker. In the context of Pi2 Labs's Proof of Proof technology, these proof certificates can be further compiled into succinct cryptographic proofs, enabling zero-knowledge verification of properties about regular languages and automata. The paper thus demonstrates that matching logic's unifying power extends from programming language semantics to classical automata theory.

Conclusion

In conclusion, this paper makes a compelling case that matching logic provides a natural and powerful framework for reasoning about regular languages and finite automata. The shallow embeddings of regular expressions and automata, the sound and complete axiomatization, and the correspondence with Brzozowski's procedure together constitute a unified logical treatment that is both theoretically elegant and practically useful. The work reinforces the thesis that matching logic can serve as a universal logical foundation, capable of capturing not only programming language semantics but also the classical theory of computation. By demonstrating that well-known computational procedures correspond to logical proofs, the paper opens the door to generating verified, machine-checkable proofs for a wide class of computational properties.

Stay Informed

Learn about company and product updates, upcoming events, rewards, and more.