A General Approach to Define Binders Using Matching Logic

Abstract

This paper, authored by Xiaohong Chen and Grigore Rosu and published at ICFP 2020, proposes a general approach to defining binders using matching logic. In programming languages and logical systems, binders are constructs that introduce bound variables; examples include $\lambda x.\, e$ in the $\lambda$-calculus, $\forall X.\, T$ in System F, and $\nu x.\, P$ in the $\pi$-calculus. Correctly capturing the binding behavior of such constructs (including $\alpha$-equivalence, capture-avoiding substitution, and freshness) has been a longstanding challenge in formal semantics. The key insight of this paper is that matching logic already contains a built-in binder, the existential quantifier $\exists x.\, \varphi$, and that object-level binders can be defined so that their binding behavior is directly inherited from this built-in binder. The authors demonstrate that binders in the $\lambda$-calculus, System F, the $\pi$-calculus, and pure type systems can all be defined axiomatically in matching logic as notations and logical theories. They prove conservative extension theorems establishing that a sequent or judgment is provable in the original system if and only if it is provable in the corresponding matching logic theory, so the move to matching logic neither loses nor adds provable judgments.

Introduction

The problem of formally defining binders is as old as the $\lambda$-calculus itself. When Church introduced the $\lambda$-calculus in the 1930s, he implicitly used conventions about bound variables, $\alpha$-equivalence (renaming of bound variables), and capture-avoiding substitution that have since been made rigorous in numerous ways. The challenge is that naive syntactic representations of binders lead to well-known problems: name clashes during substitution, the need for $\alpha$-renaming, and the distinction between free and bound variables. Over the decades, several approaches have been developed to handle binders formally, including de Bruijn indices (which replace named variables with numeric indices), higher-order abstract syntax (HOAS) (which uses the meta-language's binding to represent object-level binding), nominal logic (which introduces a theory of names and swapping), and locally nameless representations (which use de Bruijn indices for bound variables and names for free variables).
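To make the index-based representations concrete, here is a minimal Python sketch of the conversion from named terms to indices. The tuple encoding and the function name `to_de_bruijn` are assumptions made for this illustration and do not come from the paper. Because free variables keep their names, the result is closer to the locally nameless style than to pure de Bruijn notation.

```python
# Hypothetical encoding of named lambda terms, for illustration only:
#   ("var", x) | ("lam", x, body) | ("app", fun, arg)

def to_de_bruijn(term, bound=()):
    """Replace bound names by indices; `bound` lists binders, innermost first."""
    tag = term[0]
    if tag == "var":
        name = term[1]
        if name in bound:
            # Index = number of binders between the variable and its binder.
            return ("idx", bound.index(name))
        return ("free", name)  # free variables keep their names
    if tag == "lam":
        return ("lam", to_de_bruijn(term[2], (term[1],) + bound))
    if tag == "app":
        return ("app", to_de_bruijn(term[1], bound), to_de_bruijn(term[2], bound))
    raise ValueError(f"unknown term: {term!r}")

k1 = ("lam", "x", ("lam", "y", ("var", "x")))  # \x. \y. x
k2 = ("lam", "a", ("lam", "b", ("var", "a")))  # \a. \b. a
print(to_de_bruijn(k1))
print(to_de_bruijn(k1) == to_de_bruijn(k2))
```

The two $\alpha$-equivalent terms convert to the same index-based term, which is the property that makes indices attractive, at the cost of the readability of named variables.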

Each of these approaches has trade-offs. De Bruijn indices eliminate $\alpha$-equivalence issues but produce terms that are hard for humans to read. HOAS provides an elegant encoding but limits the ability to inspect the structure of binders. Nominal logic offers a principled treatment of freshness and $\alpha$-equivalence but introduces new logical machinery (name swapping, the freshness relation $\#$, and the $\mathbf{N}$ quantifier). The contribution of this paper is to show that matching logic offers a fundamentally different and arguably more uniform approach: binders need not be added as new primitive constructs; instead, they can be defined as notations whose binding behavior is inherited from matching logic's built-in existential quantifier $\exists x.\, \varphi$. This approach requires no new logical infrastructure beyond what matching logic already provides, making it attractive for both theoretical foundations and practical implementations.

Background on Matching Logic

Matching logic is a logic for specifying and reasoning about structure. Its formulas, called patterns, are built from variables, symbols (drawn from a signature $\Sigma$), and logical connectives including the existential quantifier $\exists x.\, \varphi$. A pattern is interpreted over a carrier set $M$ (the domain of the model), and its semantics is a subset of $M$: the set of elements that "match" the pattern. This is a departure from classical first-order logic, where formulas evaluate to truth values. In matching logic, the pattern $x$ denotes the singleton set $\{a\}$ when $x$ is assigned the value $a$, and a symbol application $\sigma(\varphi_1, \ldots, \varphi_n)$ denotes the set $\sigma^M(\llbracket \varphi_1 \rrbracket, \ldots, \llbracket \varphi_n \rrbracket)$, where $\sigma^M$ is the interpretation of $\sigma$ extended pointwise from elements to sets of elements.
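The set-valued reading of symbol application can be illustrated with a small Python sketch. The four-element carrier set, the symbol `plus`, and the helper names `extend` and `var` are all hypothetical choices made for this example. They are not part of the paper.

```python
from itertools import product

# Hypothetical finite model: carrier set M = {0, 1, 2, 3} and one binary
# symbol "plus", interpreted as addition modulo 4. An interpretation maps a
# tuple of elements to a SET of elements.
M = frozenset(range(4))

def plus_M(a, b):
    return {(a + b) % 4}

def extend(sigma_M):
    """Lift an element-level interpretation to sets of elements (pointwise)."""
    def on_sets(*arg_sets):
        result = set()
        for args in product(*arg_sets):
            result |= sigma_M(*args)
        return frozenset(result)
    return on_sets

def var(x, rho):
    """A variable pattern denotes the singleton of its assigned value."""
    return frozenset({rho[x]})

rho = {"x": 1, "y": 2}
plus = extend(plus_M)
print(plus(var("x", rho), var("y", rho)))  # the singleton {3}
print(plus(var("x", rho), M) == M)         # 1 + a ranges over all of M
print(plus(frozenset(), M))                # an empty argument gives the empty set
```

The last line shows why an application with an empty argument is itself empty: the pointwise union ranges over no argument tuples at all.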

The key logical connectives include $\neg \varphi$ (complement), $\varphi_1 \wedge \varphi_2$ (intersection), $\varphi_1 \vee \varphi_2$ (union), and $\exists x.\, \varphi$ (existential quantification, which takes the union over all valuations of $x$). The pattern $\bot$ denotes the empty set and $\top$ denotes the full carrier set. Crucially, the existential quantifier $\exists x.\, \varphi$ is a binder: the variable $x$ is bound in $\varphi$, and all the standard properties of binding (such as $\alpha$-equivalence and capture-avoiding substitution) hold for $\exists x.\, \varphi$ by the definition of matching logic. Definedness $\lceil \varphi \rceil$ is the application of a unary symbol $\lceil \cdot \rceil$ axiomatized by $\lceil x \rceil$; it evaluates to $\top$ if $\varphi$ is non-empty, and to $\bot$ otherwise. Equality is then defined as $\varphi_1 = \varphi_2 \equiv \neg(\lceil \varphi_1 \wedge \neg \varphi_2 \rceil \vee \lceil \neg \varphi_1 \wedge \varphi_2 \rceil)$, or equivalently $\neg \lceil \neg(\varphi_1 \leftrightarrow \varphi_2) \rceil$, which evaluates to $\top$ exactly when the two patterns denote the same set. The logic has a Hilbert-style proof system that is sound, and complete for theories that include definedness, with rules including modus ponens (from $\varphi$ and $\varphi \to \psi$, derive $\psi$), universal generalization (from $\varphi$, derive $\forall x.\, \varphi$), and the Propagation axiom ($C[\bot] \to \bot$ for any application context $C$).
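The semantics of the connectives, of $\exists$, and of definedness and equality can be sketched as a small evaluator over a finite carrier set. This is only an illustration under stated assumptions: the tuple encoding of patterns, the three-element carrier set, and the function names `ev`, `iff`, and `eq` are hypothetical. Equality is encoded as $\neg \lceil \neg(\varphi_1 \leftrightarrow \varphi_2) \rceil$.

```python
# Hypothetical pattern encoding, for illustration only:
#   ("var", x) | ("bot",) | ("not", p) | ("and", p, q) | ("or", p, q)
#   ("exists", x, p) | ("def", p)
M = frozenset({0, 1, 2})

def ev(p, rho):
    """Return the subset of M matched by pattern p under valuation rho."""
    tag = p[0]
    if tag == "var":
        return frozenset({rho[p[1]]})
    if tag == "bot":
        return frozenset()
    if tag == "not":
        return M - ev(p[1], rho)
    if tag == "and":
        return ev(p[1], rho) & ev(p[2], rho)
    if tag == "or":
        return ev(p[1], rho) | ev(p[2], rho)
    if tag == "exists":
        # Union over every value the bound variable can take.
        out = frozenset()
        for a in M:
            out |= ev(p[2], {**rho, p[1]: a})
        return out
    if tag == "def":
        # Definedness: the full set if the argument is non-empty, else empty.
        return M if ev(p[1], rho) else frozenset()
    raise ValueError(f"unknown pattern: {p!r}")

def iff(p, q):
    return ("or", ("and", p, q), ("and", ("not", p), ("not", q)))

def eq(p, q):
    # phi1 = phi2  is encoded as  not ceil(not (phi1 <-> phi2))
    return ("not", ("def", ("not", iff(p, q))))

x, y = ("var", "x"), ("var", "y")
print(ev(("exists", "x", x), {}) == M)                # True
print(ev(eq(x, y), {"x": 0, "y": 0}) == M)            # True
print(ev(eq(x, y), {"x": 0, "y": 1}) == frozenset())  # True
```

Note that an equality pattern is two-valued: it denotes either the whole carrier set or the empty set, which is how predicates are recovered in a logic whose formulas otherwise denote arbitrary sets.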

Defining Binders as Matching Logic Notations

The central technical contribution of the paper is a method for defining object-level binders as notations in matching logic. The idea is to express a binder $B x.\, t$ (where $B$ is the binding construct, $x$ is the bound variable, and $t$ is the body) as a matching logic pattern that uses the existential quantifier $\exists$ to capture the binding of $x$. Specifically, for the $\lambda$-calculus, the $\lambda$-abstraction $\lambda x.\, e$ is defined as a matching logic notation:

$\lambda x.\, e \;\equiv\; \mathsf{lam}(\exists x.\, (\mathsf{pair}(x, e)))$

Here, $\mathsf{lam}$ and $\mathsf{pair}$ are symbols in the matching logic signature. The existential quantifier $\exists x$ in $\exists x.\, (\mathsf{pair}(x, e))$ binds $x$ in the pair $(x, e)$, and since the binding behavior of $\exists$ is well-defined in matching logic, the binding behavior of $\lambda x.\, e$ is automatically inherited. Semantically, the pattern $\exists x.\, (\mathsf{pair}(x, e))$ is the union of $\mathsf{pair}(x, e)$ over all values of $x$, so it does not depend on the name chosen for the bound variable, which is exactly what $\alpha$-equivalence requires. This definition ensures that $\lambda x.\, e$ and $\lambda y.\, e[y/x]$ are logically equivalent in matching logic whenever $y$ is fresh, mirroring the standard $\alpha$-equivalence of the $\lambda$-calculus.
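The way $\lambda$ inherits $\alpha$-equivalence from $\exists$ can be sketched syntactically. In the Python sketch below, `lam` is pure notation that expands to a pattern containing `exists`, and the only binder the comparison function knows about is `exists`. The tuple encoding and all function names are assumptions made for illustration. The check is a syntactic stand-in for the logical equivalence that the paper establishes inside matching logic.

```python
# Hypothetical pattern encoding, for illustration only:
#   ("var", name) | ("app", symbol, [args]) | ("exists", name, body)
def var(n):
    return ("var", n)

def app(sym, *args):
    return ("app", sym, list(args))

def exists(n, body):
    return ("exists", n, body)

def lam(x, e):
    """Notation from the text: lambda x. e  ==  lam(exists x. pair(x, e))."""
    return app("lam", exists(x, app("pair", var(x), e)))

def alpha_eq(p, q, env_p=None, env_q=None, depth=0):
    """Compare two patterns up to renaming of exists-bound variables."""
    env_p, env_q = env_p or {}, env_q or {}
    if p[0] != q[0]:
        return False
    if p[0] == "var":
        bp, bq = env_p.get(p[1]), env_q.get(q[1])
        if bp is None and bq is None:
            return p[1] == q[1]  # both free: the names must agree
        return bp == bq          # both bound: must point at the same binder
    if p[0] == "app":
        return (p[1] == q[1] and len(p[2]) == len(q[2])
                and all(alpha_eq(a, b, env_p, env_q, depth)
                        for a, b in zip(p[2], q[2])))
    # "exists": record the binder depth for each side's bound name.
    return alpha_eq(p[2], q[2],
                    {**env_p, p[1]: depth}, {**env_q, q[1]: depth},
                    depth + 1)

t1 = lam("x", app("apply", var("x"), var("z")))  # \x. x z
t2 = lam("y", app("apply", var("y"), var("z")))  # \y. y z  (y is fresh)
t3 = lam("z", app("apply", var("z"), var("z")))  # \z. z z  (captures z)
print(alpha_eq(t1, t2))  # True
print(alpha_eq(t1, t3))  # False
```

The third term shows why the freshness side condition matters: renaming the bound variable to a name that already occurs free in the body changes the meaning, and the comparison rejects it.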

The authors demonstrate that this approach scales to more complex binding constructs. For System F, the type-level universal quantifier $\forall X.\, T$ is similarly defined by using $\exists$ to bind the type variable $X$. For the $\pi$-calculus, the restriction operator $\nu x.\, P$ uses $\exists$ to bind the channel name $x$. For pure type systems, which generalize many typed $\lambda$-calculi, the dependent product type $\Pi x{:}A.\, B$ is defined using the same pattern of wrapping the bound variable in an existential quantifier. In each case, the binding behavior is inherited from matching logic's built-in binder rather than being postulated as a new primitive.

Conservative Extension Theorems

The main correctness results of the paper are conservative extension theorems for the $\lambda$-calculus, System F, and other systems. These theorems establish a precise correspondence between provability in the original system and provability in the matching logic theory. For the $\lambda$-calculus, the conservative extension theorem states:

A sequent $\Gamma \vdash e : \tau$ is derivable in the simply-typed $\lambda$-calculus if and only if the corresponding matching logic pattern $\varphi_\Gamma \rightarrow (\mathsf{hasType}(e, \tau) = \top)$ is provable in the matching logic theory $\Gamma_\lambda$.

This theorem is proved by constructing a faithful translation from $\lambda$-calculus derivations to matching logic proofs (soundness) and from matching logic proofs back to $\lambda$-calculus derivations (completeness). The translation maps each $\lambda$-calculus typing rule to a corresponding derived rule in matching logic. The proof requires careful handling of variable binding, substitution, and the interaction between the object-level and meta-level quantifiers. A similar conservative extension theorem is proved for System F, where type abstraction $\Lambda X.\, e$ and type application $e\, [T]$ are defined as matching logic notations, and the typing rules of System F are derived as matching logic theorems.

The significance of these results is that they demonstrate that matching logic does not alter the deductive power of the original systems. A user working in the matching logic theory of the $\lambda$-calculus can prove exactly the same theorems as a user working directly in the $\lambda$-calculus. The matching logic formulation is not a mere encoding; it is a faithful representation that preserves all the reasoning power of the original system while situating it within a uniform logical framework. This uniformity is valuable for frameworks like K, where multiple languages with different binding constructs must coexist within a single formal environment.

Representational Completeness

Beyond deductive completeness (the conservative extension property), the paper establishes a stronger property for the $\lambda$-calculus called representational completeness. A semantics of the $\lambda$-calculus is representationally complete if every closed normal form has a unique syntactic representation. Formally, a model of the $\lambda$-calculus is representationally complete if for every element $d$ in the carrier set that is the denotation of some closed normal form, that normal form is uniquely determined. Many existing approaches to $\lambda$-calculus semantics fail to achieve representational completeness. For example, in set-theoretic models, the interpretation of the function space $A \to B$ is typically the set of all functions from $A$ to $B$, which contains many elements that do not correspond to any $\lambda$-term.

The matching logic semantics achieves representational completeness because patterns in matching logic denote subsets of the carrier set, and the axiomatization of the $\lambda$-calculus in matching logic ensures that distinct normal forms are interpreted as distinct (singleton) subsets. The proof of representational completeness proceeds by constructing a term model (also known as a Henkin model or a syntactic model) in which the carrier set consists of equivalence classes of $\lambda$-terms modulo $\alpha\beta\eta$-equivalence, and showing that this model satisfies all the matching logic axioms for the $\lambda$-calculus. In this model, every element that denotes a normal form denotes a unique one (up to $\alpha$-equivalence), establishing the representational completeness property. This result distinguishes the matching logic approach from many alternatives and provides additional confidence in the faithfulness of the representation.

Application to Multiple Binding Constructs

The paper systematically applies the binder-definition methodology to several binding constructs beyond the basic $\lambda$-abstraction:

System F (polymorphic $\lambda$-calculus): The type abstraction $\Lambda X.\, e$ is defined as $\mathsf{tyabs}(\exists X.\, \mathsf{pair}(X, e))$, directly mirroring the pattern used for $\lambda$-abstraction but at the type level. The type application $e\, [T]$ and the universal type $\forall X.\, T$ are defined correspondingly.
$\pi$-calculus: The restriction operator $\nu x.\, P$, which introduces a new channel name $x$ scoped over process $P$, is defined as $\mathsf{res}(\exists x.\, \mathsf{pair}(x, P))$. The input prefix $x(y).P$, which binds $y$ in $P$, is defined similarly.
Pure type systems: The dependent product $\Pi x{:}A.\, B$, where $x$ is bound in $B$, is defined as $\mathsf{pi}(A, \exists x.\, \mathsf{pair}(x, B))$. This definition handles the dependency of $B$ on $x$ through the existential quantifier.
Let-bindings and pattern matching: The let expression $\mathsf{let}\; x = e_1 \;\mathsf{in}\; e_2$, where $x$ is bound in $e_2$, follows the same pattern: $\mathsf{let}(e_1, \exists x.\, \mathsf{pair}(x, e_2))$.
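
The shared shape of these definitions can be made explicit with a short Python sketch that renders each notation as a pattern string. The helper `bind` and the rendering as plain strings are assumptions made for illustration. Only the symbol names `tyabs`, `res`, `pi`, `let`, and `pair` come from the definitions above.

```python
def bind(x, body):
    """The shared core of every definition: exists x. pair(x, body)."""
    return f"exists {x}. pair({x}, {body})"

def tyabs(X, e):
    return f"tyabs({bind(X, e)})"

def res(x, P):
    return f"res({bind(x, P)})"

def pi(x, A, B):
    # The annotation A sits outside the existential, so x is not bound in A.
    return f"pi({A}, {bind(x, B)})"

def let(x, e1, e2):
    # Likewise e1 sits outside the existential: x is bound only in e2.
    return f"let({e1}, {bind(x, e2)})"

print(tyabs("X", "e"))
print(res("x", "P"))
print(pi("x", "A", "B"))
print(let("x", "e1", "e2"))
```

Each construct differs only in the outer symbol and in which arguments are placed outside the existential, which is what determines the scope of the bound variable.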

In each case, the conservative extension theorem is established, proving that the matching logic theory faithfully represents the original system. The uniformity of the approach (always using $\exists$ to capture binding and proving conservative extension by the same methodology) is a key advantage. It suggests that the same recipe can be applied to binding constructs in other languages and logics, which is what makes the approach a general solution to the binder problem.

Comparison with Related Approaches

The paper provides a detailed comparison with existing approaches to formalizing binders. Compared to de Bruijn indices, the matching logic approach retains named variables, making definitions more readable and closer to informal mathematical practice. Compared to higher-order abstract syntax (HOAS), matching logic allows inspection of binder structure (e.g., one can write axioms that distinguish different forms of bound terms), which HOAS deliberately prevents in order to ensure adequacy. Compared to nominal logic, matching logic does not require new primitives such as name swapping or the freshness relation; instead, freshness is a derived concept, following from the standard properties of the existential quantifier.

The comparison with nominal logic is particularly illuminating. In nominal logic, $\alpha$-equivalence is captured through a notion of equivariance under name permutations, and freshness ($a \# t$, meaning name $a$ does not occur free in term $t$) is a primitive judgment. In matching logic, $\alpha$-equivalence is captured directly by the semantics of the existential quantifier (since $\exists x.\, \varphi$ and $\exists y.\, \varphi[y/x]$ are logically equivalent when $y$ is fresh), and freshness can be derived as $\neg(\mathsf{freevar}(x, t))$. The paper argues that the matching logic approach is simpler because it requires no new logical infrastructure, but acknowledges that nominal logic provides more specialized reasoning principles (such as the equivariance principle) that can be convenient in certain proofs.
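
The equivalence of $\exists x.\, \varphi$ and $\exists y.\, \varphi[y/x]$ for fresh $y$ can be checked semantically in a few steps. The derivation below is a sketch under the usual set semantics of $\exists$. The notation $\llbracket \varphi \rrbracket_\rho$ for the denotation of $\varphi$ under a valuation $\rho$, and $\rho[a/x]$ for the valuation that maps $x$ to $a$ and otherwise agrees with $\rho$, is an assumption of this sketch and is not taken from the paper.

```latex
\begin{align*}
\llbracket \exists y.\, \varphi[y/x] \rrbracket_\rho
  &= \bigcup_{a \in M} \llbracket \varphi[y/x] \rrbracket_{\rho[a/y]}
     && \text{semantics of } \exists \\
  &= \bigcup_{a \in M} \llbracket \varphi \rrbracket_{\rho[a/y][a/x]}
     && \text{substitution lemma} \\
  &= \bigcup_{a \in M} \llbracket \varphi \rrbracket_{\rho[a/x]}
     && y \text{ does not occur in } \varphi \\
  &= \llbracket \exists x.\, \varphi \rrbracket_\rho
     && \text{semantics of } \exists
\end{align*}
```

The freshness of $y$ is used twice: it makes the substitution $\varphi[y/x]$ capture-free, and it allows the assignment to $y$ to be dropped in the third step.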

Conclusion

The paper makes a compelling case that matching logic provides a natural and general foundation for defining binders. By leveraging the built-in existential quantifier $\exists x.\, \varphi$ as the universal source of binding behavior, the approach avoids the proliferation of ad hoc mechanisms for handling bound variables. The conservative extension theorems provide rigorous correctness guarantees, and the representational completeness result for the $\lambda$-calculus provides additional assurance that the matching logic semantics faithfully captures the intended meaning of binders. The approach has significant practical implications for the K framework, where programming language definitions frequently involve binding constructs (function definitions, let-bindings, for-loops with scoped variables, exception handlers, etc.), and having a uniform treatment of all binders simplifies both the definitions and the verification infrastructure.

The paper also opens several directions for future work, including extending the approach to dependent type theories and homotopy type theory, applying it to language definitions in K that involve complex binding patterns, and exploring the connections between matching logic's treatment of binders and categorical semantics (specifically, presheaf models and functorial semantics). The work represents an important step toward matching logic's goal of serving as a unified foundation for programming languages and formal verification, where all aspects of a language --- from syntax to type systems to operational semantics to program logics --- can be expressed and reasoned about within a single logical framework.