Logo
Research

A General Approach to Define Binders Using Matching Logic

Abstract

This paper, authored by Xiaohong Chen and Grigore Rosu and published at ICFP 2020, proposes a novel and general approach to defining binders using matching logic. In programming languages and logical systems, binders are constructs that introduce bound variables --- examples include λx.e\lambda x.\, e in the λ\lambda-calculus, X.T\forall X.\, T in System F, and νx.P\nu x.\, P in the π\pi-calculus. Correctly capturing the binding behavior of such constructs (including α\alpha-equivalence, capture-avoiding substitution, and freshness) has been a longstanding challenge in formal semantics. The key insight of this paper is that matching logic already contains a built-in binder, the existential quantifier x.φ\exists x.\, \varphi, and that object-level binders can be defined so that their binding behavior is directly inherited from this built-in binder. The authors demonstrate that binders in the λ\lambda-calculus, System F, the π\pi-calculus, and pure type systems can all be axiomatically defined in matching logic as notations and logical theories. They prove conservative extension theorems establishing that a sequent or judgment is provable in the original system if and only if it is provable in the corresponding matching logic theory, thus showing that nothing is lost and nothing is gained by moving to matching logic.

Introduction

The problem of formally defining binders is as old as the λ\lambda-calculus itself. When Church introduced the λ\lambda-calculus in the 1930s, he implicitly used conventions about bound variables, α\alpha-equivalence (renaming of bound variables), and capture-avoiding substitution that have since been made rigorous in numerous ways. The challenge is that naive syntactic representations of binders lead to well-known problems: name clashes during substitution, the need for α\alpha-renaming, and the distinction between free and bound variables. Over the decades, several approaches have been developed to handle binders formally, including de Bruijn indices (which replace named variables with numeric indices), higher-order abstract syntax (HOAS) (which uses the meta-language's binding to represent object-level binding), nominal logic (which introduces a theory of names and swapping), and locally nameless representations (which use de Bruijn indices for bound variables and names for free variables).

Each of these approaches has trade-offs. De Bruijn indices eliminate α\alpha-equivalence issues but produce terms that are hard for humans to read. HOAS provides an elegant encoding but limits the ability to inspect the structure of binders. Nominal logic offers a principled treatment of freshness and α\alpha-equivalence but introduces new logical machinery (name swapping, the freshness relation #\#, and the N\mathbf{N} quantifier). The contribution of this paper is to show that matching logic offers a fundamentally different and arguably more uniform approach: binders need not be added as new primitive constructs; instead, they can be defined as notations whose binding behavior is inherited from matching logic's built-in existential quantifier x.φ\exists x.\, \varphi. This approach requires no new logical infrastructure beyond what matching logic already provides, making it attractive for both theoretical foundations and practical implementations.

Background on Matching Logic

Matching logic is a logic for specifying and reasoning about structure. Its formulas, called patterns, are built from variables, symbols (drawn from a signature Σ\Sigma), and logical connectives including the existential quantifier x.φ\exists x.\, \varphi. A pattern is interpreted over a carrier set MM (the domain of the model), and its semantics is a subset of MM --- the set of elements that "match" the pattern. This is a departure from classical first-order logic, where formulas evaluate to truth values. In matching logic, the pattern xx denotes the singleton set {a}\{a\} when xx is assigned the value aa, and a symbol application σ(φ1,,φn)\sigma(\varphi_1, \ldots, \varphi_n) denotes the set σM(φ1,,φn)\sigma^M(\llbracket \varphi_1 \rrbracket, \ldots, \llbracket \varphi_n \rrbracket) where σM\sigma^M is the interpretation of σ\sigma as a function on sets (not just elements).

The key logical connectives include ¬φ\neg \varphi (complement), φ1φ2\varphi_1 \wedge \varphi_2 (intersection), φ1φ2\varphi_1 \vee \varphi_2 (union), and x.φ\exists x.\, \varphi (existential quantification, which unions over all valuations of xx). The pattern \bot denotes the empty set and \top denotes the full carrier set. Crucially, the existential quantifier x.φ\exists x.\, \varphi is a binder: the variable xx is bound in φ\varphi, and all the standard properties of binding (such as α\alpha-equivalence and capture-avoiding substitution) hold for x.φ\exists x.\, \varphi by the definition of matching logic. The definedness symbol φ\lceil \varphi \rceil is defined as φ¬(φ)\lceil \varphi \rceil \equiv \neg(\varphi \leftrightarrow \bot) and evaluates to \top if φ\varphi is non-empty, and \bot otherwise. Equality is defined as φ1=φ2φ1¬φ2¬φ1φ2=\varphi_1 = \varphi_2 \equiv \lceil \varphi_1 \wedge \neg \varphi_2 \rceil \vee \lceil \neg \varphi_1 \wedge \varphi_2 \rceil = \bot. The logic has a Hilbert-style proof system that is sound and complete for its models, with rules including modus ponens (from φ\varphi and φψ\varphi \to \psi, derive ψ\psi), universal generalization (from φ\varphi, derive x.φ\forall x.\, \varphi), and the Propagation axiom (C[]=C[\bot] = \bot for any context CC).

Defining Binders as Matching Logic Notations

The central technical contribution of the paper is a method for defining object-level binders as notations in matching logic. The idea is to express a binder Bx.tB x.\, t (where BB is the binding construct, xx is the bound variable, and tt is the body) as a matching logic pattern that uses the existential quantifier \exists to capture the binding of xx. Specifically, for the λ\lambda-calculus, the λ\lambda-abstraction λx.e\lambda x.\, e is defined as a matching logic notation:

λx.e    lam(x.(pair(x,e)))\lambda x.\, e \;\equiv\; \mathsf{lam}(\exists x.\, (\mathsf{pair}(x, e)))

Here, lam\mathsf{lam} and pair\mathsf{pair} are symbols in the matching logic signature. The existential quantifier x\exists x in x.(pair(x,e))\exists x.\, (\mathsf{pair}(x, e)) binds xx in the pair (x,e)(x, e), and since the binding behavior of \exists is well-defined in matching logic, the binding behavior of λx.e\lambda x.\, e is automatically inherited. The pattern x.(pair(x,e))\exists x.\, (\mathsf{pair}(x, e)) can be thought of as representing the equivalence class of pairs (x,e)(x, e) modulo renaming of xx --- exactly the right semantics for α\alpha-equivalence. This definition ensures that λx.e\lambda x.\, e and λy.e[y/x]\lambda y.\, e[y/x] are logically equivalent in matching logic whenever yy is fresh, mirroring the standard α\alpha-equivalence of the λ\lambda-calculus.

The authors demonstrate that this approach scales to more complex binding constructs. For System F, the type-level universal quantifier X.T\forall X.\, T is similarly defined by using \exists to bind the type variable XX. For the π\pi-calculus, the restriction operator νx.P\nu x.\, P uses \exists to bind the channel name xx. For pure type systems, which generalize many typed λ\lambda-calculi, the dependent product type Πx:A.B\Pi x{:}A.\, B is defined using the same pattern of wrapping the bound variable in an existential quantifier. In each case, the binding behavior is inherited from matching logic's built-in binder rather than being postulated as a new primitive.

Conservative Extension Theorems

The main correctness results of the paper are conservative extension theorems for the λ\lambda-calculus, System F, and other systems. These theorems establish a precise correspondence between provability in the original system and provability in the matching logic theory. For the λ\lambda-calculus, the conservative extension theorem states:

A sequent Γe:τ\Gamma \vdash e : \tau is derivable in the simply-typed λ\lambda-calculus if and only if the corresponding matching logic pattern φΓ(hasType(e,τ)=)\varphi_\Gamma \rightarrow (\mathsf{hasType}(e, \tau) = \top) is provable in the matching logic theory Γλ\Gamma_\lambda.

This theorem is proved by constructing a faithful translation from λ\lambda-calculus derivations to matching logic proofs (soundness) and from matching logic proofs back to λ\lambda-calculus derivations (completeness). The translation maps each λ\lambda-calculus typing rule to a corresponding derived rule in matching logic. The proof requires careful handling of variable binding, substitution, and the interaction between the object-level and meta-level quantifiers. A similar conservative extension theorem is proved for System F, where type abstraction ΛX.e\Lambda X.\, e and type application e[T]e\, [T] are defined as matching logic notations, and the typing rules of System F are derived as matching logic theorems.

The significance of these results is that they demonstrate that matching logic does not alter the deductive power of the original systems. A user working in the matching logic theory of the λ\lambda-calculus can prove exactly the same theorems as a user working directly in the λ\lambda-calculus. The matching logic formulation is not a mere encoding; it is a faithful representation that preserves all the reasoning power of the original system while situating it within a uniform logical framework. This uniformity is valuable for frameworks like K, where multiple languages with different binding constructs must coexist within a single formal environment.

Representational Completeness

Beyond deductive completeness (the conservative extension property), the paper establishes a stronger property for the λ\lambda-calculus called representational completeness. A semantics of the λ\lambda-calculus is representationally complete if every closed normal form has a unique syntactic representation. Formally, a model of the λ\lambda-calculus is representationally complete if for every element dd in the carrier set that is the denotation of some closed normal form, that normal form is uniquely determined. Many existing approaches to λ\lambda-calculus semantics fail to achieve representational completeness. For example, in set-theoretic models, the interpretation of the function space ABA \to B is typically the set of all functions from AA to BB, which contains many elements that do not correspond to any λ\lambda-term.

The matching logic semantics achieves representational completeness because patterns in matching logic denote subsets of the carrier set, and the axiomatization of the λ\lambda-calculus in matching logic ensures that distinct normal forms are interpreted as distinct (singleton) subsets. The proof of representational completeness proceeds by constructing a term model (also known as a Henkin model or a syntactic model) in which the carrier set consists of equivalence classes of λ\lambda-terms modulo αβη\alpha\beta\eta-equivalence, and showing that this model satisfies all the matching logic axioms for the λ\lambda-calculus. In this model, every element is the denotation of a unique normal form (up to α\alpha-equivalence), establishing the representational completeness property. This result distinguishes the matching logic approach from many alternatives and provides additional confidence in the faithfulness of the representation.

Application to Multiple Binding Constructs

The paper systematically applies the binder-definition methodology to several binding constructs beyond the basic λ\lambda-abstraction:

  • System F (polymorphic λ\lambda-calculus): The type abstraction ΛX.e\Lambda X.\, e is defined as tyabs(X.pair(X,e))\mathsf{tyabs}(\exists X.\, \mathsf{pair}(X, e)), directly mirroring the pattern used for λ\lambda-abstraction but at the type level. The type application e[T]e\, [T] and the universal type X.T\forall X.\, T are defined correspondingly.
  • π\pi-calculus: The restriction operator (νx)P(\nu x) P, which introduces a new channel name xx scoped over process PP, is defined as res(x.pair(x,P))\mathsf{res}(\exists x.\, \mathsf{pair}(x, P)). The input prefix x(y).Px(y).P, which binds yy in PP, is defined similarly.
  • Pure type systems: The dependent product Πx:A.B\Pi x{:}A.\, B, where xx is bound in BB, is defined as pi(A,x.pair(x,B))\mathsf{pi}(A, \exists x.\, \mathsf{pair}(x, B)). This definition handles the dependency of BB on xx through the existential quantifier.
  • Let-bindings and pattern matching: The let expression let  x=e1  in  e2\mathsf{let}\; x = e_1 \;\mathsf{in}\; e_2, where xx is bound in e2e_2, follows the same pattern: let(e1,x.pair(x,e2))\mathsf{let}(e_1, \exists x.\, \mathsf{pair}(x, e_2)).

In each case, the conservative extension theorem is established, proving that the matching logic theory faithfully represents the original system. The uniformity of the approach --- always using \exists to capture binding and proving conservative extension by the same methodology --- is a key advantage. It suggests that the approach can be applied to any binding construct in any language or logic, making it a truly general solution to the binder problem.

Conclusion

The paper makes a compelling case that matching logic provides a natural and general foundation for defining binders. By leveraging the built-in existential quantifier x.φ\exists x.\, \varphi as the universal source of binding behavior, the approach avoids the proliferation of ad hoc mechanisms for handling bound variables. The conservative extension theorems provide rigorous correctness guarantees, and the representational completeness result for the λ\lambda-calculus provides additional assurance that the matching logic semantics faithfully captures the intended meaning of binders. The approach has significant practical implications for the K framework, where programming language definitions frequently involve binding constructs (function definitions, let-bindings, for-loops with scoped variables, exception handlers, etc.), and having a uniform treatment of all binders simplifies both the definitions and the verification infrastructure.

The paper also opens several directions for future work, including extending the approach to dependent type theories and homotopy type theory, applying it to language definitions in K that involve complex binding patterns, and exploring the connections between matching logic's treatment of binders and categorical semantics (specifically, presheaf models and functorial semantics). The work represents an important step toward matching logic's goal of serving as a unified foundation for programming languages and formal verification, where all aspects of a language --- from syntax to type systems to operational semantics to program logics --- can be expressed and reasoned about within a single logical framework.

Stay Informed

Sign up for updates and never miss important announcements.

Subscribe

Join the community

Join our Discord server to get support or connect with the Pi² community.

Join discord

Follow us

Learn about company and product updates, upcoming events, rewards, and more.

TwitterTelegramDiscordLinkedInGitHubYouTube