Research

Defining the Undefinedness of C

Abstract

The C programming language specification carries a dual mandate: it must describe the behavior of correct programs while simultaneously identifying incorrect ones. Prior formal semantics of C, such as those by Norrish, Papaspyrou, Blazy and Leroy, and Ellison and Rosu, have primarily focused on the "positive" semantics -- giving meaning to well-defined programs. This paper by Hathhorn, Ellison, and Rosu presents a "negative" semantics of C11 that goes beyond correctness to actively reject undefined programs. The authors extended and modified an existing K Framework-based semantics of C into one that captures undefined behavior (UB) comprehensively. The effort required to achieve this was unexpectedly large, nearly doubling the size of the original semantics from 1,163 rewrite rules to 2,155 rules. From this semantics, they automatically extracted an undefinedness checker called kcc that detects examples of all 77 categories of core language undefinedness in the C11 standard.

Undefined behavior is "behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements." -- C11 Standard, Section 3.4.3

Taxonomy of Undefined Behavior in C11

The C11 standard lists 203 places where it invokes undefined behavior. These fall into three broad categories: 77 involve core language features, 24 involve preprocessing and parsing phases (translation phases 1-6), and 101 involve the standard library. The authors further classify each behavior by the point in translation or execution at which it can be recognized: compile time behaviors (24 categories) are detectable through per-translation-unit static analysis, link time behaviors (8 categories) require whole-program static analysis, and run time behaviors (45 categories) can only be caught during execution. Even behaviors categorized as compile-time may only render individual executions undefined, but the authors argue that strictly-conforming programs should be free of all such behaviors.

The Role and Consequences of Undefinedness

Undefinedness in C serves multiple purposes, though it creates severe consequences. It enables aggressive compiler optimizations -- for example, the restrict qualifier allows alias analysis that reduces memory accesses, and the undefined behavior of signed integer overflow allows compilers to assume loops like for (int i = 0; i <= N; ++i) will terminate after exactly $N + 1$ iterations. However, undefinedness causes profound problems: dereferencing NULL in int main() { *NULL; return 0; } does not cause GCC, Clang, or ICC to generate a program that raises an error. Compilers may freely assume undefined behavior will not occur, leading to surprising consequences such as removing overflow-checking branches like if (x + 1 < x) entirely. Furthermore, undefinedness can invalidate an entire execution retroactively, affecting behavior that "came before" the undefined expression due to compiler reordering optimizations.

The K Semantic Framework and kcc

The authors developed their semantics in the rewriting-based K semantic framework, inspired by rewriting logic (RL). In K, parts of the program state are represented as labeled, nested multisets called cells (e.g., k for computation, env for environments, mem for memory). The configuration from the C semantics contains around 100 cells in the execution semantics and another 60 in the translation semantics. Rewrite rules describe steps of computation, and K's notation allows rules to mention only those cells relevant to the rule -- cell context is automatically inferred, making rules modular and robust under language extensions. The tool kcc wraps the semantics to behave like a C compiler, processing programs through a preprocessor/parser, translation semantics, and execution semantics pipeline.

Formalizing Pointer Dereferencing and Unsequenced Operations

Capturing undefinedness in expressions required significantly more complex rules than those needed for defined programs. Consider pointer dereferencing: the basic positive rule simply maps *(L : ptrType(T)) to [L] : T. To catch undefined behaviors such as dereferencing void or NULL pointers, the rule must add conditions: T != void and L != NULL. The fully verbose version must also verify the location is still alive (by matching an object in memory), check the pointer is in bounds, and account for restrict qualifier misuse and strict aliasing rules:

$\langle \cdots B \mapsto \text{object}(\_, Len, \_) \cdots \rangle_{\text{mem}} \quad \langle *(sym(B) + O : \text{ptrType}(T)) \cdots \rangle_k \quad \text{when } T \neq \text{void} \wedge O < Len$

Unsequenced reads and writes present another critical challenge. Unsequenced writes to the same object, or an unsequenced write and read, invoke undefined behavior (UB #35). For example, return (x = 1) + (x = 2); is undefined because the operands have unspecified evaluation order. The semantics tracks all locations modified since the last sequence point in a set called locs-written, checking this set before every write and read operation, and emptying it at each sequence point.

Type Modifiers and the restrict Qualifier

Type modifiers like const, restrict, volatile, _Atomic, alignment specifiers, and function specifiers are another area where positive semantics can safely ignore them but negative semantics cannot. The const qualifier, for instance, has no effect in correct programs but must be tracked to catch writes to const-qualified objects through non-const-qualified lvalues. The restrict qualifier is particularly complex: the standard defines its meaning entirely in terms of what becomes undefined, requiring tracking of which restrict-qualified pointers any particular pointer expression is based on, maintaining maps between object addresses and restrict-qualified pointer sets, and maps between blocks and modified object addresses. A dereference of pointer expression $p$ referencing object $x$ is well-defined only when $R = R'$ , where $R$ is the set of active restrict-qualified pointers associated with $x$ from previous accesses and $R'$ is the set that $p$ is based on.

Translation Phases, Linking, and Memory Model

The semantics provides a separate treatment of the three major phases of C implementations: compilation (translation phase 7), linking (translation phase 8), and execution. The translation phase handles building typing environments, processing declarations through four states of increasing definedness: declared → completed → allocated → defined. The linking phase resolves external identifiers, verifies type compatibility across translation units, and detects duplicate definitions. The memory model uses symbolic base/offset pairs $sym(B) + O$ as addresses, where $B$ is symbolic to prevent inappropriate direct comparisons. Pointers are split into symbolic bytes using subObject for storage. The model handles indeterminate memory (using trap representations for uninitialized values), strict aliasing (tracking effective types of objects to catch type-punning violations), and pointer provenance using tags (fromUnion, fromArray, basedOn, align) that follow pointer values through function calls and memory stores.

Evaluation and Comparison with Existing Tools

The evaluation against existing tools demonstrated the comprehensiveness of the approach. Against the Juliet Test Suite (4,113 extracted undefined-behavior tests), kcc achieved 100% detection across all six categories (invalid pointers, division by zero, bad free arguments, uninitialized memory, bad function calls, integer overflow). Against the authors' own test suite of 261 tests covering all 77 core language undefined behaviors, kcc passed 99.2% compared to Astree (44.8%), CompCert interpreter (53.3%), Valgrind (5.4%), and Frama-C Value Analysis (47.9%). Critically, kcc detected all 77 categories of core language UB, while the combined coverage of all other tools reached only 58 out of 77. Nineteen categories of UB were detected by kcc alone, including restrict qualifier misuse, alignment specifiers, variable-length arrays, the inline function specifier, and unsequenced side-effects.

Conclusion

The authors conclude that undefinedness is a pervasive feature of C that facilitates aggressive optimizations but is terribly subtle and the source of many bugs. Their work demonstrates that capturing undefined behavior is far from a trivial extension of positive semantics -- it requires nearly as much semantic infrastructure as defining correct program behavior. The tool, semantics, and test suite are available at https://github.com/kframework/c-semantics, providing the most comprehensive and complete semantic treatment of undefined behavior in C to date. Future work includes expanding parameterization over implementation-defined behaviors to make the tool more practical for real-world applications, where checking conformance with a specific implementation is often more desirable than strict standard conformance.

Stay Informed

Learn about company and product updates, upcoming events, rewards, and more.