Define a PL: integers and booleans


In this series of posts, we will define a small programming language using formal methods. We will also produce proofs about our definitions using the Lean theorem prover.

Introduction and motivation

There are many programming languages, but most do not have a formal definition. The majority of programming languages are defined by their implementation.

Definition by implementation

An implementation of a programming language $L$ is a program $p_L$. This program $p_L$ takes the text of another program, $p$, and either produces some result based on $p$, or rejects $p$ as invalid.

We could then say $p$ is a program in the language $L$ if $p_L$, the implementation of $L$, produces output for $p$, that is, it does not reject $p$.

However, a programming language may have multiple implementations. And though these implementations may strive for compatibility between one another, it sometimes happens that different implementations behave differently given the same program as input. In such a case, which one of these implementations is the one that defines the programming language?

In fact, even in a language with only one implementation, the implementation may produce output for a given program $p$ that we judge to be “wrong”. We can then fix the bug in the implementation, but this raises the question: what defines the “right” behavior, if not the implementation?

Definition by specification

A fix for this problem is to write a specification for the programming language. Developers then use the specification as a reference when writing implementations of the language.

Languages like C, C++, Go, and JavaScript (technically ECMAScript) are specified in this way.

Specification with mathematics

However, it is possible to write a specification not just with words, but with mathematics. Some examples of languages specified in this way are Standard ML, Pony, and WebAssembly.

Formally specifying a language in this way is not a trivial task, but the benefits are real.

For one, a mathematical specification is completely unambiguous. This contrasts with specifications written in human languages, which can be ambiguous. For instance: in that last sentence, what is the thing that I am saying can be ambiguous? Is it specifications or human languages?

Furthermore, using formal methods allows us to state and prove theorems about the specification. This gives us a high degree of confidence that our specification is free of bugs and mistakes.

Let us explore this method of specification by defining our own programming language.

Hatsugen, a small language

First, we will give a name to our language.

There’s an excellent series of blog posts about implementing a programming language. As in that series, we’ll name our language by translating the English word “utterance” into some other human language. I’ll choose the target language to be Japanese, given my interests. Thus, we will call the language “Hatsugen”.

For now, Hatsugen will just have integers and booleans.

We’ll represent integers with integer literal expressions like $\mathsf{123}$ or $\mathsf{-456}$ or $\mathsf{0}$. There are infinitely many integers, and thus infinitely many such possible literals. Practical considerations like a maximum integer size will be ignored for now.

We’ll represent booleans with the literal expressions $\mathsf{true}$ and $\mathsf{false}$.

As one extra bit, we’ll also support a conditional expression

$$\mathsf{if} \ e_1 \ \mathsf{then} \ e_2 \ \mathsf{else} \ e_3$$

which will evaluate the condition $e_1$, and then evaluate either $e_2$ or $e_3$, depending on whether $e_1$ evaluated to $\mathsf{true}$ or $\mathsf{false}$.

It’s intentional that Hatsugen is a very small language, at least for now. This will allow us to get comfortable with all the formal stuff without getting bogged down with the complexity of Hatsugen itself.


Syntax
We’ll use a BNF-style grammar to describe the syntax of expressions in Hatsugen.

Since there are infinitely many integers, and writing them all out would take quite a while, we’ll represent an arbitrary integer literal with $\overline{n}$.

$$\begin{aligned} e ::= \ & \overline{n} \\ | \ & \mathsf{true} \\ | \ & \mathsf{false} \\ | \ & \mathsf{if} \ e_1 \ \mathsf{then} \ e_2 \ \mathsf{else} \ e_3 \end{aligned}$$

This says that an expression $e$ can either be an integer literal, a boolean literal, or a conditional expression.

Note that the conditional expression contains other expressions: $e_1$, $e_2$, and $e_3$.
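Since we’ll later be proving things in Lean, it may help to see how a grammar like this can be mechanized. Here is a sketch in Lean 4; the names (`Expr`, `ite`, etc.) are my own choices, not fixed by anything above.

```lean
-- One constructor per production in the grammar.
inductive Expr : Type
  | int : Int → Expr                 -- integer literals, i.e. n̄
  | true : Expr                      -- the literal true
  | false : Expr                     -- the literal false
  | ite : Expr → Expr → Expr → Expr  -- if e₁ then e₂ else e₃
```

For instance, $\mathsf{if} \ \mathsf{true} \ \mathsf{then} \ \mathsf{1} \ \mathsf{else} \ \mathsf{2}$ would be written `Expr.ite Expr.true (Expr.int 1) (Expr.int 2)`. Note how the recursive structure of the grammar becomes recursion in the inductive type.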


Dynamics
We now define how to evaluate an expression. This is called the dynamic semantics. There are various ways to do this, but the one we’ll use for Hatsugen is often called structural operational semantics.

To do this, we will define two judgments. To define these judgments, we will write inference rules.

Values: $e \ \mathsf{val}$

The first judgment is written $e \ \mathsf{val}$ and is read as “$e$ is a value”. A value is an expression that is “done” evaluating.

We first define that any integer literal is a value. We can express this definition with an inference rule:

$$\frac{}{\overline{n} \ \mathsf{val}}$$

An inference rule looks like a big fraction, with the premises on top and the single conclusion on the bottom. There are no premises for this rule, so the top is empty.

We can also define that the boolean literals $\mathsf{true}$ and $\mathsf{false}$ are values with two more rules.

$$\frac{}{\mathsf{true} \ \mathsf{val}} \hspace{2em} \frac{}{\mathsf{false} \ \mathsf{val}}$$

In Hatsugen, these are the only expressions that are values. But what about conditional expressions?
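These rules translate directly into an inductive predicate in Lean 4. A sketch, repeating the expression syntax so the snippet stands on its own (all names are my own choices):

```lean
inductive Expr : Type
  | int : Int → Expr
  | true : Expr
  | false : Expr
  | ite : Expr → Expr → Expr → Expr

-- e val: each rule with no premises becomes a constructor
-- with no hypotheses.
inductive Val : Expr → Prop
  | int (n : Int) : Val (Expr.int n)
  | true : Val Expr.true
  | false : Val Expr.false
```

Because no constructor of `Val` mentions `Expr.ite`, no conditional expression is a value, just as in the rules above.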

Stepping: $e \mapsto e'$

For expressions which are not values, we must now define how to evaluate them. The second dynamics judgment, $e \mapsto e'$, read as “$e$ steps to $e'$”, holds when the expression $e$ takes a single step to another expression, $e'$.

“Evaluating an expression” is thus defined as repeatedly stepping an expression until it is a value.

In Hatsugen, the only expressions that can take a step are conditional expressions. First, we define that a conditional expression can take a step if its condition $e_1$ can take a step to $e_1'$. When it steps, we leave $e_2$ and $e_3$ unchanged.

$$\frac{e_1 \mapsto e_1'}{\begin{aligned} &\mathsf{if} \ e_1 \ \mathsf{then} \ e_2 \ \mathsf{else} \ e_3 \mapsto \\ &\mathsf{if} \ e_1' \ \mathsf{then} \ e_2 \ \mathsf{else} \ e_3 \end{aligned}}$$

Now, we define what happens when $e_1$ is a value. If it is $\mathsf{true}$, we step to $e_2$, ignoring $e_3$.

$$\frac{}{\mathsf{if} \ \mathsf{true} \ \mathsf{then} \ e_2 \ \mathsf{else} \ e_3 \mapsto e_2}$$

And if it was $\mathsf{false}$, we do the opposite and step to $e_3$, ignoring $e_2$.

$$\frac{}{\mathsf{if} \ \mathsf{false} \ \mathsf{then} \ e_2 \ \mathsf{else} \ e_3 \mapsto e_3}$$
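To see how these rules combine, here is a small worked example in which the condition must itself take a step before the whole conditional can be resolved:

$$\begin{aligned} & \mathsf{if} \ (\mathsf{if} \ \mathsf{true} \ \mathsf{then} \ \mathsf{false} \ \mathsf{else} \ \mathsf{true}) \ \mathsf{then} \ \mathsf{1} \ \mathsf{else} \ \mathsf{2} \\ \mapsto \ & \mathsf{if} \ \mathsf{false} \ \mathsf{then} \ \mathsf{1} \ \mathsf{else} \ \mathsf{2} \\ \mapsto \ & \mathsf{2} \end{aligned}$$

The first step uses the condition-stepping rule, with the $\mathsf{true}$ rule satisfying its premise; the second step uses the $\mathsf{false}$ rule. Since $\mathsf{2}$ is a value, evaluation stops there.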

With that, we’ve defined the dynamic semantics.
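As with the values judgment, the stepping judgment can be mechanized as an inductive predicate in Lean 4, with one constructor per inference rule. A self-contained sketch (names are my own choices):

```lean
inductive Expr : Type
  | int : Int → Expr
  | true : Expr
  | false : Expr
  | ite : Expr → Expr → Expr → Expr

-- e ↦ e': one constructor per stepping rule.
inductive Step : Expr → Expr → Prop
  | ite {e₁ e₁' e₂ e₃ : Expr} :
      Step e₁ e₁' →
      Step (Expr.ite e₁ e₂ e₃) (Expr.ite e₁' e₂ e₃)
  | iteTrue {e₂ e₃ : Expr} :
      Step (Expr.ite Expr.true e₂ e₃) e₂
  | iteFalse {e₂ e₃ : Expr} :
      Step (Expr.ite Expr.false e₂ e₃) e₃
```

Note how the premise of the first rule, $e_1 \mapsto e_1'$, becomes a hypothesis of the `Step.ite` constructor, while the premise-free rules become hypothesis-free constructors.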

But we immediately run into a problem.

The problem

We earlier informally defined what it means to evaluate an expression:

“Evaluating an expression” is… defined as repeatedly stepping an expression until it is a value.

However, under this definition, there exist expressions that cannot be evaluated. For instance:

$$\mathsf{if} \ \mathsf{1} \ \mathsf{then} \ \mathsf{2} \ \mathsf{else} \ \mathsf{3}$$

This expression cannot take a step, because none of the rules for $e \mapsto e'$ apply. This is because we only defined how to take a step when $e_1$ was either $\mathsf{true}$, $\mathsf{false}$, or able to take a step itself. We didn’t define what to do if $e_1$ is an integer literal.

Yet, this expression is also not a value, because none of the rules for $e \ \mathsf{val}$ apply. Thus, the expression is “stuck”.

There are generally two things we can do here. One is to go back and add more rules to the dynamics to define how to evaluate these kinds of expressions. For instance, we could emulate C-like languages with the following rules:

$$\frac{}{\mathsf{if} \ \mathsf{0} \ \mathsf{then} \ e_2 \ \mathsf{else} \ e_3 \mapsto e_3} \hspace{2em} \frac{\overline{n} \ne \mathsf{0}}{\mathsf{if} \ \overline{n} \ \mathsf{then} \ e_2 \ \mathsf{else} \ e_3 \mapsto e_2}$$

This treats $\mathsf{0}$ like $\mathsf{false}$ and any other integer like $\mathsf{true}$.

The other thing we could do is define a notion of a “valid” expression, and only permit evaluation of these valid expressions. This is the approach we will use.

To define which expressions are valid, we will introduce a static semantics.

Statics: $e : \tau$

The static semantics tell us which expressions are valid, and thus permitted to evaluate.

First, we introduce the notion of a type. For now, we have only two: integer and boolean.

$$\begin{aligned} \tau ::= \ & \mathsf{Int} \\ | \ & \mathsf{Bool} \end{aligned}$$

Next, we introduce another judgment, $e : \tau$, read “$e$ has type $\tau$” or “the type of $e$ is $\tau$”. This judgment defines what it means for an expression to be valid: an expression $e$ is valid if there exists a type $\tau$ for which $e : \tau$ holds.

Now, we define $e : \tau$ by writing its rules, starting with rules for the literals. Integer literals have integer type, and boolean literals have boolean type.

$$\frac{}{\overline{n} : \mathsf{Int}} \hspace{2em} \frac{}{\mathsf{true} : \mathsf{Bool}} \hspace{2em} \frac{}{\mathsf{false} : \mathsf{Bool}}$$

For conditional expressions, we require that $e_1$ has boolean type. We also require $e_2$ and $e_3$ have the same type (but that type could be any type). Then we say the whole conditional expression has that type.

$$\frac{e_1 : \mathsf{Bool} \hspace{1em} e_2 : \tau \hspace{1em} e_3 : \tau}{\mathsf{if} \ e_1 \ \mathsf{then} \ e_2 \ \mathsf{else} \ e_3 : \tau}$$

Note that because $e_1$ must have boolean type, expressions like the previously considered

$$\mathsf{if} \ \mathsf{1} \ \mathsf{then} \ \mathsf{2} \ \mathsf{else} \ \mathsf{3}$$

are now disallowed by the static semantics.

This completes the definition of the static semantics.
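The statics mechanize the same way as the dynamics: a Lean 4 inductive type for types, and an inductive predicate for the typing judgment. Another self-contained sketch (again, the names are my own):

```lean
inductive Expr : Type
  | int : Int → Expr
  | true : Expr
  | false : Expr
  | ite : Expr → Expr → Expr → Expr

-- The two types: Int and Bool.
inductive Typ : Type
  | int : Typ
  | bool : Typ

-- e : τ, one constructor per typing rule.
inductive HasTy : Expr → Typ → Prop
  | int (n : Int) : HasTy (Expr.int n) Typ.int
  | true : HasTy Expr.true Typ.bool
  | false : HasTy Expr.false Typ.bool
  | ite {e₁ e₂ e₃ : Expr} {τ : Typ} :
      HasTy e₁ Typ.bool → HasTy e₂ τ → HasTy e₃ τ →
      HasTy (Expr.ite e₁ e₂ e₃) τ
```

The stuck expression from before is rejected here too: typing `Expr.ite (Expr.int 1) … …` would require `HasTy (Expr.int 1) Typ.bool`, and no constructor of `HasTy` can derive that.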


Theorems
We can now state and prove theorems about Hatsugen.

The first crucial theorem is that of progress. Progress says that well-typed expressions are either done evaluating or can keep evaluating.

More formally, progress states:

For all $e$ and $\tau$, if $e : \tau$, then $e \ \mathsf{val}$ or there exists $e'$ such that $e \mapsto e'$.

Note that, before we introduced the static semantics, we had the problem of certain expressions neither being values nor being able to step. Now that we have the statics at our disposal, we can use them to strengthen the precondition of the theorem so that it is provable.

Next, we have preservation. Preservation states that when a well-typed expression takes a step, its type is preserved.

Again, more formally:

For all $e$ and $e'$ and $\tau$, if $e : \tau$ and $e \mapsto e'$, then $e' : \tau$.

Note that taken together, we get the following safety theorem:

For all $e$ and $\tau$, if $e : \tau$, then $e \ \mathsf{val}$ or there exists $e'$ such that $e \mapsto e'$ and $e' : \tau$.

Crucially, note that in the case where $e \mapsto e'$, we also have that $e' : \tau$. This means that we can apply the safety theorem again on $e'$.
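To see what these theorems look like when mechanized, here is a self-contained Lean 4 sketch of their statements, with all the definitions repeated. The `sorry` placeholders stand in for the actual proofs of progress and preservation; safety, however, really does follow from the other two, as the final definition shows. (All names are my own choices.)

```lean
inductive Expr : Type
  | int : Int → Expr
  | true : Expr
  | false : Expr
  | ite : Expr → Expr → Expr → Expr

inductive Typ : Type
  | int : Typ
  | bool : Typ

inductive Val : Expr → Prop
  | int (n : Int) : Val (Expr.int n)
  | true : Val Expr.true
  | false : Val Expr.false

inductive Step : Expr → Expr → Prop
  | ite {e₁ e₁' e₂ e₃ : Expr} :
      Step e₁ e₁' → Step (Expr.ite e₁ e₂ e₃) (Expr.ite e₁' e₂ e₃)
  | iteTrue {e₂ e₃ : Expr} : Step (Expr.ite Expr.true e₂ e₃) e₂
  | iteFalse {e₂ e₃ : Expr} : Step (Expr.ite Expr.false e₂ e₃) e₃

inductive HasTy : Expr → Typ → Prop
  | int (n : Int) : HasTy (Expr.int n) Typ.int
  | true : HasTy Expr.true Typ.bool
  | false : HasTy Expr.false Typ.bool
  | ite {e₁ e₂ e₃ : Expr} {τ : Typ} :
      HasTy e₁ Typ.bool → HasTy e₂ τ → HasTy e₃ τ →
      HasTy (Expr.ite e₁ e₂ e₃) τ

-- Progress: a well-typed expression is a value or can step.
theorem progress {e : Expr} {τ : Typ} (h : HasTy e τ) :
    Val e ∨ ∃ e', Step e e' := sorry

-- Preservation: stepping preserves the type.
theorem preservation {e e' : Expr} {τ : Typ}
    (h : HasTy e τ) (s : Step e e') : HasTy e' τ := sorry

-- Safety is a corollary of progress and preservation.
theorem safety {e : Expr} {τ : Typ} (h : HasTy e τ) :
    Val e ∨ ∃ e', Step e e' ∧ HasTy e' τ :=
  match progress h with
  | Or.inl v => Or.inl v
  | Or.inr ⟨e', s⟩ => Or.inr ⟨e', s, preservation h s⟩
```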

The proofs are available on GitHub.

In the next post, we’ll add functions to Hatsugen.