In this series of posts, we will define a small programming language using formal methods. We will also produce proofs about our definitions using the Lean theorem prover.
Introduction and motivation
There are many programming languages, but most do not have a formal definition. The majority of programming languages are defined by their implementation.
Definition by implementation
An implementation of a programming language is itself a program. It takes the text of another program as input, and either produces some result based on that program, or rejects it as invalid.
We could then say a program is in the language if the implementation produces output for it, that is, if the implementation does not reject it.
However, a programming language may have multiple implementations. And though these implementations may strive for compatibility with one another, it sometimes happens that different implementations behave differently given the same program as input. In such a case, which one of these implementations is the one that defines the programming language?
In fact, even in a language with only one implementation, the implementation may produce output for a given program that we judge to be “wrong”. We can then fix the bug in the implementation, but this raises the question: what defines the “right” behavior, if not the implementation?
Definition by specification
A fix for this problem is to write a specification for the programming language. Developers then use the specification as a reference when writing implementations of the language.
Languages like C, C++, Go, and JavaScript (technically ECMAScript) are specified in this way.
Specification with mathematics
However, it is possible to write a specification not just with words, but with mathematics. Some examples of languages specified in this way are Standard ML, Pony, and WebAssembly.
Formally specifying a language in this way is not a trivial task, but the benefits are real.
For one, a mathematical specification is completely unambiguous. This contrasts with specifications written in human languages, which can be ambiguous. For instance: in that last sentence, what is the thing that I am saying can be ambiguous? Is it specifications or human languages?
Furthermore, using formal methods allows us to state and prove theorems about the specification. This gives us a high degree of confidence that our specification is free of bugs and mistakes.
Let us explore this method of specification by defining our own programming language.
Hatsugen, a small language
First, we will give a name to our language.
There’s an excellent series of blog posts about implementing a programming language. As in that series, we’ll name our language by translating the English word “utterance” into some other human language. I’ll choose the target language to be Japanese, given my interests. Thus, we will call the language “Hatsugen”.
For now, Hatsugen will just have integers and booleans.
We’ll represent integers with integer literal expressions like $\mathtt{0}$ or $\mathtt{1}$ or $\mathtt{-5}$. There are infinitely many integers, and thus infinitely many such possible literals. Practical considerations like a maximum integer size will be ignored for now.
We’ll represent booleans with the literal expressions $\mathtt{true}$ and $\mathtt{false}$.
As one extra bit, we’ll also support a conditional expression

$$\mathtt{if}\ e_1\ \mathtt{then}\ e_2\ \mathtt{else}\ e_3$$

which will evaluate the condition $e_1$ and then

- if it’s $\mathtt{true}$, evaluate $e_2$;
- if it’s $\mathtt{false}$, evaluate $e_3$.
It’s intentional that Hatsugen is a very small language, at least for now. This will allow us to get comfortable with all the formal stuff without getting bogged down with the complexity of Hatsugen itself.
Syntax
We’ll use a BNF-style grammar to describe the syntax of expressions in Hatsugen.
Since there are infinitely many integers, and writing them all out would take quite a while, we’ll represent an arbitrary integer literal with $\overline{n}$.

$$e ::= \overline{n} \mid \mathtt{true} \mid \mathtt{false} \mid \mathtt{if}\ e_1\ \mathtt{then}\ e_2\ \mathtt{else}\ e_3$$

This says that an expression $e$ can either be an integer literal, a boolean literal, or a conditional expression.
Note that the conditional expression contains other expressions: $e_1$, $e_2$, and $e_3$.
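Since we’ll eventually be writing proofs in Lean, it’s worth seeing how this grammar might be encoded there: as an inductive type with one constructor per production. The names below are illustrative; the actual Lean development may differ.

```lean
-- Illustrative sketch; the actual Lean definitions may differ.
-- The expressions of Hatsugen, one constructor per grammar production.
inductive Exp : Type
  | int : Int → Exp                 -- integer literals
  | true : Exp                      -- the literal true
  | false : Exp                     -- the literal false
  | ite : Exp → Exp → Exp → Exp     -- if e1 then e2 else e3
```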
Dynamics
We now define how to evaluate an expression. This is called the dynamic semantics. There are various ways to do this, but the one we’ll use for Hatsugen is often called structural operational semantics.
To do this, we will define two judgments. To define these judgments, we will write inference rules.
Values: $e\ \mathsf{val}$
The first judgement is written $e\ \mathsf{val}$ and is read as “$e$ is a value”. A value is an expression that is “done” evaluating.
We first define that any integer literal $\overline{n}$ is a value. We can express this definition with an inference rule:

$$\frac{}{\overline{n}\ \mathsf{val}}$$
An inference rule looks like a big fraction, with the premises on top and the single conclusion on the bottom. There are no premises for this rule, so the top is empty.
We can also define that the boolean literals $\mathtt{true}$ and $\mathtt{false}$ are values with two more rules:

$$\frac{}{\mathtt{true}\ \mathsf{val}} \qquad \frac{}{\mathtt{false}\ \mathsf{val}}$$
In Hatsugen, these are the only expressions that are values. But what about conditional expressions?
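We’ll get to that shortly. First, here is a sketch of how the value judgement might look in Lean: an inductive predicate over `Exp`, with one constructor per rule above. As before, the names are illustrative and may not match the actual development.

```lean
-- Illustrative sketch; the actual Lean definitions may differ.
-- e val: the expression e is done evaluating.
inductive Val : Exp → Prop
  | int (n : Int) : Val (Exp.int n)
  | true : Val Exp.true
  | false : Val Exp.false
```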
Stepping: $e \mapsto e'$
For expressions which are not values, we must now define how to evaluate them. The second dynamics judgement, $e \mapsto e'$, read as “$e$ steps to $e'$”, holds when the expression $e$ takes a single step to another expression, $e'$.
“Evaluating an expression” is thus defined as repeatedly stepping an expression until it is a value.
In Hatsugen, the only expressions that can take a step are conditional expressions. First, we define that a conditional expression can take a step if its condition $e_1$ can take a step to $e_1'$. When it steps, we leave $e_2$ and $e_3$ unchanged.

$$\frac{e_1 \mapsto e_1'}{\mathtt{if}\ e_1\ \mathtt{then}\ e_2\ \mathtt{else}\ e_3 \mapsto \mathtt{if}\ e_1'\ \mathtt{then}\ e_2\ \mathtt{else}\ e_3}$$
Now, we define what happens when $e_1$ is a value. If it is $\mathtt{true}$, we step to $e_2$, ignoring $e_3$.

$$\frac{}{\mathtt{if}\ \mathtt{true}\ \mathtt{then}\ e_2\ \mathtt{else}\ e_3 \mapsto e_2}$$
And if it is $\mathtt{false}$, we do the opposite and step to $e_3$, ignoring $e_2$.

$$\frac{}{\mathtt{if}\ \mathtt{false}\ \mathtt{then}\ e_2\ \mathtt{else}\ e_3 \mapsto e_3}$$
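For instance, under these rules, a conditional whose condition is itself a conditional evaluates in two steps:

$$\mathtt{if}\ (\mathtt{if}\ \mathtt{true}\ \mathtt{then}\ \mathtt{false}\ \mathtt{else}\ \mathtt{true})\ \mathtt{then}\ \overline{1}\ \mathtt{else}\ \overline{2} \mapsto \mathtt{if}\ \mathtt{false}\ \mathtt{then}\ \overline{1}\ \mathtt{else}\ \overline{2} \mapsto \overline{2}$$

The first step uses the rule for stepping the condition (and the condition itself steps by the rule for $\mathtt{true}$); the second step uses the rule for $\mathtt{false}$. The result, $\overline{2}$, is a value, so evaluation is done.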
With that, we’ve defined the dynamic semantics.
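In Lean, the stepping judgement could similarly be sketched as an inductive relation between expressions, one constructor per rule. Again, the names here are illustrative.

```lean
-- Illustrative sketch; the actual Lean definitions may differ.
-- e ↦ e': the expression e takes a single step to e'.
inductive Step : Exp → Exp → Prop
  | ite {e1 e1' e2 e3 : Exp} :
      Step e1 e1' →
      Step (Exp.ite e1 e2 e3) (Exp.ite e1' e2 e3)
  | iteTrue {e2 e3 : Exp} : Step (Exp.ite Exp.true e2 e3) e2
  | iteFalse {e2 e3 : Exp} : Step (Exp.ite Exp.false e2 e3) e3
```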
But we immediately run into a problem.
The problem
We earlier informally defined what it means to evaluate an expression:
“Evaluating an expression” is… defined as repeatedly stepping an expression until it is a value.
However, under this definition, there exist expressions that cannot be evaluated. For instance:

$$\mathtt{if}\ \overline{1}\ \mathtt{then}\ \mathtt{true}\ \mathtt{else}\ \mathtt{false}$$
This expression cannot take a step, because none of the rules for $\mapsto$ apply. This is because we only defined how to take a step when the condition $e_1$ was either $\mathtt{true}$, $\mathtt{false}$, or able to take a step itself. We didn’t define what to do if $e_1$ is an integer literal.
Yet, this expression is also not a value, because none of the rules for $\mathsf{val}$ apply. Thus, the expression is “stuck”.
There are generally two things we can do here. One is to go back and add more rules to the dynamics to define how to evaluate these kinds of expressions. For instance, we could emulate C-like languages with the following rules:

$$\frac{n \neq 0}{\mathtt{if}\ \overline{n}\ \mathtt{then}\ e_2\ \mathtt{else}\ e_3 \mapsto e_2} \qquad \frac{}{\mathtt{if}\ \overline{0}\ \mathtt{then}\ e_2\ \mathtt{else}\ e_3 \mapsto e_3}$$
This treats $\overline{0}$ like $\mathtt{false}$ and any other integer literal like $\mathtt{true}$.
The other thing we could do is define a notion of a “valid” expression, and only permit evaluation of these valid expressions. This is the approach we will use.
To define which expressions are valid, we will introduce a static semantics.
Statics: $e : \tau$
The static semantics tell us which expressions are valid, and thus permitted to evaluate.
First, we introduce the notion of a type, $\tau$. For now, we have only two: integer and boolean.

$$\tau ::= \mathtt{int} \mid \mathtt{bool}$$
Next, we introduce another judgement, $e : \tau$, read “$e$ has type $\tau$” or “the type of $e$ is $\tau$”. This judgement defines what it means for an expression to be valid: an expression $e$ is valid if there exists a type $\tau$ for which $e : \tau$ holds.
Now, we define $e : \tau$ by writing its rules, starting with rules for the literals. Integer literals have integer type, and boolean literals have boolean type.

$$\frac{}{\overline{n} : \mathtt{int}} \qquad \frac{}{\mathtt{true} : \mathtt{bool}} \qquad \frac{}{\mathtt{false} : \mathtt{bool}}$$
For conditional expressions, we require that $e_1$ has boolean type. We also require that $e_2$ and $e_3$ have the same type $\tau$ (but that type could be any type). Then we say the whole conditional expression has that type.

$$\frac{e_1 : \mathtt{bool} \qquad e_2 : \tau \qquad e_3 : \tau}{\mathtt{if}\ e_1\ \mathtt{then}\ e_2\ \mathtt{else}\ e_3 : \tau}$$
Note that because $e_1$ must have boolean type, expressions like the previously considered

$$\mathtt{if}\ \overline{1}\ \mathtt{then}\ \mathtt{true}\ \mathtt{else}\ \mathtt{false}$$

are now disallowed by the static semantics.
This completes the definition of the static semantics.
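As with the dynamics, here is a sketch of how the statics might look in Lean, building on the `Exp` type from before: a small inductive type of types and an inductive typing predicate, with one constructor per rule. The names remain illustrative.

```lean
-- Illustrative sketch; the actual Lean definitions may differ.
-- The types of Hatsugen.
inductive Typ : Type
  | int : Typ
  | bool : Typ

-- e : τ, written here as HasTyp e t.
inductive HasTyp : Exp → Typ → Prop
  | int (n : Int) : HasTyp (Exp.int n) Typ.int
  | true : HasTyp Exp.true Typ.bool
  | false : HasTyp Exp.false Typ.bool
  | ite {e1 e2 e3 : Exp} {t : Typ} :
      HasTyp e1 Typ.bool → HasTyp e2 t → HasTyp e3 t →
      HasTyp (Exp.ite e1 e2 e3) t
```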
Theorems
We can now state and prove theorems about Hatsugen.
The first crucial theorem is that of progress. Progress says that well-typed expressions are either done evaluating or can keep evaluating.
More formally, progress states:
For all $e$ and $\tau$, if $e : \tau$, then $e\ \mathsf{val}$ or there exists $e'$ such that $e \mapsto e'$.
Note that, before we introduced the static semantics, we had the problem of certain expressions neither being values nor being able to step. Now that we have the statics at our disposal, we can use them to strengthen the precondition of the theorem so that it is provable.
Next, we have preservation. Preservation states that well-typed expressions that can keep evaluating preserve their types when evaluating.
Again, more formally:
For all $e$ and $e'$ and $\tau$, if $e : \tau$ and $e \mapsto e'$, then $e' : \tau$.
Note that taken together, we get the following safety theorem:
For all $e$ and $\tau$, if $e : \tau$, then $e\ \mathsf{val}$ or there exists $e'$ such that $e \mapsto e'$ and $e' : \tau$.
Crucially, note that in the case where $e \mapsto e'$, we also have that $e' : \tau$. This means that we can apply the safety theorem again on $e'$.
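To give a flavor of what this looks like in Lean, the statements of progress and preservation might be sketched as follows, with the proof bodies omitted. The actual statements and completed proofs live in the repository.

```lean
-- Illustrative sketch; the actual Lean statements and proofs may differ.
-- Progress: a well-typed expression is a value or can take a step.
theorem progress {e : Exp} {t : Typ} (h : HasTyp e t) :
    Val e ∨ ∃ e', Step e e' := by
  sorry

-- Preservation: stepping a well-typed expression preserves its type.
theorem preservation {e e' : Exp} {t : Typ}
    (ht : HasTyp e t) (hs : Step e e') : HasTyp e' t := by
  sorry
```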
The proofs are available on GitHub.
In the next post, we’ll add functions to Hatsugen.