Many programming languages have both terms and types. Terms are also sometimes called expressions.
Roughly speaking, terms, like
true, denote the data being manipulated,
while types, like
Boolean, describe what operations are permitted
For instance, if you have a term of type
Number, you might be able to do
things like add or subtract with other
Numbers. And if you have a term of type
false), you could do things like negate it or use it
to branch with an
Common in most programming languages are functions, to which one may pass a term
and get back a term. For instance, we could define the function
takes a term of type
Number and returns a term of type
Given the existence of both types and terms, though, we can consider four distinct varieties of function:
Let us examine each in detail, and how each is uniquely useful.
As mentioned, the most common variety of function is the one that takes a term
and returns a term, like
Consider the identity function, which is the function that returns its argument unchanged.
The implementation of the identity function is identical for any choice of parameter/return type: just return the term passed in. So, it would be convenient if, instead of having to define a "different" identity function for every type, we could have a single identity function that would work for any type. This is sometimes called a generic function.
What we can do is allow the identity function to take a type argument. Let us
T. We then take a term argument whose type is
T and return it.
The identity "function", then, is actually defined with two functions. First is
a function that takes a type
T and then returns a term. That term is also a
function. It takes a term of type
T, and then returns that term.
Commonly found alongside generic functions are generic types.
For instance, many programming languages provide a list data structure, which is a ordered sequence of elements that can be dynamically added to and removed from. Different programming languages call this data structure different things: list, array, vector, sequence, and so on, but the general idea is the same.
We would like a list data structure to permit the elements stored to be any
fixed type. That is, instead of separately defining
BooleanList, we would like to just define
List, and have it work for any
List itself is not a type, but rather a function that takes a type (the
type of the elements) and returns a type (the type of lists of that element
type). So, if
T is a type, then
List<T> is the type of a list of
Some languages have a fixed-length array type. This is a type which is a bit like a list, but its length is fixed, and thus part of the type itself. Languages like C and Rust permit types like this.
For instance, in Rust, the definition
const A: [u32; 3] = [2, 4, 6];
A to be an array, with a fixed length of 3, of 32-bit unsigned
This is a limited form of allowing terms in types, since here, the term
used in the type
However, Rust rejects the following function type:
fn foo(n: usize) -> [u32; n]
With the following error:
error[E0435]: attempt to use a non-constant value in a constant --> src/lib.rs:1:27 | 1 | fn foo(n: usize) -> [u32; n] | - ^ | | | this would need to be a `const`
We can take the Rust compiler's suggestion and make
n a "const parameter" (and
also capitalize it, to conform to style guidelines):
fn foo<const N: usize>() -> [u32; N]
But now, because
N is a const parameter, we can only pass values for it that
are known at compile time.
[u32; N] that contain, or "depend on", terms, are called dependent
types. Not many programming languages fully support dependent types, likely due
to their incredible expressive power. As seen in the example,
Rust only permits limited usage of these types.
To reiterate, the four varieties of functions are:
Most languages have term-term functions, but choose to allow or disallow the other three varieties of functions. There are three yes-or-no choices to make, and thus possible configurations.
We may visualize the three choices as dimensions, and thus organize the possibilities into a cube. The vertices of the cube represent languages that arise from choosing combinations of allowing or disallowing the three varieties of function. All vertices on the cube allow for term-to-term functions.
Some commonly-known vertices on the cube are shown below. Columns 1-4 correspond to the 4 varieties of function discussed.
|Simply typed lambda calculus||✓||×||×||×|
|Calculus of constructions||✓||✓||✓||✓|
Once we reach the calculus of constructions (CoC), the distinction between types and terms somewhat disappears, since each may freely appear in both themselves and the other. Indeed, as powerful as the CoC is, it has a very sparse syntax of terms, fully described by the following context-free grammar:
There is no separate syntax for types in the CoC: all terms and types are represented with just the above syntax.
I wrote up an implementation of the CoC in Rust for edification.
The calculus of constructions serves as the foundation for many dependently-typed programming languages, like Coq. Using the CoC as a foundation, Coq is able to express and prove mathematical theorems like the four-color theorem.
It's rather remarkable to me that functions and variables, the most basic realization of the concept of "abstraction", can be so powerful in allowing all different types of language features. In the words of jez, on variables:
I think variables are just so cool!
I think it's straight-up amazing that something so simple can at the same time be that powerful. Functions!