Sometimes it feels like programming languages didn't really change from the 60s up to now. When I feel like that, I often remind myself of the cool tools and features we have now that make our lives easier. We now have things like an integrated debugger, unit tests, static analysis, cool IDEs, typed arrays and others. Language progress is slow and iterative, there are no silver bullets that will come in and change the game.

Today I want to tell you about the next step in this iterative process. People are still researching this technology, but it has the potential to enter mainstream languages soon. And it all starts with one of the most fundamental concepts in computer science: **types**.

### The World of Types

Types are one of those things that are so integrated into our thinking that we hardly ever think about the concept itself. Why is 1 an `int`

but you put quotes around it suddenly turns into a `string`

? What *is* a type, really? Like a lot of things in programming, it depends on who you ask.

Types have many forms. Some type systems draw a very hard line between types and **values**. `3`

, `2`

and `1`

are all *values* of the type `integer`

, but `integer`

is not a value. It is a "baked in" language construct that is fundamentally different from a value. But this distinction is not necessary, and can be limiting.

If you free up types and make them *values* like any other, you enable some cool possibilities. Values can be stored, transformed and passed to functions. This means you can have functions that take a type as a *parameter*, creating generic functions: functions that can work with multiple types without overloading. You can have an array of values of a specific type, as opposed to doing weird pointer arithmetic and casting like you do in C. You can also construct new types wile the program is running, enabling things like automatic JSON deserialization.

But even if you treat types as values, there is still a rift between what you can do with types and what you can do with values. With `user`

instances, you can do things like compare their names, check their age or id, etc.

```
if user.name == "Marin" && user.age < 65 {
print("You can't retire yet!")
}
```

However, if you're doing the same with a `User`

*type*, you can only compare the type name and maybe the property names. Since it's a type and not an instance, you can't check the values of its properties.

```
if typeof(user) == User {
print("Well, it's a user. That's all I know")
}
```

Wouldn't it be cool if you could have a function which can only receive a non-empty list of users? Or a function which would receive only a well-formatted email address? For that to work, you would have to have a type "non empty array" or a type "email address". You would have to have a type which depends on a value, or a **dependent type**. In mainstream languages, this isn't possible.

In order for types to be useful, the compiler needs to verify them. If you say that a variable contains an `integer`

, it better not contain a `string`

, otherwise the compiler will complain. This is good, it stops you from writing bugs. Checking the types is pretty easy: if a function returns an `integer`

, and you try to return `"Marin"`

, that's an error.

But with dependent types things get more complicated. The problem lies in *when* the compiler checks the types. How can you check that an array contains exactly three values, when the program is not even running yet? How can you check that an integer is greater than 3, if that integer is not even assigned yet? The answer: *magic*... or, in other words, *math*. If you can mathematically prove that a set of integers is always larger than 3, then the compiler can check it.

### Math Time!

You use **mathematical induction** for creating proofs. Induction is a way of proving, without doubt, that something is true. For instance, you want to prove that the following mathematical formula is true *for every positive number*:

`1 + 2 + 3 + ... + x = x * (x + 1) / 2`

There are infinite possible `x`

s, so it would take us a very long time to check every number by hand. Thankfully, we don't need to do that. All we have to do is prove two things:

- The statement is true for the first number. (Usually this is 0 or 1)
- If the statement is true for one number
`n`

, it is true for the next number`n + 1`

Because it's true for the first number and also for every successive number, we know that the statement is true for all possible numbers.

Proving that it's true for 1 is pretty easy:

```
1 = 1 * (1 + 1) / 2
1 = 1
```

Now we also need to prove that it works for all *other* numbers. We do this by *assuming* it works for some number `n`

, and check that it also works for `n + 1`

.

```
assuming this is true:
1 + 2 + 3 + ... + n = n * (n + 1) / 2
we check the case for n + 1:
(1 + 2 + 3 + ... + n) + (n + 1) = (n + 1) * ((n + 1) + 1) / 2
we can replace "(1 + 2 + 3 + ... + n)" with the equation above
(n * (n + 1) / 2) + (n + 1) = (n + 1) * ((n + 2) / 2)
and we simplify to
(n + 1) * (n/2 + 1) = (n + 1) * (n/2 + 1)
```

Since both sides of the equals are the same, we know that the statement is true. This is one of the ways you can prove statements without having to manually go through each case, and this is the basis of how dependent types work. You write mathematical proofs that a statement about a type is true.

The beauty of this is that all mathematical proofs can be written as computer programs, which is exactly what we need!

### Back to Programming

So we established that we can prove things are true before we have a concrete value. To do this in an actual programming language, we need a way to encode these statements into the type system itself, which means our type system needs an upgrade.

Let's look at an example. Say we have an `append`

function which takes
two arrays and combines them. Usually, the function signature would look something like this:

`append: (arr1: Array, arr2: Array) -> Array`

But just looking at this signature, we can't be sure the implementation is correct. Just because the function returns an array, doesn't mean it actually did something. One way to check the result is to make sure the length of the result array is a sum of the lengths of the parameter arrays.

```
newArray = append([1], [2, 3])
assert(length(newArray) == 3)
```

But why check this at runtime, when we can have create a constraint that the compiler can check while the program is compiling:

```
append: (arr1: Array, arr2: Array) -> newArray: Array
where length(newArray) == length(arr1) + length(arr2)
```

We declare that `append`

is a function taking two `Array`

arguments and returning a new `Array`

argument which we named `newArray`

. Only this time we add the caveat the length of the new array needs to be the same as the sum of the lengths of the function's arguments. We're turning the above run-time *assertion* into compile-time *type*.

The above code is in the world of *types*, not *values*, so the `==`

is comparing the *return type* of `length`

, and not its value. For this to work, the return *type* of `length`

needs to tell us something about the actual number.

The way to make this work is to make sure each number is a separate type. The type `One`

can only contain one possible value: `1`

. Same for `Two`

, `Three`

and all other numbers. Of course this is really tedious work, but that's why we have programming. We can create a compiler that does this for us.

Once we've done that, we can create a separate type for an array holding 1, 2, 3 and other numbers of items. `ArrayOfOne`

, `ArrayOfTwo`

, etc.

With that, we can define a `length`

function which will take one of the above array types, and have a *dependent* return type of `One`

for `ArrayOfOne`

, `Two`

for `ArrayOfTwo`

and so on, for each number.

Now that we have a separate type of every array length, we can make sure (at compile time) that two arrays have the same length by comparing their types. And since types are values like any other, we can define operations on them. We can define addition on the actual *types*, defining that the sum of `ArrayOfOne`

and `ArrayOfTwo`

is `ArrayOfThree`

.

This is all the info a compiler needs to make sure what you're writing is correct.

Let's say we want to create a variable of type `ArrayOfThree`

:

`result: ArrayOfThree = append([1], [2, 3])`

The compiler can tell that `[1]`

has only one value, so it can assign the type `ArrayOfOne`

. It can also assign `ArrayOfTwo`

to `[2, 3]`

.

The compiler knows that the type of `result`

must be equal to the type of the first argument, plus the type of the second argument. It also knows that `ArrayOfOne`

+ `ArrayOfTwo`

is `ArrayOfThree`

, so it knows that the whole expression on the right hand side is of the type `ArrayOfThree`

. This matches up with that's on the left, which makes the compiler happy.

If we tried to write the following:

`result: ArrayOfTwo = append([1], [2, 3])`

The compiler would not be happy at all, because it knows that the type is wrong.

### Dependent Typing is Pretty Cool

This makes a huge number of bugs simply impossible to create. With dependent typing, you can avoid off-by-one errors, accessing non existent array indices, null pointer exceptions, infinite loops and non performant code.

You can express almost anything with dependent types. A factorial function which only accepts natural numbers, a `login`

function which doesn't accept empty strings, a `removeLast`

function which only accepts non-empty arrays. And all this is checked before you run the program.

The problem with runtime checks is that they fail when the program is already running. This is okay if you're the one running it, but not if the user is. Dependent types let you move those checks to the type system itself, making it impossible to fail while the program is running.

I believe dependent typing is the future of mainstream programming languages, and it can't come soon enough!

### References:

Proof that all mathematical proofs can be written as programs

https://en.wikipedia.org/wiki/Curry–Howard_correspondence

Idris, a dependently typed language

https://www.idris-lang.org

F*, another dependently typed language with a cool looking logo

https://www.fstar-lang.org/tutorial/

Adding dependent typing to JavaScript

http://goto.ucsd.edu/~ravi/research/oopsla12-djs.pdf