Recursion & Tail Call Optimization

Recursion is, simply put, a function calling itself.

def someFunction(x):
    someFunction(x - 1)

If you try to run the above code, you'll notice it will run in an infinite loop until something on your machine crashes terribly. This is why every recursive function has two parts: a base case and recursive case. In the above example, we only have a recursive case: that's the part where you recursively call the function. For the function to ever complete, we need a way to break out of the loop.

def someFunction(x):
    if (x == 0)
      return 0;
    someFunction(x - 1)

A base case is usually a very simple return value, based on some sort of check. In the above example we are checking if we reduced x to 0, and in that case we simply return 0, breaking the recursion.

Recursion lets you solve a bunch of different problems. The canonical example of recursion is a factorial function. The factorial of a natural number is the product of all natural numbers smaller or equal than that number. For instance, the factorial of 4 is 4 * 3 * 2 * 1.

To implement this for an arbitrary number, we need some way to keep multiplying and decrementing the number, all the way until we reach 1. There's one thing that makes this easy: the factorial of 4, is the same as factorial of 3 (3 * 2 * 1) multiplied by 4. We can generally say that fact(x) = x * fact(x - 1). Using this, we write our function.

def fact(n):
    if (n <= 1)
        return 1;

    return n * fact(n - 1);

We can recursively call the fact function until we reach 1. When we call the function fact(4), it will call fact(3), and then that function will call fact(2), and so on. When the base case is reached, the function returns 1 to fact(2). fact(2) then finishes executing and returns 2 * 1. fact(3) can then return 3 * 2 * 1, and so on until the root function returns.

We defined the factorial function with just a few lines of code, and it looks similar to the mathematical definition from above.

If you don't like recursion, you can also write this as a regular old loop.

def fact(n):
    returnValue = 1;
    for (i = 1; i < n; i++)
        returnValue *= i

    return returnValue

In fact, you can write all recursive programs as loops, and vice versa. This is because recursion is Turing complete. Every program in the world can be written using recursion. Since this is also true for loops, both of these tools can be used to solve the same problems. At least in theory.

The Trouble With Recursion

Our two fact functions behave exactly the same as far as we can see, but behind the scenes there might be big and very important differences. It has to do with the way computers call functions.

Your code does not get executed line-by-line, it has to keep jumping around. When you call a function, you are jumping to the code inside that function, and then returning back to the call site. The compiler needs to keep track of these jumps so it doesn't lose its place. It does this by saving the location it needs to return to in the call stack.

A call stack is a stack that contains function calls, their parameters and local variables, as well as the return address. This allows the compiler to go back from the code inside the function to where the function was called. It's the compiler's version of leaving breadcrumbs so it knows how to find its way back.

When you call a function, the compiler adds the function, its parameters, and the return location to the stack. This function will only get cleared off the stack once it's done running.

[(fact, n = 3, returnAddress = 1)]

If fact calls another function, the compiler will add another item to the stack. In our case, fact is calling itself with 2, so our stack would have two elements.

[(fact, n = 3, returnAddress = 1),
 (fact, n = 2, returnAddress = 5)]

This other function calls fact(1), incrementing the length of the stack to three.

[(fact, n = 3, returnAddress = 1),
 (fact, n = 2, returnAddress = 5),
 (fact, n = 1, returnAddress = 5)]

fact(1) will return 1, and that's is when the compiler will pop fact(1) off the stack, and resume execution at the returnAddress. It will do this over and over until the root function returns.

To calculate the factorial of 3 recursively, we had to store three function calls at once. This is much worse than implementing the function with a for loop. With a for loop, you are not jumping to a new function, so there's no need to add items to the call stack. Our loop example takes up much less memory than the recursive one.

For small numbers this is not a problem, but for larger numbers and functions with a lot of data, memory consumption builds up. Recursive functions expect the compiler to store all of their calls at once. If the compiler runs out of memory, you will get a stack overflow exception. A problem so bad that they named a whole Q&A site after it.

The Fix

The above is only true for some recursive functions and in some languages. There is a way to go around the whole stack overflow issue, and it's called tail call optimization.

A tail call is simply the last instruction in a function. In the following function, the tail call is adding the numbers 2 and 5:

def f():
    print("I am some instruction!")
    return 2 + 5

A tail call is the last instruction that will be executed, not the instruction that is last in the text. For instance, in our fact example, the tail call is performing multiplication, and not the recursive call, because the multiplication operands have to get evaluated before the multiplication can take place.

Functions that do call themselves as the last instruction are called tail recursive functions. Tail recursive functions don't have the problem of stack overflows, because we can do a little trick that avoids adding new items onto the call stack.

There is nothing left to do inside a function after the tail call. You won't return back to the function after the tail call. This means you don't need to pass a return address to the function called inside the tail call. Since the whole point of the call stack is knowing where you need to return, you don't need a new item in the stack if you're calling a function inside a tail call.

To fix our memory issue, we can rewrite our fact function to be tail recursive.

def fact(n, i):
  if n < 0
    return 0
  if n == 0
    return 1
  if n == 1:
    return i
    return fact(n - 1, n * i)

If we want to calculate the factorial of 3, we would call fact(3, 1). This function behaves the same as our previous one, but you'll notice that the last instruction is a call to fact. This means that this function is tail recursive.

Why is this better than our previous example? Let's take a look at the call stack to see what's going on. When we call fact(3, 1) a new item is added to the call stack.

[(fact, n = 3, i = 1, returnAddress = 1)]

The function will call fact(2, 3). However, since we are at the end of the function, we no longer need the function's item in the call stack. It also means the return address for the next call will be the same as it is for the current call. So, we can make a shortcut: instead of allocating a new item on the stack, we simply use the one we already have, and just replace the parameters.

[(fact, n = 2, i = 3, returnAddress = 1)]

We can keep doing this for every recursive call until we get a result. During the whole time of the factorial calculation we are only using a maximum of one item in the call stack. And the best part is this is true no matter how large the number is!

Tail call elimination makes tail recursive functions as efficient as loops. Some languages only have recursion, and no loops. If it weren't for this optimization, they would be constantly running our of memory.

It's important to note that not all languages do this. Even mainstream languages like C and Java don't have tail call optimization. Newer languages like Swift and Kotlin perform tail call optimization, as do most functional programming languages.

Recursion Is Cool*

Recursion lets you write your code in a different, more expressive style. You can elegantly solve a lot of problems and write your code in a more mathematical style. Just keep in mind the memory and performance requirements of recursion in your language, and make sure you recurse in your tail call!

Prototypal Inheritance

Just in Time Compiler