Unit in the Last Place

What's the smallest number greater than 0? This is one of those simple questions with complicated answers somewhere between "there is none" and "it depends".

If you ask a mathematician, they will tell you that there can't be such a number because it would break math. If you have a number n, where n is the smallest number after 0, then there can't be a number n/2, because n is already the smallest. This means that division itself breaks down, which mathematicians don't like.

If you ask a computer, you will actually get an answer. As opposed to the real world, computers don't have an infinite amount of numbers because they simply couldn't fit. Computers store numbers in memory registers, and each register has a fixed number of bits. Imagine if you only had three digits. The biggest number you could represent would be 999. This is what happens in a computer.

This means that the set of integers in a computer is limited by the number of digits. In a computer, there is definitely a largest number (for instance, INT_MAX in C). It's the number with the maximum amount of digits, all of which are set to a binary 1. On an 8-bit system, this number would be 11111111.

We can complicate matters even more by including negative numbers. If we have 8 bits of data, we can use the first bit to represent the sign of the number. 0 for plus and 1 for minus. Now we have 7 bits for our digits, so the biggest number is 011111111 which is smaller than our previous biggest number.

We're still not done. We also need to represent decimal numbers. Even if 0.12 is a small number, it still has three digits, just like 123. The difference is there's one more thing we need to think about: the decimal dot, also called the radix point. We need to store both the digits, and the position of the radix point.

While integers are limited in how large they can be, decimal numbers are limited in both size and precision. If you have a fixed number of digits, there's only so many digits you can put after the decimal dot. This is why computers need to round decimal numbers.

How do we store these decimal numbers, then? Computers only understand integers, so we need a way to store a decimal number only using integers.

Say we have the number 3.14. Let's start by writing our all the digits of the number. We get 314. That's a start. We know that by multiplying by powers of 10, we can "move" the decimal point around the number. 314 * 10^-1 is 31.4, while 314 * 10^-2 is 3.14.

All we need to represent the number 3.14 is three integers: 314, 10 and -2. 314 is what's called a significand, and this is all the digits of the number written out.

10 is called a radix or a base. We know that by multiplying with powers of 10 we can move the decimal dot around numbers in base 10. The same works for all number bases: in base 2 (or binary), you can shift the dot by multiplying by powers of 2.

The power we shift by is called the exponent, and it tells us where the decimal dot is.

You can write every decimal number as those three numbers with a simple formula:

number = significand * base^exponent

3.14 = 314 * 10^-2
32.8 = 328 * 10^-1

A computer stores a decimal number by storing the sign, exponent and the significand in a single string of 32 or 64 bit digits. Usually there's 1 bit for the sign, 11 bits to store the exponent and 53 bits to store the significand, adding up to 64.

With that in mind, let's get back to our question: What is the smallest non-zero number? If we only have three digits to spare, the smallest possible number is 0.01. With four digits, it's 0.001. You'll notice a pattern here: the significand is always the same, only the exponent changes.

What we need is a significand of 1, because that's the smallest one after 0. We then need to shift the decimal point as far as we can to the left. To do this we need the smallest (most negative) possible exponent.

How small, depends on the layout of the number in memory. If we have 11 bits for the exponent, we can only write out a number which is 10 bits long, with 1 bit reserved for the sign. In a 64-bit system the smallest exponent is -308.

In the end, the smallest possible number in a 64-bit system would be around 1 * 10^-308. That's small!

We established that there is a smallest number. This number tells us how much we can trust our computer. If you're doing something that requires very large numbers or very precise numbers, you need to keep this number in mind.

What we just calculated is something called the unit in the last place, or ulp, of 0. Aside from being a really cool word, ulp tells us what is the minimum distance between two numbers in a computer. We calculated the ulp of 0, which is the minimum distance between 0 and the next number.

If you add the calculated value to 0 and try to compare them, they would not be the same number. However, if you add a value smaller than the ulp, it would still be the same number as far as the computer is concerned.

print(0 == 0 + ulp(0)) // false
print(0 == 0 + ulp(0) / 2) // true

To us it's obvious that adding a non-zero value to a number will produce a different number, but a computer has to round somewhere, so it can't necessarily tell if two numbers are the same.

To compare computer systems more easily, we use the ulp of 1, and call it the machine epsilon. Once you know the machine epsilon, you can calculate any other ulp with the following formula:

ulp(x) = machine epsilon * radix^exponent(x)

The value we calculated is very small, so you probably won't hit that limit while coding. But, we calculated the value for 0. With more digits needed for the left side of the decimal dot, the fewer we have for the right side. This means that the bigger the number, the less precision you have. In other words, the ulp is a direct function of the exponent. As you shift the decimal dot to the right, the ulp increases, and you lose precision.

I hope this information will help you next time you get a weird floating point error in your code. Remember, computers are pretty powerful, but even a computer has its limits.

References:

Unit in the last place - Wikipedia
https://en.wikipedia.org/wiki/Unit_in_the_last_place

Machine epsilon - Wikipedia
https://en.wikipedia.org/wiki/Machine_epsilon

IEEE-754 - A standard for representing floating point numbers on computers
https://en.wikipedia.org/wiki/IEEE_754

Stack Based Programming

Dependent Type