Visualising recursive functions

I’ve been doing a lot of coding practise, and recursive functions are one of those things that’s easier to visualise than think about.

At it’s heart, a recursive function is one that calls itself in over and over again until a result is found. At first that doesn’t make sense so let’s look at an example.

Factorial

The factorial of a number is the product off all numbers from that number down to 1. Two examples follow:

3! = 3 X 2 X 1 = 6
4! = 4 X 3 X 2 X 1 = 24
etc...

Looking closely, the factorial of any number is the product of that number and the factorial of the number below. i.e. The factorial of 4 is the same as 4 X the factorial of 3. Carrying on with this example, the factorial of 3 is the same as 3 X the factorial of 2.

This is a perfect use case of a recursive function, as a function can work out the factorial of any number by multiplying it by the factorial of a number 1 smaller. It is important to ensure that this recursiveness does not happen indefinitely. If a function calls itself forever, you’ll have a loop that never ends. In the above factorial, as soon as we get to 1, there is no need to call the function again, rather that should be used as an exit to the function.
a764b1ed93f5fae80373f990de499c79ef0e2b0b3f950cb6b42ed9294de3b947

I’ll write up a quick factorial recursive function in C:

#include <stdio.h>
#include <stdlib.h>

int factorial(int number) {

    //provide a way out of the function
    if (number == 1) {
        return 1;
    }
    
    // otherwide multiply value with factorial of smaller number
    else {
        return number * factorial(number - 1);
    }
}

int main(int argc, char* argv[]) {

    int number = atoi(argv[1]);
    printf("The factorial of %i is %i\n", number, factorial(number));
}

So the program is simple enough. Take a number via an argument, convert it to an integer, then print out the factorial of it. Let’s pass a few values to see the result:

$ ./factorial 3
The factorial of 3 is 6
$ ./factorial 4
The factorial of 4 is 24
$ ./factorial 5
The factorial of 5 is 120
$ ./factorial 6
The factorial of 6 is 720

If I pass a factorial of 4, what I’m actually doing is as follows:

4 X (3 X (2 X (1)))

Each call to factorial will call the function again with a number 1 smaller. This continues until the number is one. This is then returned to the previous function, returned to the previous function, and so on. i.e. The function will be continue to call itself until it gets to 1. That value will then be returned to the calling function and so on. In essence it gets to 1, then works it way out from the inside.

If I left out the case where number == 1, this function would loop forever. Let’s give this a test:

int factorial(int number) {
    return number * factorial(number - 1);
}
$ ./factorial 3
Bus error (core dumped)

Nope. I do wonder how far it got before it gets to this message though.

int factorial(int number) {
    printf("number is now %i\n", number);
    return number * factorial(number - 1);

}
$ ./factorial 3 > error.txt
Bus error (core dumped)

$ tail error.txt 
number is now -261597
number is now -261598
number is now -261599
number is now -261600
number is now -261601
number is now -261602
number is now -261603
number is now -261604
number is now -261605
number is now -261606

So we got to -261606 before this program crashed. But why did it crash? Should this function not just continue on forever, subtracting 1 each time? The reason has to do with how recursive functions are viewed in memory. Looking back at the factorial above, I noted that this is how the function saw the factorial of 4:

4 X (3 X (2 X (1)))

The computer needs to work this out from the inside out. i.e. It needs to get to number 1, to return that value to the previous call, which then passes that number back to the previous call to the function. Each time a function calls itself, that function gets added to the stack.

e7633bedf897bb24ce668ac9c5df6bf88a58ff7e114d27606a756f4c4888a3f1
Let’s visualise this. The program has just started and we have a stack up on which main has been called:
Untitled drawing
main() calls factorial(), which gets added to the stack:
Untitled drawing (1)

There is a finite limit to how big the stack is, and so a function can only call itself so many times. The larger the original number in our factorial calculation, the larger the amount of times our function calls itself. At some point we are going to run out of space on our stack. Remember one we get to the end, the result of the final calculation will be passed to the previous function, where it’s result will then be passed back and so on. Storing those functions and variables take space.
Untitled drawing (2)
At which point the program will crash. I’ll now change the program again to show the memory location of the variable passed into it each time. A new memory location will be used each time, until we run out:

int factorial(int number) {

    int* pointer = &number;
    printf("number is now %i and memory location is %p\n", number, pointer);
    
    return number * factorial(number - 1);
}
$ ./factorial 3 > error.txt
Bus error (core dumped)

$ tail error.txt 
number is now -174516 and memory location is 0xbf68ea14
number is now -174517 and memory location is 0xbf68e9e4
number is now -174518 and memory location is 0xbf68e9b4
number is now -174519 and memory location is 0xbf68e984
number is now -174520 and memory location is 0xbf68e954
number is now -174521 and memory location is 0xbf68e924
number is now -174522 and memory location is 0xbf68e8f4
number is now -174523 and memory location is 0xbf68e8c4
number is now -174524 and memory location is 0xbf68e894

Overflowing the stack is called a stack overflow. So at least you now know where the name stackoverflow.com comes from.

There is a program called valgrind that will actually show us what error was encountered:

$ valgrind ./factorial 3
number is now -174555 and memory location is 0xbe233654
==3145== Stack overflow in thread 1: can't grow stack to 0xbe232ffc
==3145== 
==3145== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==3145==  Access not within mapped region at address 0xBE232FFC
==3145==    at 0x40FAB9C: _IO_file_write@@GLIBC_2.1 (fileops.c:1261)
==3145==  If you believe this happened as a result of a stack
==3145==  overflow in your program's main thread (unlikely but
==3145==  possible), you can try to increase the size of the
==3145==  main thread stack using the --main-stacksize= flag.
==3145==  The main thread stack size used in this run was 8388608.
==3145== Stack overflow in thread 1: can't grow stack to 0xbe232ff8
==3145== 
==3145== Process terminating with default action of signal 11 (SIGSEGV)
==3145==  Access not within mapped region at address 0xBE232FF8
==3145==    at 0x4024510: _vgnU_freeres (in /usr/lib/valgrind/vgpreload_core-x86-linux.so)
==3145==  If you believe this happened as a result of a stack
==3145==  overflow in your program's main thread (unlikely but
==3145==  possible), you can try to increase the size of the
==3145==  main thread stack using the --main-stacksize= flag.
==3145==  The main thread stack size used in this run was 8388608.
==3145== 
==3145== HEAP SUMMARY:
==3145==     in use at exit: 0 bytes in 0 blocks
==3145==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==3145== 
==3145== All heap blocks were freed -- no leaks are possible
==3145== 
==3145== For counts of detected and suppressed errors, rerun with: -v
==3145== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)

So there you have it :)