How to simulate "tail recursion" in C #?

15

In .Net, I know that it is possible to make cause calls because the F # compiler, when optimizing code, transforms a function with tail recursion into a function with a loop, thus avoiding stack overflows. To make it clearer, I'll use a factorial function as an example.

In F #:

let rec fatorial_iter n r : int64 =
  if n <= 1L then r else fatorial_iter (n - 1L) (n * r)

let fatorial n = fatorial_iter n 1L

This prevents stack bursts because the stack will always be the same size after optimization. Illustrating a call to fatorial(5) :

fatorial(5, 1)   -> fatorial(5 - 1, 5 * 1)  ->
fatorial(4, 5)   -> fatorial(4 - 1, 4 * 5)  ->
fatorial(3, 20)  -> fatorial(3 - 1, 3 * 20) ->
fatorial(2, 60)  -> fatorial(2 - 1, 2 * 60) ->
fatorial(1, 120) -> 120

If we try to do this in C #, we can also make the recursive call on the tail, as in this example:

static long Fatorial(int n, long r = 1) {
    return n <= 1 ? r : Fatorial(n - 1, n * r);
}

But the C # compiler can not optimize this code (at least not yet, using .Net Framework 4.5).

Is there any way, without losing the immutability of the code, to prevent this function from causing a stack overflow?

    
asked by anonymous 03.02.2014 / 19:46

4 answers

7

As others have mentioned, C # does not give you a guarantee that a tail recursion will be optimized. I'm very fan of tail recursion but I think you're focusing a bit in the wrong direction - I think flexible flow control is a more crucial point of tail recursion than immutability and this changes somewhat the way address this problem.

In a simple loop case with your example, using tail recurs ends up being as mutable and low level as writing your code using gotos. In each step you have to update the accumulator and counter and to say that you will jump back to the beginning of the loop.

  int acc = 1;
  int i = N;
loop:
   if (i >= 0) {
     // finge que pode usar atribuição múltipla estilo Python
     i, acc = i-1, acc*i; goto loop;
   } else {
     return acc;
   }

The only advantage of tail recursion compared to gotos is the assignment of more than one variable in one step and the compiler will warn you if you forget to tell the new value to one of the variables. Anyway, a for loop ends up being more structured and high level, since you only have to take care of the logic to update the product and the updating of the counter and the gotos comes for free. It's a bit similar to programming using a fold instead of tail recursion

int acc = 1;
for (int i = N; i >= 0; i--){
   acc *= i;
}

Only tail recursion is not just for making simple loops in which a function calls itself. The part where tail recursion makes the most difference is when you have more than one mutually recursive function. A forced example is this state machine defined in Haskell:

par 0 = True
par n = impar (n-1)

impar 0 = False
impar m = par (m-1)

The equivalent of this without recursion is a state machine:

  int n;
  int m;
par:
  if (n == 0) {
    return true;
  } else {
    m = n - 1;
    goto impar;
  }
impar:
  if (n == 0) {
    return false;
  } else {
    n = m - 1;
    goto par;
  }

However in this release all functions have to be combined in a single code snippet, which hurts the encapsulation. For example, we need to declare all the parameter variables at the top, which makes it possible to use them in the wrong place.

In addition, tail recursion is also present when we call a function that was passed to us as a parameter. This is equivalent to a computed goto and can not be translated into a static loop.

In these last two cases, the language support for tail recursion is most needed. If you have to do something equivalent, the closest will be to use the standard "trampoline", but it is kind of inefficient and well chatinho to use.

    
03.02.2014 / 21:33
4

This type of optimization is only present in 64-bit JIT in Release mode.

Test your x64 code in Release mode and StackOverflow will no longer occur.

And of course, never rely on JIT optimization. You do not know which framework, platform or architecture your code will run on.

Code tested here:

class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine(Fatorial(2000000));
    }

    static long Fatorial(int n, long r = 1)
    {
        return n <= 1 ? r : Fatorial(n - 1, n * r);
    }
}
    
03.02.2014 / 20:05
1

In this case the only guaranteed way of not causing a StackOverflow would be to use a non-recursive algorithm. For example:

static long Fatorial(int n) 
{
    long r = 1;

    for (; n > 1; n--)
    {
        r *= n;
    }

    return r;
}
    
03.02.2014 / 20:43
0

I may be talking nonsense now, but I believe JCKödel's response is incorrect.

I think you could only compute such a large factorial number just because you're using the 64-bit architecture and not because of any optimization.

NOTE: I had to post as a reply, I can not "comment" on other answers yet, I do not have enough reputation.

    
03.02.2014 / 21:50