Thursday 8 January 2009

Refactoring Towards the Y Combinator: Part 3

This is a multi-part series. Part 1. Part 2. Part 3. Part 4. Part 5. Part 6.

I would like to begin this post with an explanation of a technique called currying. Currying is a mechanism that allows a function that takes multiple arguments to be converted to a function that takes only a single argument. I will be using it today, but it would probably be best to look at currying in a simpler context before it is applied to the factorial scenario. I’ve constructed a simple example of currying based on adding two numbers. The inline comments should explain what is going on. Brief disclaimer: currying is another one of those theoretical computer sciency concepts that I’ve understood just enough to get by. If my explanation sucks, please leave a comment so I can fix it.

// The addition function takes two integers, and returns the result of adding them together.
Func<int, int, int> add = (x, y) => x + y;
int five = add(2, 3);

// If you often needed to add two to something, you could explicitly define a function that does so.
// It would only need to take one argument.
Func<int, int> addTwo = x => add(x, 2);
five = addTwo(3);

// Perhaps you also need to add four often. Again, you could make a function for it, and it would only take one argument.
Func<int, int> addFour = x => add(x, 4);
five = addFour(1);

// Now there is some duplication between addTwo and addFour. What if we could make a function, that makes adders?
Func<int, Func<int, int>> makeAdder = (int valueToAdd) =>
{
   return x => add(x, valueToAdd);
};

// Excellent, now we can reconstruct addTwo and addFour using makeAdder
addTwo = makeAdder(2);
addFour = makeAdder(4);

// And use them like before
five = addTwo(3);
five = addFour(1);
          
// Lets compute 6 + 4, by making an adder that adds 6 to numbers, an then passing 4 to it.
int ten = makeAdder(6)(4);

// Well it appears we can now add any two numbers together by using makeAdder. But makeAdder only takes one argument, not two.
// makeAdder is the result of applying currying to the add method.

Here is another way of stating it: if you have a function that takes arguments A and B and returns C, you can create a function that takes argument A, and returns a function that takes argument B and returns C. By calling this function with A, you then call the resulting function with B, and you get C. That is currying. Lets get back to factorials – here is the code the last post finished up with:

public delegate int SelfCallingFactorial(int x, SelfCallingFactorial self);

/// <summary>
/// Creating the factorial method inside the returned lambda resolves the garbage collection problem.
/// This solution does not involve a closure, so its pretty good. But...
/// </summary>
public class Solution03 : ISolution
{
   public Factorial MakeFactorialFunction()
   {
       return (int x1) =>
       {
           SelfCallingFactorial factorial = (int x, SelfCallingFactorial self) =>
           {
               return x < 2 ? 1 : x * self(x - 1, self);
           };

           return factorial(x1, factorial);
       };
   }
}

Its sort of annoying how I have to do all my work inside a lambda. I’ve had to do this for the last few solutions, ever since I introduced the idea of using self calling function. It would be great if the SelfCallingFactorial could return a Factorial (the delegate). That way I could just return the result directly, and I wouldn’t have to deal with the fact that I have two integer parameters running about (x and x1). This change can be achieved by using currying. Its somewhat tricky so I will do it in two steps. The first step is to perform the currying operation, making as few changes as possible, the second step will clean up, simplify and generalize. Here is the first part:

public delegate Factorial SelfCallingFactorial2(SelfCallingFactorial2 self);

/// <summary>
/// Apply a technique called currying. Now the self calling factorial function returns a factorial delegate.
/// </summary>
public class Solution04 : ISolution
{
   public Factorial MakeFactorialFunction()
   {
       return (int x1) =>
       {
           SelfCallingFactorial2 factorial = (SelfCallingFactorial2 self) =>
           {
               return (int x) =>
               {
                   return x < 2 ? 1 : x*self(self)(x-1);
               };
           };

           return factorial(factorial)(x1);
       };
   }
}

I had to modify the SelfCallingFactorial delegate. It no longer accepts an integer, instead it returns a Func<int,int> which I replaced with the Factorial delegate. You will notice I stuck a 2 on the end of the name, this is just to avoid clashes with previous delegate. This will get cleaned up in a second. Once the delegate was modified, I had to change the factorial function, and then its invocation. A side-by-side comparison might help:

image

Do you see how the sections in the boxes follow the same pattern, where instead of there being two arguments, there is just one argument, and then the result is invoked with the second argument?

Step two of this change is to take advantage of what I gained in step one. Now that the self calling factorial function returns a Factorial delegate, I don’t need to wrap the entire thing in a lambda – I can just return the result directly. I can also change the SelfCallingFactorial2 delegate to be generic. Ideally I could use it for other recursive algorithms such as Fibonacci. While I’m in cleanup mode, I want to rename the delegate to “SelfApplicable”. By doing this, I’ve changed the terminology somewhat – I’m using “applicable” instead of “calling”. I’ve done this because the delegate doesn’t actually “know” what the function is doing with the self argument. Its probably calling it, but perhaps its just inspecting its members? What the delegate can say is that the function is definitely applying itself to itself, because it takes itself as an argument, after all. This change also gets me closer to the code I’m aiming for in Mads’ post. So here is the result of applying these changes:

public delegate T SelfApplicable<T>(SelfApplicable<T> self);

/// <summary>
/// We can generalize the previous delegate to work with any function, and create SelfApplicable<T>.
/// I wish I could constrain T to be a delegate type but unfortunately it is not supported.
///
/// Because we removed the integer argument from the function, we no longer have to nest the entire body in
/// a lambda as we did in the previous step.
/// </summary>
public class Solution05 : ISolution
{
   public Factorial MakeFactorialFunction()
   {
       SelfApplicable<Factorial> factorial = (SelfApplicable<Factorial> self) =>
       {
           return (int x) =>
           {
               return x < 2 ? 1 : x * self(self)(x - 1);
           };
       };
      
       return factorial(factorial);
   }
}

Pretty neat huh! My only complaint with this code is that my factorial algorithm has this yucky self(self) business. Ideally I could refactor this out, so that the factorial algorithm is as readable as possible, and the tricky self calling nonsense is somewhere else. I shall pursue this in the next post.

No comments:

Post a Comment