Thursday, 8 January 2009

Refactoring Towards the Y Combinator: Part 3

This is a multi-part series. Part 1. Part 2. Part 3. Part 4. Part 5. Part 6.

I would like to begin this post with an explanation of a technique called currying. Currying is a mechanism that converts a function taking multiple arguments into a chain of functions that each take a single argument. I will be using it today, but it is probably best to look at currying in a simpler context before applying it to the factorial scenario. I’ve constructed a simple example of currying based on adding two numbers; the inline comments should explain what is going on. Brief disclaimer: currying is another one of those theoretical computer sciency concepts that I’ve understood just enough to get by. If my explanation sucks, please leave a comment so I can fix it.

// The addition function takes two integers, and returns the result of adding them together.
Func<int, int, int> add = (x, y) => x + y;
int five = add(2, 3);

// If you often needed to add two to something, you could explicitly define a function that does so.
// It would only need to take one argument.
Func<int, int> addTwo = x => add(x, 2);
five = addTwo(3);

// Perhaps you also need to add four often. Again, you could make a function for it, and it would only take one argument.
Func<int, int> addFour = x => add(x, 4);
five = addFour(1);

// Now there is some duplication between addTwo and addFour. What if we could make a function that makes adders?
Func<int, Func<int, int>> makeAdder = (int valueToAdd) =>
{
   return x => add(x, valueToAdd);
};

// Excellent, now we can reconstruct addTwo and addFour using makeAdder
addTwo = makeAdder(2);
addFour = makeAdder(4);

// And use them like before
five = addTwo(3);
five = addFour(1);
          
// Let's compute 6 + 4, by making an adder that adds 6 to numbers, and then passing 4 to it.
int ten = makeAdder(6)(4);

// Well it appears we can now add any two numbers together by using makeAdder. But makeAdder only takes one argument, not two.
// makeAdder is the result of applying currying to the add method.

Here is another way of stating it: if you have a function that takes arguments A and B and returns C, you can create a function that takes argument A and returns a function that takes argument B and returns C. You call this function with A, then call the resulting function with B, and you get C. That is currying. Let’s get back to factorials – here is the code the last post finished up with:

public delegate int SelfCallingFactorial(int x, SelfCallingFactorial self);

/// <summary>
/// Creating the factorial method inside the returned lambda resolves the garbage collection problem.
/// This solution does not involve a closure, so it's pretty good. But...
/// </summary>
public class Solution03 : ISolution
{
   public Factorial MakeFactorialFunction()
   {
       return (int x1) =>
       {
           SelfCallingFactorial factorial = (int x, SelfCallingFactorial self) =>
           {
               return x < 2 ? 1 : x * self(x - 1, self);
           };

           return factorial(x1, factorial);
       };
   }
}

It’s sort of annoying how I have to do all my work inside a lambda. I’ve had to do this for the last few solutions, ever since I introduced the idea of using a self-calling function. It would be great if the SelfCallingFactorial could return a Factorial (the delegate). That way I could just return the result directly, and I wouldn’t have to deal with the fact that I have two integer parameters running about (x and x1). This change can be achieved by using currying. It’s somewhat tricky, so I will do it in two steps: the first step is to perform the currying operation, making as few changes as possible; the second step will clean up, simplify and generalize. Here is the first part:

public delegate Factorial SelfCallingFactorial2(SelfCallingFactorial2 self);

/// <summary>
/// Apply a technique called currying. Now the self calling factorial function returns a factorial delegate.
/// </summary>
public class Solution04 : ISolution
{
   public Factorial MakeFactorialFunction()
   {
       return (int x1) =>
       {
           SelfCallingFactorial2 factorial = (SelfCallingFactorial2 self) =>
           {
               return (int x) =>
               {
                   return x < 2 ? 1 : x*self(self)(x-1);
               };
           };

           return factorial(factorial)(x1);
       };
   }
}

I had to modify the SelfCallingFactorial delegate. It no longer accepts an integer; instead it returns a Func<int,int>, which I replaced with the Factorial delegate. You will notice I stuck a 2 on the end of the name; this is just to avoid clashing with the previous delegate, and will get cleaned up in a second. Once the delegate was modified, I had to change the factorial function, and then its invocation. A side-by-side comparison might help:

[image: side-by-side comparison of the Solution03 and Solution04 lambdas]

Do you see how the sections in the boxes follow the same pattern, where instead of there being two arguments, there is just one argument, and then the result is invoked with the second argument?
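In case the image does not come through, here is the same comparison as code you can run. Both shapes compute 5! = 120; the only difference is whether the second argument is passed in the same call or to the returned delegate:

```csharp
using System;

public delegate int Factorial(int x);
public delegate int SelfCallingFactorial(int x, SelfCallingFactorial self);
public delegate Factorial SelfCallingFactorial2(SelfCallingFactorial2 self);

// Solution03 shape: both arguments in one call.
SelfCallingFactorial before = (x, self) => x < 2 ? 1 : x * self(x - 1, self);
int a = before(5, before);       // factorial(x1, factorial)

// Solution04 shape: one argument, then invoke the result with the second.
SelfCallingFactorial2 after = self => x => x < 2 ? 1 : x * self(self)(x - 1);
int b = after(after)(5);         // factorial(factorial)(x1)
```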

Step two of this change is to take advantage of what I gained in step one. Now that the self calling factorial function returns a Factorial delegate, I don’t need to wrap the entire thing in a lambda – I can just return the result directly. I can also change the SelfCallingFactorial2 delegate to be generic. Ideally I could use it for other recursive algorithms such as Fibonacci. While I’m in cleanup mode, I want to rename the delegate to “SelfApplicable”. By doing this, I’ve changed the terminology somewhat – I’m using “applicable” instead of “calling”. I’ve done this because the delegate doesn’t actually “know” what the function is doing with the self argument. It’s probably calling it, but perhaps it’s just inspecting its members? What the delegate can say is that the function is definitely applying itself to itself, because it takes itself as an argument, after all. This change also gets me closer to the code I’m aiming for in Mads’ post. So here is the result of applying these changes:

public delegate T SelfApplicable<T>(SelfApplicable<T> self);

/// <summary>
/// We can generalize the previous delegate to work with any function, and create SelfApplicable<T>.
/// I wish I could constrain T to be a delegate type but unfortunately it is not supported.
///
/// Because we removed the integer argument from the function, we no longer have to nest the entire body in
/// a lambda as we did in the previous step.
/// </summary>
public class Solution05 : ISolution
{
   public Factorial MakeFactorialFunction()
   {
       SelfApplicable<Factorial> factorial = (SelfApplicable<Factorial> self) =>
       {
           return (int x) =>
           {
               return x < 2 ? 1 : x * self(self)(x - 1);
           };
       };
      
       return factorial(factorial);
   }
}

Pretty neat huh! My only complaint with this code is that my factorial algorithm has this yucky self(self) business. Ideally I could refactor this out, so that the factorial algorithm is as readable as possible, and the tricky self calling nonsense is somewhere else. I shall pursue this in the next post.
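As a quick check that the generalization actually buys something, here is SelfApplicable<T> reused for Fibonacci, as hinted at above. This is only a sketch; the Fib delegate is mine, introduced just for this example:

```csharp
using System;

public delegate T SelfApplicable<T>(SelfApplicable<T> self);
public delegate int Fib(int n);

// The same self-application trick, with a different algorithm plugged in.
SelfApplicable<Fib> fib = self => n => n < 2 ? n : self(self)(n - 1) + self(self)(n - 2);
Fib fibonacci = fib(fib);

int result = fibonacci(10);  // 55
```

The yucky self(self) business is still here, of course – which is exactly what the next post sets out to fix.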

Tuesday, 6 January 2009

Refactoring Towards the Y Combinator: Part 2

This is a multi-part series. Part 1. Part 2. Part 3. Part 4. Part 5. Part 6.

Let’s get started with discovering the Y combinator. To start off, I came up with the most straightforward recursive implementation of the factorial function I could think of, restricting myself to anonymous methods (or more specifically, lambdas):

public delegate int Factorial(int x);

/// <summary>
/// This is the cheating way. Define a variable and use a closure on that variable.
/// It's cheating because it only works in an impure language.
/// </summary>
public class Solution01 : ISolution
{
    public Factorial MakeFactorialFunction()
    {            
        Factorial factorial = null;
        factorial = (int x) =>
        {
            return x < 2 ? 1 : x * factorial(x - 1);
        };
        
        return factorial;
    }
}

I define a delegate for the Factorial function – it takes an int, returns an int. I could have used Func<int,int> but believe me, as this gets more complicated, you’ll be glad I introduced the delegate. I also define a MakeFactorialFunction() method, which is intended to be used like so:

var s1 = new Solution01();
Factorial f = s1.MakeFactorialFunction();
int factorialOf10 = f(10);

You can mostly ignore the Solution01 and ISolution stuff; they are just artefacts from the testing harness I set up. Let’s focus on the code that matters:

Factorial factorial = null;
factorial = (int x) =>
{
    return x < 2 ? 1 : x * factorial(x - 1);
};

return factorial;

In this code, I call factorial recursively by utilizing a closure. The body of the lambda expression refers to a variable that lies outside the scope of the lambda; that variable is ‘factorial’. You might have noticed that my code comment above states that I consider this implementation to be “cheating”. This is because I reassign the value of factorial: first it is null, then it is a lambda expression. Perhaps you are wondering why I bother reassigning it. Why not just do it like this:

Factorial factorial = (int x) =>
{
    return x < 2 ? 1 : x * factorial(x - 1);
};

I am afraid this won’t work; the compiler flags it as an error, “use of unassigned local variable ‘factorial’”. This is why I need the extra step of declaring factorial to be null. Now, I said before that reassigning the factorial variable is cheating. Why? The Y combinator is much more relevant to the functional programming paradigm than it is to the imperative paradigm, and functional programming has a concept called “purity”, which means no side effects and no variables (no variables with varying values, that is). For the purposes of this exercise, it will be helpful to limit myself to pure operations. The reassignment of the factorial variable is an impure operation. So how do I get rid of it?

My next attempt:

public delegate int SelfCallingFactorial(int x, SelfCallingFactorial self);

/// <summary>
/// We can remove the impurity by having the factorial function accept a reference to itself, so it can call itself.
/// We can't just return this implementation because it doesn't match the Factorial signature, so we have to return
/// a lambda that calls it. Unfortunately, doing so will create a closure and as a result, as long as a reference
/// to the result of MakeFactorialFunction method is held, the Solution2 instance will not be garbage collected.
/// </summary>
public class Solution02 : ISolution
{
    public Factorial MakeFactorialFunction()
    {
        SelfCallingFactorial factorial = (int x, SelfCallingFactorial self) =>
        {
            return x < 2 ? 1 : x * self(x - 1, self);
        };

        return x =>
        {
            return factorial(x, factorial);
        };
    }
}

Okay, so what has happened? Rather than having the factorial function call itself by utilizing the variable, we pass a reference to the factorial function as an argument to itself. I modify the Factorial delegate to accept an instance of itself, and call it a SelfCallingFactorial instead. You can see that delegate definition at the top. I then define ‘factorial’ similarly to before, but this time I have a ‘self’ parameter which I can invoke as the recursive step. When it is invoked, I have to pass two arguments: the value I want to calculate the factorial of, and another self reference. Try to think of the ‘self’ parameter as a manually implemented equivalent of the ‘this’ keyword in normal C#, but for functions. The ‘this’ keyword allows code inside an object to obtain a reference to that object; similarly, my ‘self’ parameter allows code inside a function to call that function!

The result of this trick is that I have successfully removed the impurity – there is no reassignment in this solution. However, we are not finished yet: unfortunately, I cannot simply return ‘factorial’ after it has been defined, because it is a SelfCallingFactorial, and the MakeFactorialFunction method is supposed to return a Factorial. Instead I have to return a lambda that will execute my SelfCallingFactorial. This creates another closure, again on the ‘factorial’ variable. As mentioned in the code comment, the instance of Solution02 used to create the factorial function will stay in scope as long as the factorial function stays in scope, because the returned factorial function holds a reference to the ‘factorial’ local variable defined in the MakeFactorialFunction() method. Fortunately, this is easy to fix:

/// <summary>
/// Creating the factorial method inside the returned lambda resolves the garbage collection problem. 
/// This solution does not involve a closure, so it's pretty good. But...
/// </summary>
public class Solution03 : ISolution
{
    public Factorial MakeFactorialFunction()
    {
        return (int x1) =>
        {
            SelfCallingFactorial factorial = (int x, SelfCallingFactorial self) =>
            {
                return x < 2 ? 1 : x * self(x - 1, self);
            };

            return factorial(x1, factorial);
        };
    }
}

Solution03 is just a very minor revision of Solution02. All I had to do was move the code that defines the SelfCallingFactorial inside the returned lambda expression (if you look at the code above, you will see that it was previously outside). Now when a Solution03 instance is used to construct a factorial function, that instance is free to be garbage collected while the result, the factorial function, can continue to be used. Now please don’t get me wrong, I know that this garbage collection ‘issue’ is quite irrelevant – I am actually just using it as an excuse to get the code closer to my satisfaction. I am not exactly sure why I want it this way – it just seems neater.

So, where to go from Solution03? It seems pretty good, and we are still nowhere near the Y combinator. In the next instalment, I will start making the code a little more general, with the eventual goal that the same technique might be applied to other recursive algorithms (such as Fibonacci).

Sunday, 4 January 2009

Refactoring Towards the Y Combinator: Part 1

This is a multi-part series. Part 1. Part 2. Part 3. Part 4. Part 5. Part 6.

Sometimes I wish that I had studied computer science rather than software engineering at university. The software engineering degree was much more practical, and as a result there is a bunch of interesting computer science theory out there that I am yet to be exposed to. In an effort to make some tiny progress on this front, this Christmas break I decided to spend a few days exposing myself to a lovely little thing called the Y combinator.

What is the Y combinator exactly? Basically, it’s a mechanism that can be used to call functions recursively, even if your programming language does not support recursion, as long as it does have support for first-class functions (i.e. you can pass one function to another as an argument). It is extremely unlikely that you will find a use for it in your day-to-day work as a programmer, but I found learning about the Y combinator to be a worthwhile exercise nonetheless, simply because it made my head hurt in a good way.

I found three particularly useful resources for learning about the Y combinator. These were:

Just reading through them, they all managed to lose me at some point. The Y combinator is really a functional programming concept, so programmers such as myself who were crippled at a young age by being first exposed to an imperative language like Basic will really struggle with it. It was clear that the only way I was really going to understand it was to roll up my sleeves and write some code.

Mads’ post uses C#, and it had enough code to give me a clear goal. I would start with the most basic implementation of a recursive method I could, and try to slowly refactor it towards the final Mads code. This turned out to work very well, but I probably would not have succeeded without the help of the other resources, especially The Why of Y. Both Mike and Richard use Scheme (a dialect of Lisp), but fortunately I have watched enough SICP lectures to be able to follow it at a basic level. In the Why of Y, Richard provides an example derivation of Y, and I was able to successfully translate each of his steps to the appropriate C#. I promise you, there will be no Lisp code in any of the posts in this series.

So what does the Y combinator look like? Well here is the definition I ended up with:

SelfApplicable<HigherOrderEvaluator<Func<T, T>>> Y =
   self => algorithm => t => algorithm(self(self)(algorithm))(t);
HigherOrderEvaluator<Func<T, T>> ApplyY = Y(Y);

Here is my definition for the factorial function:

HigherOrderFunction<Func<int, int>> factorial =
               f => x => x < 2 ? 1 : x * f(x - 1);

And here is the usage:

public int CalculateFactorialOf10()
{
   return YCombinator<int>.ApplyY(factorial)(10);
}

What’s the phrase you’re looking for? How about: “My eyes! The goggles do nothing!”

Even now when I look at it, I struggle to make sense of it. This is the compressed version, but in my opinion even a far more verbose version is still difficult to comprehend. I understand this code by understanding the steps I followed to arrive at it, and hopefully by the end of this series of posts, so will you.
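For the curious, the delegate declarations behind the snippets above are not shown in this teaser (they come later in the series), but a consistent set can be reconstructed from how Y, ApplyY and factorial are used. The declarations below are my inference from that usage, runnable as-is, though the exact definitions in later posts may differ:

```csharp
using System;

// Reconstructed declarations - inferred from the usage, not quoted from the series.
public delegate T SelfApplicable<T>(SelfApplicable<T> self);
public delegate T HigherOrderFunction<T>(T self);
public delegate T HigherOrderEvaluator<T>(HigherOrderFunction<T> algorithm);

public static class YCombinator<T>
{
    // Y takes itself, then an algorithm, then the actual argument.
    public static readonly SelfApplicable<HigherOrderEvaluator<Func<T, T>>> Y =
        self => algorithm => t => algorithm(self(self)(algorithm))(t);

    public static readonly HigherOrderEvaluator<Func<T, T>> ApplyY = Y(Y);
}

// The factorial algorithm itself contains no self-application at all.
HigherOrderFunction<Func<int, int>> factorial =
    f => x => x < 2 ? 1 : x * f(x - 1);

int result = YCombinator<int>.ApplyY(factorial)(10);  // 3628800
```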

One Year

It’s been one year since I started this blog, so I figured a reflective post might be appropriate.

What went right in 2008:

  • I made a dramatic change to my free time habits. I did significantly more reading and programming, and spent much less time playing videogames.
  • I read a bunch of software related books cover-to-cover. Probably more than in all previous years combined.
  • I involved myself in an open source project.

What went wrong:

  • I started many things, finished few. I played with lots of stuff but didn’t deliver a whole lot.
  • I didn’t learn JavaScript or Ruby. At least, not to the point where I can write idiomatic code in either language.
  • I failed to deliver a worthwhile MVC based web application.
  • I made very little progress in the area of algorithms and data structures.

Rereading my posts from a year ago, it is quite evident that my year did not proceed quite as I intended. However, I am not disappointed - I learned quite a bit about my own capabilities and study habits. My career goal changed – while Google is very alluring, I recognise that Microsoft would probably be an awesome place for me to work and a better fit. I have concluded that my focus on the Microsoft stack is not a bad thing. From my perspective, there are lots of exciting things in the works for the Microsoft stack:

  • Microsoft will be, for the first time ever, providing and fully supporting a proper functional programming language: F#.
  • The introduction of the DLR promises to foster a healthy ecosystem of dynamic programming languages, with IronPython and IronRuby at the forefront.
  • Spec# has been transformed into a code contract library with compiler support and will be part of the .NET 4.0 release.
  • Windows Azure, Microsoft’s cloud computing platform, could make a significant impact on the .NET development ecosystem over the next few years.
  • Oslo, simply because it could help push domain specific languages into the mainstream. I am sceptical about much of what Oslo promises, but I think better tooling and raising awareness of what DSLs can offer is a good thing.

In the next few years I will be able to explore functional languages, dynamically typed languages, code contracts, cloud computing and domain specific languages all within the .NET platform. There is an immense amount for me to learn (this list, obviously, is nowhere near enumerating all of it), so why not go with the path of least resistance and learn what I can within the MS stack?

I hope to do a better job this year of actually finishing things. My study habits are still a long way off being optimal.

Tuesday, 9 December 2008

A Fluent NHibernate rewrite

A little less than 6 months ago I joined the Fluent NHibernate open source project. More recently I’ve struggled with fielding feature requests and partial patches, and I believe that the source of my difficulty has been the structure of the Fluent NHibernate codebase. In my opinion, the fundamental problem with the code is a lack of separation of concerns. Let’s take a look at some excerpts from the ClassMap class:

public virtual IdentityPart Id(Expression<Func<T, object>> expression, string column)
{
    PropertyInfo property = ReflectionHelper.GetProperty(expression);
    var id = column == null ? new IdentityPart(property) : new IdentityPart(property, column);
    AddPart(id);
    return id;
}

This Id method is part of the fluent interface. You can see it takes an expression and does some reflection to convert that expression into a format that can be used to configure NHibernate.

public XmlDocument CreateMapping(IMappingVisitor visitor)
{
    if (String.IsNullOrEmpty(TableName))
        TableName = visitor.Conventions.GetTableName.Invoke(typeof(T));

    visitor.CurrentType = typeof(T);
    XmlDocument document = getBaseDocument();
    setHeaderValues(visitor, document);

    foreach (var import in imports)
    {
        import.Write(document.DocumentElement, visitor);
    }

    XmlElement classElement = createClassValues(document, document.DocumentElement);
    writeTheParts(classElement, visitor);

    return document;
}

This CreateMapping method generates the eventual output that will be fed to NHibernate: an XmlDocument.

The fact that these two methods exist in the same class is a prime example of why I have found the code base increasingly difficult to work with. There is no separation between the function of specifying the mapping (by calling into the fluent interface) and generating the mapping (writing an xml document). It becomes difficult to reason about the capability of the library as the public interface consists of a big pile of lambda goo. Thinking more ambitiously, if you wanted to develop an alternative fluent api while continuing to utilize all the existing infrastructure provided by Fluent NHibernate, you would find it very difficult indeed.

So, what to do? Hell, if you’re a programmer you shouldn’t have to ask that question! The same thing we always want to do when the old code base starts to feel clunky: throw it all out and start from scratch! Ok, rarely is this actually a good idea, but that is the nice thing about doing open source on your own time – you can do whatever the hell you want.

Well what should the code base actually look like? Chad Myers had the answer, so I rolled up my sleeves and went to work. That was a couple of weeks ago. Today I hit a milestone - I was able to successfully configure NHibernate for a simple 3 entity scenario, using an API that looks just like the existing Fluent NHibernate API but actually utilizes a model that constructs an object graph of Hbm* instances (these are classes provided by NHibernate that are generated from the mapping schema) and then serializes that graph to xml. Hitting this milestone meant that it was time to share my work, so I committed my version as a branch. I included a decent chunk of notes in the commit message and don’t intend to repeat them all here, so I recommend you review the log if you would like to know more.

So what does this rewrite actually look like?

First of all, here is the tiny demonstration domain that I have been using so far:

[image: class diagram of the demonstration domain – Artist, Album and Track]

Now here is the fluent mapping that I managed to get working today:

private class ArtistMap : ClassMap<Artist>
{
    public ArtistMap()
    {
        Id(x => x.ID);
        Map(x => x.Name)
            .WithLengthOf(50)
            .CanNotBeNull();
        HasMany<Album>(x => x.Albums)
            .AsSet()
            .IsInverse();
    }
}

private class AlbumMap : ClassMap<Album>
{
    public AlbumMap()
    {
        Id(x => x.ID);
        Map(x => x.Title)
            .WithLengthOf(50)
            .CanNotBeNull();
        References(x => x.Artist);
        HasMany<Track>(x => x.Tracks)
            .AsSet()
            .IsInverse();
    }
}

private class TrackMap : ClassMap<Track>
{
    public TrackMap()
    {
        Id(x => x.ID);
        Map(x => x.Name)
            .WithLengthOf(50)
            .CanNotBeNull();
        Map(x => x.TrackNumber);
        References(x => x.Album)
            .CanNotBeNull();
    }
}

Look familiar? It should if you are a Fluent NHibernate user or developer! To demonstrate the flexibility of the rewrite, I implemented a very small subset of the existing fluent interface. That implementation was put in a completely separate assembly, to demonstrate that the core of Fluent NHibernate is not tied to any specific fluent api. If you wanted to, you could map a class unfluently, using the core model like so (this example only maps Album):

var root = new HibernateMapping();

var classMapping = new ClassMapping
   {
       Name = typeof (Album).AssemblyQualifiedName,
       Id = new IdMapping("ID", new IdColumnMapping("ID"), IdGeneratorMapping.NativeGenerator)
   };
root.AddClass(classMapping);

var titleMapping = new PropertyMapping("Title") {Length = 50, AllowNull = false};
classMapping.AddProperty(titleMapping);

var artistMapping = new ManyToOneMapping("Artist");
classMapping.AddReference(artistMapping);

var trackMapping = new SetMapping
   {
       Name = "Tracks",
       Key = new KeyMapping(),
       Contents = new OneToManyMapping(typeof (Track).AssemblyQualifiedName)
   };
classMapping.AddCollection(trackMapping);

You could then serialize the underlying Hbm graph and this is the result:

[image: the XML produced by serializing the underlying Hbm graph]

But of course, doing it that way sucks. This project is called Fluent NHibernate after all! You don’t even have any strong typing on the properties! I can’t see anyone using the core model directly. It’s designed to be a base for fluent APIs to build on.

The core model is still in a rough state. At the moment it has very little functionality other than to wrap the Hbm* classes. However I expect this will change. One of the many issues that need addressing is that of support for conventions – I expect the conventions implementation will require the model to be richer, to accept Type and PropertyInfo instances rather than awful strings. But don’t worry! It won’t get too rich – you won’t see any Expression<Func<T, Object>> hanging off the model – that stuff belongs in the fluent interface.

I am really quite excited about how the separation between the mapping model and the fluent interface could be leveraged. The DDD fanboi in me would love to see a fluent interface that allows you to describe your aggregate structure in addition to mapping to NHibernate.

Monday, 3 November 2008

No visitors!

I am sure that most programmers managed to learn some specific design patterns before they even knew what design patterns were. This is certainly true for myself – and while I cannot say what my first design pattern was, I can tell you about the first one I remember learning: the Visitor pattern. I was midway through a university class on object oriented programming, and in my spare time I was writing a turn based puzzle game in C# that involved daggers and goblins and dragons and doors and keys and whatnot. Now to make the thing work, I was doing something that felt very wrong. I was writing lots of switch statements that described the interactions between the different objects. I experimented, looking for better solutions using overloading and overriding, but none of it seemed to work. I asked my lecturer for help, and of course this is where I learned about visitor. Initially I was elated, but in time my opinion changed: visitor is this ugly bloated thing, designed to get around the lack of double dispatch support in most object oriented languages. It’s a hack, a workaround. As with arguably all design patterns, it’s a trick to deal with limitations in the language.

So today I was watching this video that goes into more detail on the new C#4 features, and something was said about runtime overload resolution that flicked a switch in my brain. Suddenly I realised a fact that I am sure was mindblowingly obvious to people much smarter than I: the C#4 dynamic keyword supports double dispatch. If you are prepared to take the performance hit and just want to get something up and running, you can forget the visitor pattern. Let’s look at an example based on the sort of thing I was trying to do years ago with my little game:

class Entity
{
    public string Name { get; set; }
    public string GetGreeting(Entity e)
    {
        return "...";
    }
}
class Foo : Entity
{
    public string GetGreeting(Foo f)
    {
        return string.Format("Hi there {0}! Beware, there may be evil Bars nearby!", f.Name);
    }

    public string GetGreeting(Bar b)
    {
        return "Curse you! I hate all Bars!";
    }
}
class Bar : Entity
{
    public string GetGreeting(Bar b)
    {
        return string.Format("Greetings {0}! Have you seen any evil Foos lately?", b.Name);
    }

    public string GetGreeting(Foo f)
    {
        return "The Foos are a scourge on this planet! Take this!";
    }
}
class StaticIntroducer
{
    public void Introduce(Entity e1, Entity e2)
    {
        Console.WriteLine("{0} is introduced to {1} by the StaticIntroducer...", e1.Name, e2.Name);
        Console.WriteLine("{0} says: \"{1}\"", e1.Name, e1.GetGreeting(e2));
        Console.WriteLine("{0} says: \"{1}\"", e2.Name, e2.GetGreeting(e1));
    }
}
class Program
{
    static void Main(string[] args)
    {
        Entity f1 = new Foo { Name = "BobFoo" };
        Entity f2 = new Foo { Name = "JohnFoo" };
        Entity b1 = new Bar { Name = "JerryBar" };
        Entity b2 = new Bar { Name = "SamBar" };

        var introducer = new StaticIntroducer();
        introducer.Introduce(f1, f2);
        introducer.Introduce(f1, b1);
        introducer.Introduce(b1, b2);

        Console.ReadLine();
    }
}

It outputs the following:

BobFoo is introduced to JohnFoo by the StaticIntroducer...
BobFoo says: "..."
JohnFoo says: "..."
BobFoo is introduced to JerryBar by the StaticIntroducer...
BobFoo says: "..."
JerryBar says: "..."
JerryBar is introduced to SamBar by the StaticIntroducer...
JerryBar says: "..."
SamBar says: "..."

Clearly, it does not work as intended. The code I’ve written makes some sort of sense, but it’s not correct. If you look at the Introduce method, you can see that it takes two entities. As a result, the call to GetGreeting is resolved at compile time to Entity.GetGreeting(Entity e) and no polymorphism occurs. You cannot fix the problem by simply making the methods virtual and overriding them, as the would-be overrides take different arguments. Redeclaring the GetGreeting methods using the new modifier also does not help. If you are facing this problem with C# 3 or earlier, it’s time to apply the visitor pattern. I won’t do that here; there are plenty of examples of Visitor on the web.

But if you are using C#4, it’s time for dynamic to come to the rescue. Here is a slightly different implementation of an Introducer:

class DynamicIntroducer
{
    public void Introduce(dynamic e1, dynamic e2)
    {
        Console.WriteLine("{0} is introduced to {1} by the DynamicIntroducer...", e1.Name, e2.Name);
        Console.WriteLine("{0} says: {1}", e1.Name, e1.GetGreeting(e2));
        Console.WriteLine("{0} says: {1}", e2.Name, e2.GetGreeting(e1));
    }
}

The ONLY thing I have changed is the signature. The parameters are now typed as dynamic rather than Entity. I plug this introducer in and this is the output:

BobFoo is introduced to JohnFoo by the DynamicIntroducer...
BobFoo says: Hi there JohnFoo! Beware, there may be evil Bars nearby!
JohnFoo says: Hi there BobFoo! Beware, there may be evil Bars nearby!
BobFoo is introduced to JerryBar by the DynamicIntroducer...
BobFoo says: Curse you! I hate all Bars!
JerryBar says: The Foos are a scourge on this planet! Take this!
JerryBar is introduced to SamBar by the DynamicIntroducer...
JerryBar says: Greetings SamBar! Have you seen any evil Foos lately?
SamBar says: Greetings JerryBar! Have you seen any evil Foos lately?

It works! How? Well, once you are using the dynamic type, overload resolution occurs at runtime rather than compile time, and is done using the runtime types of the arguments rather than the compile-time types. My nice, succinct code that describes the different interactions now works just how I intended. But at what cost? Putting the possible performance problems aside, we have made the code more fragile. If someone passes a non-Entity to the Introduce method, we’ll get a runtime error! Fortunately, there is a trivial fix, which only occurred to me while writing this post:
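The runtime error in question is a RuntimeBinderException. A minimal sketch of the fragility (reusing the assumed Foo type from the output above) shows how a bad call slips straight past the compiler:

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

class Program
{
    static void Main()
    {
        var introducer = new DynamicIntroducer();
        try
        {
            // This compiles without complaint: the parameters are dynamic,
            // so nothing is checked until runtime.
            introducer.Introduce(new Foo { Name = "BobFoo" }, "not an entity");
        }
        catch (RuntimeBinderException ex)
        {
            // Fails here, when the binder cannot find a Name member on System.String.
            Console.WriteLine("Runtime binding failed: " + ex.Message);
        }
    }
}
```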

public void Introduce(Entity e1, Entity e2)
{
    Console.WriteLine("{0} is introduced to {1} by the DynamicIntroducer...", e1.Name, e2.Name);
    Console.WriteLine("{0} says: {1}", e1.Name, ((dynamic)e1).GetGreeting(e2));
    Console.WriteLine("{0} says: {1}", e2.Name, ((dynamic)e2).GetGreeting(e1));
}

The only method call that actually needs to be dynamic is GetGreeting, so we can change the method signature back to what it was before and just introduce a cast to dynamic for the GetGreeting invocation. Much better. There is one more thing I would like to try to do: hide the need for GetGreeting to be invoked dynamically from the caller. The crude way is to use a static method, but I’m not a fan of that. Tomorrow I will see if there is an elegant way to approach it.
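For reference, the crude static-method approach mentioned above might look something like this (the helper name is an assumption; this is a sketch, not a recommendation):

```csharp
static class Greeter
{
    // Callers deal only in Entity; the cast to dynamic is buried in one place.
    public static string GetGreeting(Entity from, Entity to)
    {
        // A dynamic receiver forces overload resolution to happen at runtime,
        // using the runtime types of both arguments.
        return ((dynamic)from).GetGreeting((dynamic)to);
    }
}
```

The introducer would then call `Greeter.GetGreeting(e1, e2)` with no visible dynamic typing at all, at the cost of routing every greeting through a utility class.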

More C# 4.0 goodness: named parameters

So about two months ago I responded to a post on Jimmy Bogard's blog. Jimmy was lamenting wasting a bunch of time by accidentally calling a library method with his arguments reversed (an annoyance that I am sure most programmers are familiar with) and I wrote the following:

This is a good example of why I wish C# supported named parameters rather than ordered parameters. Regex.IsMatch(pattern: @"^\d*$", input: "123456") is so much clearer.

The cool thing is that I just pasted my line of code directly into a C# 4 console application (I’m using the VS2010 CTP) and it compiled the first time! I'm a fan of named parameters, simply because they make code more readable. If I'm working with some existing code, I want to understand what it is doing as quickly as possible, and named parameters help that cause.

The lack of named parameters in the past has caused the BCL team to create types such as System.IO.SearchOption - an enum with two values: AllDirectories and TopDirectoryOnly. Now let me be clear, I am not at all saying that types like this one are a bad idea. The problem, pre-C# 4, is that the more readable solution involved pulling your thumb out and writing more code, such as this enum. The BCL programmers could have been lazy and just made the Directory.GetDirectories() method accept a boolean 'recursive' parameter, but fortunately for us they weren't lazy and so they created an enum to make for more readable code. The good news is that with C# 4, programmers such as myself can be lazy and leave the method signature using a boolean, safe in the knowledge that the caller can make a readable call by using a named parameter.
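To contrast the two approaches, here is a hypothetical pair of overloads (these FindDirectories methods are illustrative sketches of my own, not real BCL methods; only Directory.GetDirectories and SearchOption are real):

```csharp
using System.IO;

static class DirectoryHelper
{
    // The enum approach: readable at every call site, but someone had to write the enum.
    public static string[] FindDirectories(string path, string pattern, SearchOption option)
    {
        return Directory.GetDirectories(path, pattern, option);
    }

    // The lazy boolean approach.
    public static string[] FindDirectories(string path, string pattern, bool recursive)
    {
        return Directory.GetDirectories(path, pattern,
            recursive ? SearchOption.AllDirectories : SearchOption.TopDirectoryOnly);
    }
}

// Pre-C# 4, only the enum version reads well at the call site:
//   DirectoryHelper.FindDirectories(@"c:\temp", "*", SearchOption.AllDirectories);
//
// With C# 4, the boolean version can read just as well, if the caller names the argument:
//   DirectoryHelper.FindDirectories(@"c:\temp", "*", recursive: true);
//
// ...but nothing stops the opaque call:
//   DirectoryHelper.FindDirectories(@"c:\temp", "*", true);
```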

The downside to this approach is that rather than FORCING the caller to write readable code, you are HOPING they will elect to write the more readable option. So which is better? Well, this touches on a much bigger subject: how much trust you should place in the individual developer. I can’t give you a definitive answer for that; it depends on the environment and developer culture. Perhaps it would suffice to say that I would prefer to work in an environment where I am safe to force rarely, and hope often.