Tuesday 9 December 2008

A Fluent NHibernate rewrite

A little less than 6 months ago I joined the Fluent NHibernate open source project. More recently I've struggled with fielding feature requests and partial patches, and I believe that the source of my difficulty has been the structure of the Fluent NHibernate codebase. In my opinion, the fundamental problem with the code is a lack of separation of concerns. Let's take a look at some excerpts from the ClassMap class:

public virtual IdentityPart Id(Expression<Func<T, object>> expression, string column)
{
    PropertyInfo property = ReflectionHelper.GetProperty(expression);
    var id = column == null ? new IdentityPart(property) : new IdentityPart(property, column);
    AddPart(id);
    return id;
}

This Id method is part of the fluent interface. You can see it takes an expression and uses reflection to convert that expression into a format that can be used to configure NHibernate.
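
For reference, here is a minimal sketch of the sort of thing that reflection helper does (my own illustration, not the actual Fluent NHibernate code; it assumes the expression body is a simple member access):

public static PropertyInfo GetProperty<T>(Expression<Func<T, object>> expression)
{
    var body = expression.Body;
    // Value-typed properties arrive wrapped in a boxing Convert node - unwrap it.
    if (body.NodeType == ExpressionType.Convert)
        body = ((UnaryExpression)body).Operand;
    return (PropertyInfo)((MemberExpression)body).Member;
}

Contrast that with the other concern living in the same class: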
public XmlDocument CreateMapping(IMappingVisitor visitor)
{
    if (String.IsNullOrEmpty(TableName))
        TableName = visitor.Conventions.GetTableName.Invoke(typeof(T));

    visitor.CurrentType = typeof(T);
    XmlDocument document = getBaseDocument();
    setHeaderValues(visitor, document);

    foreach (var import in imports)
    {
        import.Write(document.DocumentElement, visitor);
    }

    XmlElement classElement = createClassValues(document, document.DocumentElement);
    writeTheParts(classElement, visitor);

    return document;
}

This CreateMapping method generates the eventual output that will be fed to NHibernate: an XmlDocument.

The fact that these two methods exist in the same class is a prime example of why I have found the code base increasingly difficult to work with. There is no separation between the function of specifying the mapping (by calling into the fluent interface) and generating the mapping (writing an XML document). It becomes difficult to reason about the capability of the library as the public interface consists of a big pile of lambda goo. Thinking more ambitiously, if you wanted to develop an alternative fluent API while continuing to utilize all the existing infrastructure provided by Fluent NHibernate, you would find it very difficult indeed.

So, what to do? Hell, if you're a programmer you shouldn't have to ask that question! The same thing we always want to do when the old code base starts to feel clunky. Throw it all out and start from scratch! OK, rarely is this actually a good idea, but that is the nice thing about doing open source on your own time – you can do whatever the hell you want.

Well, what should the code base actually look like? Chad Myers had the answer, so I rolled up my sleeves and went to work. That was a couple of weeks ago. Today I hit a milestone - I was able to successfully configure NHibernate for a simple three-entity scenario, using an API that looks just like the existing Fluent NHibernate API but actually utilizes a model that constructs an object graph of Hbm* instances (these are classes provided by NHibernate that are generated from the mapping schema) and then serializes that graph to XML. Hitting this milestone meant that it was time to share my work, so I committed my version as a branch. I included a decent chunk of notes in the commit message and don't intend to repeat them all here, so I recommend you review the log if you would like to know more.

So what does this rewrite actually look like?

First of all, here is the tiny demonstration domain that I have been using so far:

[Image: class diagram of the demonstration domain - Artist, Album and Track]

Now here is the fluent mapping that I managed to get working today:

private class ArtistMap : ClassMap<Artist>
{
    public ArtistMap()
    {
        Id(x => x.ID);
        Map(x => x.Name)
            .WithLengthOf(50)
            .CanNotBeNull();
        HasMany<Album>(x => x.Albums)
            .AsSet()
            .IsInverse();
    }
}

private class AlbumMap : ClassMap<Album>
{
    public AlbumMap()
    {
        Id(x => x.ID);
        Map(x => x.Title)
            .WithLengthOf(50)
            .CanNotBeNull();
        References(x => x.Artist);
        HasMany<Track>(x => x.Tracks)
            .AsSet()
            .IsInverse();
    }
}

private class TrackMap : ClassMap<Track>
{
    public TrackMap()
    {
        Id(x => x.ID);
        Map(x => x.Name)
            .WithLengthOf(50)
            .CanNotBeNull();
        Map(x => x.TrackNumber);
        References(x => x.Album)
            .CanNotBeNull();
    }
}

Look familiar? It should if you are a Fluent NHibernate user or developer! To demonstrate the flexibility of the rewrite, I implemented a very small subset of the existing fluent interface. That implementation lives in a completely separate assembly, to demonstrate that the core of Fluent NHibernate is not tied to any specific fluent API. If you wanted to, you could map a class unfluently, using the core model like so (this example maps only Album):

var root = new HibernateMapping();

var classMapping = new ClassMapping
{
    Name = typeof(Album).AssemblyQualifiedName,
    Id = new IdMapping("ID", new IdColumnMapping("ID"), IdGeneratorMapping.NativeGenerator)
};
root.AddClass(classMapping);

var titleMapping = new PropertyMapping("Title") { Length = 50, AllowNull = false };
classMapping.AddProperty(titleMapping);

var artistMapping = new ManyToOneMapping("Artist");
classMapping.AddReference(artistMapping);

var trackMapping = new SetMapping
{
    Name = "Tracks",
    Key = new KeyMapping(),
    Contents = new OneToManyMapping(typeof(Track).AssemblyQualifiedName)
};
classMapping.AddCollection(trackMapping);

You could then serialize the underlying Hbm graph and this is the result:

[Image: the resulting XML mapping document]
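
Producing that document is essentially just a matter of running the graph through an XmlSerializer, since the Hbm* classes are generated from the mapping schema with the appropriate serialization attributes. Roughly speaking (a sketch only - GetHbmMapping is a hypothetical accessor for the underlying Hbm graph, not a method that exists today):

var serializer = new XmlSerializer(typeof(HbmMapping));
using (var writer = new StringWriter())
{
    serializer.Serialize(writer, root.GetHbmMapping()); // hypothetical accessor
    Console.WriteLine(writer.ToString());
}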

But of course, doing it that way sucks. This project is called Fluent NHibernate after all! You don't even have any strong typing on the properties! I can't see anyone using the core model directly. It's designed to be a base for fluent APIs to build on.

The core model is still in a rough state. At the moment it has very little functionality other than to wrap the Hbm* classes. However I expect this will change. One of the many issues that need addressing is that of support for conventions – I expect the conventions implementation will require the model to be richer, to accept Type and PropertyInfo instances rather than awful strings. But don’t worry! It won’t get too rich – you won’t see any Expression<Func<T, Object>> hanging off the model – that stuff belongs in the fluent interface.
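
To make that concrete, a richer property mapping might eventually look something like this (purely illustrative - this is not the committed code):

public class PropertyMapping
{
    public PropertyMapping(PropertyInfo property)
    {
        Property = property;
    }

    // Conventions can inspect the reflected member itself - its name, type and
    // attributes - instead of parsing strings.
    public PropertyInfo Property { get; private set; }
    public int? Length { get; set; }
    public bool AllowNull { get; set; }
}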

I am really quite excited about how the separation between the mapping model and the fluent interface could be leveraged. The DDD fanboi in me would love to see a fluent interface that allows you to describe your aggregate structure in addition to mapping to NHibernate.

Monday 3 November 2008

No visitors!

I am sure that most programmers managed to learn some specific design patterns before they even knew what design patterns were. This is certainly true of myself - and while I cannot say what my first design pattern was, I can tell you about the first one I remember learning: the Visitor pattern. I was midway through a university class on object oriented programming and in my spare time I was writing a turn based puzzle game in C# that involved daggers and goblins and dragons and doors and keys and whatnot. Now to make the thing work, I was doing something that felt very wrong. I was writing lots of switch statements that described the interactions between the different objects. I experimented, looking for better solutions using overloading and overriding, but none of it seemed to work. I asked my lecturer for help, and of course this is where I learned about Visitor. Initially I was elated, but in time my opinion changed: Visitor is this ugly bloated thing, designed to get around the lack of double dispatch support in most object oriented languages. It's a hack, a workaround. As with arguably all design patterns, it's a trick to deal with limitations in the language.

So today I was watching this video that goes into more detail on the new C# 4 features, and something was said about runtime overload resolution that flicked a switch in my brain and suddenly I realised a fact that I am sure was mind-blowingly obvious to people much smarter than I: the C# 4 dynamic keyword supports double dispatch. If you are prepared to take the performance hit and just want to get something up and running, you can forget the Visitor pattern. Let's look at an example based on the sort of thing I was trying to do years ago with my little game:

class Entity
{
    public string Name { get; set; }
    public string GetGreeting(Entity e)
    {
        return "...";
    }
}
class Foo : Entity
{
    public string GetGreeting(Foo f)
    {
        return string.Format("Hi there {0}! Beware, there may be evil Bars nearby!", f.Name);
    }

    public string GetGreeting(Bar b)
    {
        return "Curse you! I hate all Bars!";
    }
}
class Bar : Entity
{
    public string GetGreeting(Bar b)
    {
        return string.Format("Greetings {0}! Have you seen any evil Foos lately?", b.Name);
    }

    public string GetGreeting(Foo f)
    {
        return "The Foos are a scourge on this planet! Take this!";
    }
}
class StaticIntroducer
{
    public void Introduce(Entity e1, Entity e2)
    {
        Console.WriteLine("{0} is introduced to {1} by the StaticIntroducer...", e1.Name, e2.Name);
        Console.WriteLine("{0} says: \"{1}\"", e1.Name, e1.GetGreeting(e2));
        Console.WriteLine("{0} says: \"{1}\"", e2.Name, e2.GetGreeting(e1));
    }
}
class Program
{
    static void Main(string[] args)
    {
        Entity f1 = new Foo { Name = "BobFoo" };
        Entity f2 = new Foo { Name = "JohnFoo" };
        Entity b1 = new Bar { Name = "JerryBar" };
        Entity b2 = new Bar { Name = "SamBar" };

        var introducer = new StaticIntroducer();
        introducer.Introduce(f1, f2);
        introducer.Introduce(f1, b1);
        introducer.Introduce(b1, b2);

        Console.ReadLine();
    }
}

It outputs the following:

BobFoo is introduced to JohnFoo by the StaticIntroducer...
BobFoo says: "..."
JohnFoo says: "..."
BobFoo is introduced to JerryBar by the StaticIntroducer...
BobFoo says: "..."
JerryBar says: "..."
JerryBar is introduced to SamBar by the StaticIntroducer...
JerryBar says: "..."
SamBar says: "..."

Clearly, it does not work as intended. The code I've written makes some sort of sense, but it's not correct. If you look at the Introduce method, you can see that it takes two entities. As a result, the call to GetGreeting will be resolved at compile time to Entity.GetGreeting(Entity e) and no polymorphism will occur. You cannot fix the problem by simply making the methods virtual and overrides, as the overridden versions take different arguments. Redeclaring the GetGreeting methods using the new modifier also does not help. If you are facing this problem with C# 3 or earlier, it's time to apply the Visitor pattern. I won't do that here; there are plenty of examples of Visitor on the web.

But if you are using C# 4, it's time for dynamic to come to the rescue. Here is a slightly different implementation of an Introducer:

class DynamicIntroducer
{
    public void Introduce(dynamic e1, dynamic e2)
    {
        Console.WriteLine("{0} is introduced to {1} by the DynamicIntroducer...", e1.Name, e2.Name);
        Console.WriteLine("{0} says: {1}", e1.Name, e1.GetGreeting(e2));
        Console.WriteLine("{0} says: {1}", e2.Name, e2.GetGreeting(e1));
    }
}

The ONLY thing I have changed is the signature. The parameters are now typed as dynamic rather than Entity. I plug this introducer in and this is the output:

BobFoo is introduced to JohnFoo by the DynamicIntroducer...
BobFoo says: Hi there JohnFoo! Beware, there may be evil Bars nearby!
JohnFoo says: Hi there BobFoo! Beware, there may be evil Bars nearby!
BobFoo is introduced to JerryBar by the DynamicIntroducer...
BobFoo says: Curse you! I hate all Bars!
JerryBar says: The Foos are a scourge on this planet! Take this!
JerryBar is introduced to SamBar by the DynamicIntroducer...
JerryBar says: Greetings SamBar! Have you seen any evil Foos lately?
SamBar says: Greetings JerryBar! Have you seen any evil Foos lately?

It works! How? Well, once you are using the dynamic type, overload resolution occurs at runtime rather than compile time, and is done using the runtime types of the arguments rather than the compile-time types. My nice, succinct code that describes the different interactions now works just how I intended. But at what cost? Putting the possible performance problems aside, we have made the code more fragile. If someone passes a non-Entity to the Introduce method, we'll get a runtime error! Well, this is actually a trivial fix that only occurred to me while writing this post:

public void Introduce(Entity e1, Entity e2)
{
    Console.WriteLine("{0} is introduced to {1} by the DynamicIntroducer...", e1.Name, e2.Name);
    Console.WriteLine("{0} says: {1}", e1.Name, ((dynamic)e1).GetGreeting(e2));
    Console.WriteLine("{0} says: {1}", e2.Name, ((dynamic)e2).GetGreeting(e1));
}

The only method call that actually needs to be dynamic is GetGreeting, so we can change the method signature back to what it was before and just introduce a cast to dynamic for the GetGreeting invocation. Much better. There is one more thing I would like to try to do: hide the need for GetGreeting to be invoked dynamically from the caller. The crude way is to use a static method, but I'm not a fan of that. Tomorrow I will see if there is an elegant way to approach it.
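
In the meantime, one direction that comes to mind (just a sketch, untested): push the cast into the base class itself, so callers get double dispatch without ever seeing the dynamic keyword:

class Entity
{
    public string Name { get; set; }

    // Resolves GetGreeting against the runtime types of both 'this' and 'other'.
    public string GreetingFor(Entity other)
    {
        return ((dynamic)this).GetGreeting((dynamic)other);
    }

    public string GetGreeting(Entity e)
    {
        return "...";
    }
}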

More C# 4.0 goodness: named parameters

So about two months ago I responded to a post on Jimmy Bogard's blog. Jimmy was lamenting wasting a bunch of time by accidentally calling a library method with his arguments reversed (an annoyance that I am sure most programmers are familiar with) and I wrote the following:

This is a good example of why I wish C# supported named parameters rather than ordered parameters. Regex.IsMatch(pattern: @"^\d*$", input: "123456")
is so much clearer.

The cool thing is that I just pasted my line of code directly into a C# 4 console application (I'm using the VS2010 CTP) and it compiled first time! I'm a fan of named parameters, simply because they make code more readable. If I'm working with some existing code, I want to understand what it is doing as quickly as possible, and named parameters help that cause.

The lack of named parameters in the past has caused the BCL team to create types such as System.IO.SearchOption - an enum with two values: AllDirectories and TopDirectoryOnly. Now let me be clear, I am not at all saying that types like this one are a bad idea. The problem, pre-C# 4, is that the more readable solution involved pulling your thumb out and writing more code, such as this enum. The BCL programmers could have been lazy and just made the Directory.GetDirectories() method accept a boolean 'recursive' parameter, but fortunately for us they weren't lazy and so they created an enum to make for more readable code. The good news is that with C# 4, programmers such as myself can be lazy and leave the method signature using a boolean, safe in the knowledge that the caller can make a readable call by using a named parameter.
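
To illustrate, here is the lazy version (GetDirectories here is my own made-up wrapper; only the Directory and SearchOption types are real BCL):

using System;
using System.IO;

class NamedParameterDemo
{
    // The 'lazy' signature: a plain boolean instead of a purpose-built enum.
    static string[] GetDirectories(string path, bool recursive)
    {
        return Directory.GetDirectories(path, "*",
            recursive ? SearchOption.AllDirectories : SearchOption.TopDirectoryOnly);
    }

    static void Main()
    {
        // With a C# 4 named argument, the call site stays readable anyway:
        var dirs = GetDirectories(@"C:\temp", recursive: true);
        Console.WriteLine(dirs.Length);
    }
}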

The downside to this approach is that rather than FORCING the caller to write readable code, you are HOPING they will elect to write the more readable option. So what is better? Well, this is touching on a much bigger subject, the whole issue of how much trust you should place in the individual developer. I can't give you a definitive answer for that; it depends on the environment and developer culture. Perhaps it would suffice to say that I would prefer to work in an environment where I am safe to force rarely, and hope often.

Wednesday 29 October 2008

New Toys! (C# 4.0)

Today I've had a chance to play with the VS 2010 CTP VirtualPC image. Specifically, I've been playing around with the major new feature for C# 4.0, the DLR integration that allows you to declare an object as dynamic and then perform late bound operations on it. (Please note that it was too much of a pain to get the highlighting working for the dynamic keyword; just imagine it nice and blue in these code listings.)

I started off very simple:

dynamic hello = "hello world";
Console.WriteLine(hello.ToString());

This worked fine. What happens if I misspell the ToString method?

dynamic hello = "hello world";
Console.WriteLine(hello.ToStrig());

Some sort of runtime error is expected, obviously. Here it is:

[Image: screenshot of the resulting RuntimeBinderException]

OK, so what is next? Well, one thing that occurred to me when I saw the dynamic keyword is that it could be very useful when working with anonymous types. My problem with anonymous types has always been that when you pass them to another method, you can't do a whole lot with them when they come out the other side. Let's see if the dynamic keyword can make things nicer:

static void Main(string[] args)
{
    var testObj = new
    {
        Foo = (Action)(() => Console.WriteLine("foo!")),
        Bar = (Action)(() => Console.WriteLine("bar!"))
    };

    DoFooAndBar(testObj);
    Console.ReadLine();
}
static void DoFooAndBar(dynamic foobar)
{
    foobar.Foo();
    foobar.Bar();
}
Output: 
foo!
bar!

OK, well that works. I'm not sure what it's good for though. In the past when I've wanted to do something like this with anonymous types, it was because I wanted dynamic behaviour. Surely now that we have DLR integration, there will be a nicer way to achieve this without having to use anonymous types? I guess I will have to wait and see.
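
Speculating a little: I have seen mention of an ExpandoObject type planned for System.Dynamic in .NET 4, which might be exactly that nicer way. Assuming it ships as described, the sketch would be something like:

dynamic bag = new ExpandoObject(); // System.Dynamic, assuming it ships in .NET 4
bag.Foo = (Action)(() => Console.WriteLine("foo!"));
bag.Bar = (Action)(() => Console.WriteLine("bar!"));
DoFooAndBar(bag); // the same DoFooAndBar method as above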

If you would like to read a bit more about dynamic, Chris Burrows has a decent post up. One point I found interesting is that he refers to dynamic as a type. The compiler certainly treats it as a type, but the CLR is none the wiser:

static void Main(string[] args)
{
    List<dynamic> list = new List<dynamic>();
    Console.WriteLine(list.GetType());
    Console.ReadLine();
}
Output: 
System.Collections.Generic.List`1[System.Object]

Also, typeof(dynamic) won't even compile:

[Image: compiler error for typeof(dynamic)]

So it's a type, but not really. If you take a look at Chris's post, he shows what the C# compiler generates when the dynamic type is used, and it's quite clear that the resulting IL does not involve any "dynamic" type.

Another thing I tried was dynamic invocation of explicit interface implementation:

interface IFoo { void Foo(); }
class Foo : IFoo
{
    void IFoo.Foo() { Console.WriteLine("foo!"); }
}
static void Main(string[] args)
{
    dynamic foo = new Foo();
    foo.Foo();
    Console.ReadLine();
}

It doesn’t work. You get the same error as when I tried calling ToStrig() above. Static methods won’t work either:

class Foo
{
    static void DoFoo() { Console.WriteLine("foo!"); }
}
static void Main(string[] args)
{
    dynamic foo = new Foo();
    foo.DoFoo();
    Console.ReadLine();
}

Again, a RuntimeBinderException occurs. This behaviour is interesting. Here is an excerpt of the new features document (available from the C# futures site) discussing the C# runtime binder:

This is essentially a part of the C# compiler running as a runtime component to “finish the work” on dynamic operations that was deferred by the static compiler.

It's not JUST deferred. The binder is obviously not behaving precisely the same as the C# compiler. It doesn't really surprise me that it's not exactly the same. I'm sure someone will be able to give me a reasonable explanation.

The last thing I tried today was making a dynamic call within an expression tree:

static void Main(string[] args)
{
    dynamic hello = "hello world";
    Expression<Action> expTree = () => hello.ToString();
}

This fails to compile, while this is fine:

static void Main(string[] args)
{
    dynamic hello = "hello world";
    Action action = () => hello.ToString();
}

The new features document does not list this as one of the known limitations, so perhaps support for dynamic calls within expression trees is coming and just not yet implemented?

Saturday 11 October 2008

C# in Depth

This is a book review for C# in Depth, by Jon Skeet. I started reading Jon's blog sometime around February this year, and found it excellent. I really enjoyed his attention to detail and curiosity about how things work. Jon struck me as someone that likes to teach, but takes great care when doing so to avoid myths, over-generalizations or terminology misuse; he cares just as much about buggy communication as he does about buggy programs. Wanting to read more of his writing, I picked up C# in Depth and was very happy to discover a book crafted with the same dedication to detail and thoroughness that first attracted me to his blog.

C# in Depth has a very specific focus. It is aimed at developers that know C# 1, and want to learn the ins and outs of C# 2 and 3. When I say C#, I really mean just the language - the book does not cover the CLR or the class libraries. I expected some of the book to be familiar territory for me as I was already quite familiar with C# 2 and this was indeed the case. But it turns out there were still quite a few gaps in my C# 2 knowledge that the book helped fill in (for example, the distinction between unbound, open and closed generic types).
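
To illustrate that particular distinction with my own example (not Jon's):

using System;
using System.Collections.Generic;

class GenericKinds<TKey, TValue>
{
    static void Demo()
    {
        // Closed constructed type: every type argument is supplied.
        Type closed = typeof(Dictionary<string, int>);
        // Open constructed type: built from the enclosing type parameters.
        Type open = typeof(Dictionary<TKey, TValue>);
        // Unbound generic type: no arguments at all; legal only inside typeof.
        Type unbound = typeof(Dictionary<,>);
        Console.WriteLine("{0}\n{1}\n{2}", closed, open, unbound);
    }
}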

I particularly enjoyed the section on covariance and contravariance for generics. Specifically, Jon provided examples of particular covariant and contravariant operations that would be useful but are not currently supported. These examples were excellent. I have run into similar sorts of situations in the past but never really understood how they could potentially be fixed. I have long struggled with the concepts of covariance and contravariance, mostly because I read the definition once every year and it never sticks; this time I have a glimmer of hope that it will.
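
To give a flavour of the sort of example I mean (my own, not one of Jon's): reading strings as objects is perfectly safe, yet C# 3 disallows the direct assignment.

IEnumerable<string> strings = new List<string> { "a", "b" };
// IEnumerable<object> objects = strings; // covariant and safe, but illegal in C# 3
IEnumerable<object> objects = strings.Cast<object>(); // today's workaround (System.Linq)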

The C# 3 content is great. I had a basic understanding of how each of the new features contribute to LINQ as a whole, but again there were plenty of gaps in my knowledge that the book helped with. I learned a great deal, in particular, from the section on LINQ query expressions and how they are converted to the underlying code.

In my opinion, C# in Depth is more of a teaching book than a reference book. I can still imagine myself referring back to particular parts that did not quite manage to stick, but really it is designed primarily to deliver an instructive lesson. This, however, should by no means be taken as a criticism. I found the teaching style made the book quite easy to read. Jon’s style is precise, and yet never stale. If only more books with such technical content were so pleasant and carefully edited (I do not recall noticing any technical errors).

C# in Depth rocks. Many times during my journey through the book I found myself breaking out for some special time with the C# compiler, just to see what sort of clever tricks we could get up to. The book has brought me closer to mastery of my current tool of choice. If you are a C# programmer and have a natural curiosity about its capabilities, details, and limitations then I am sure you will find this book rewarding.

Thursday 21 August 2008

A less fragile way to invoke a generic method with a dynamic type argument

A fellow Fluent NHibernate contributor, Andrew Stewart, called out for ideas on how some of his code could be refactored, and if you've programmed with me you will know that I jumped at the chance. Here is the rather unfortunate code:

var findAutoMapMethod = typeof(AutoMapper).GetMethod("MergeMap");
var genericfindAutoMapMethod = findAutoMapMethod.MakeGenericMethod(type);
genericfindAutoMapMethod.Invoke(autoMap, new[] { mapping });

Here is what Andrew wants to do (sort of):

Type type = GetSomeType();
autoMap.MergeMap<type>(mapping);

This is illegal because you can't pass a Type instance as a type argument to a generic method. For example, you can't declare a list of strings like this:

List<typeof(string)> stringList;

Andrew is calling a generic method but he doesn't know the type argument at compile time. The only way around this is reflection, and we all know that reflection is fragile - if someone renames the MergeMap method to MergeMapping, the code will still compile but fail at runtime (hopefully when your unit tests are executed!). Fortunately, there is a way to make this block of code significantly less fragile. To do this, I'm going to rely on the handy ReflectionHelper that is part of the Fluent NHibernate code base. Here's the method I'm going to use:

public static MethodInfo GetMethod(Expression<Func<object>> expression)
{
    MethodCallExpression methodCall = (MethodCallExpression)expression.Body;
    return methodCall.Method;
}

So if I pass this baby a lambda expression containing a method invocation, it gives me the corresponding MethodInfo. Sweet! Let's use it:

var templateMethod = ReflectionHelper.GetMethod((AutoMapper a) => a.MergeMap<object>(null));

Notice the type argument of 'object'? It's a placeholder. The parameter 'null' is also a placeholder. What you need to realise is that we are not going to EXECUTE the code in that lambda. The reflection helper is going to inspect it and give us a MethodInfo to work with. Before we can invoke the MethodInfo, we need to replace the first placeholder, which is why I have called this one templateMethod. Let's replace the placeholder:

var realMethodInfo = templateMethod.GetGenericMethodDefinition()
                     .MakeGenericMethod(type);

The GetGenericMethodDefinition call lets you obtain a MethodInfo for an unbound version of MergeMap. Once the method is unbound, we supply the particular type argument we want, being 'type' in this case. MakeGenericMethod takes an array of types so this process would also work fine for a generic method that has more than one type argument. To further illustrate this process, here is another example:

public static void DoSomething<T, U>() { }
...
static void Main(string[] args)
{
    Action doSomethingAction = DoSomething<object, object>;
    MethodInfo info = doSomethingAction.Method;
    Console.WriteLine(info);
    info = info.GetGenericMethodDefinition();
    Console.WriteLine(info);
    info = info.MakeGenericMethod(typeof(string), typeof(int));
    Console.WriteLine(info);
    Console.ReadLine();
}

OUTPUT:
Void DoSomething[Object,Object]()
Void DoSomething[T,U]()
Void DoSomething[String,Int32]()

Getting back to the task at hand, we can now invoke realMethodInfo, but we must be sure to pass the original intended argument 'mapping':

realMethodInfo.Invoke(autoMap, new[] { mapping });

And that's it! Now if MergeMap is renamed, the lambda body will be updated accordingly by the refactoring tools. The code is still fragile in one way though: if another type argument is added to MergeMap, it won't fail until runtime. Let's specifically check for this case and throw an appropriate exception. Let's also wrap it all up in one nice convenient method:

public static class InvocationHelper
{
    public static object InvokeGenericMethodWithDynamicTypeArguments<T>(T target, Expression<Func<T, object>> expression, object[] methodArguments, params Type[] typeArguments)
    {
        var methodInfo = ReflectionHelper.GetMethod(expression);
        if (methodInfo.GetGenericArguments().Length != typeArguments.Length)
            throw new ArgumentException(
                string.Format("The method '{0}' has {1} type argument(s) but {2} type argument(s) were passed. The amounts must be equal.",
                methodInfo.Name,
                methodInfo.GetGenericArguments().Length,
                typeArguments.Length));

        return methodInfo
            .GetGenericMethodDefinition()
            .MakeGenericMethod(typeArguments)
            .Invoke(target, methodArguments);
    }
}

And call it:

InvocationHelper.InvokeGenericMethodWithDynamicTypeArguments(
             autoMap, a => a.MergeMap<object>(null), new[] { mapping }, type);

OK, so it's certainly not as clear or robust as:

autoMap.MergeMap<type>(mapping);

But the former has the advantage of actually COMPILING, while the latter does not.

My first open source project

A few weeks ago, I joined an open source project that was just starting up. It's called Fluent NHibernate and it provides an alternative to XML based or attribute based mapping of NHibernate domain objects. The project owner James Gregory does a much better job of explaining it than I could.

Why am I only mentioning it now? Well, during the first week I was too busy trying to get up to speed and make a useful commit to blog about it. As for the last couple of weeks, I've been holidaying in France and Italy and it's been hard enough just keeping up with my RSS and mailing lists.

It has been a great experience so far. The other contributors are really friendly and smart, and I am impressed by the sheer volume (and quality) of work being performed. I'm learning how to use Subversion and really missing continuous integration (yes I already broke the build, managed it on my 2nd commit).

Incidentally, this is my first post using Windows Live Writer. It seems nicer than the Blogger web interface.

Sunday 8 June 2008

ASP.NET MVC with Visual Web Developer Express

I decided to spend some time playing with ASP.NET MVC this weekend but the install wasn't as straightforward as I would have liked. I do not have Visual Studio 2008 so I've been relying on the express editions, and this somewhat complicated the process. Scott Guthrie's post mentions that Visual Web Developer Express 2008 does not support ASP.NET MVC preview 3 unless you install the SP1 Beta. Unfortunately, the link he provides in that post points to a post about the Visual Studio 2008 SP1 Beta, which kept giving me the following message when I ran it:

Microsoft Visual Studio 2008 SP1 (Beta) does not apply, or is blocked by another condition on your system. Please click the link below for more details.

I googled a bit and tried a few things and eventually discovered that there is a different service pack for the express editions: Visual Studio 2008 Express Editions Service Pack 1 Beta. No wonder I was getting that error message - I do not have any Visual Studio 2008 products installed! The existence of a separate service pack makes perfect sense and is pretty obvious if you think about it. Beware the dangers of mechanically clicking links and googling without thinking! The MVC install went smoothly once I installed the Express SP1 beta.

Thursday 29 May 2008

Ayende's weird code

I spent the better part of yesterday evening playing with some code that Ayende posted. He wanted to make this compile:

public interface IDuckPond
{
    Duck GetDuckById(int id);
    Duck GetDuckByNameAndQuack(string name, Quack q);
}

IDuckPond pond = null;
pond.Stub( x => x.GetDuckById );
pond.Stub( x => x.GetDuckByNameAndQuack );

My first reaction was WTF! I didn't understand at all what he was getting at, but obviously he was making sense to some people, as the meaningful (but initially over-my-head) comments were quickly piling up. Let's take a look. The interface definition is completely standard, nothing out of the ordinary at all. The weird bit is definitely this Stub method. It appears to be a void method that accepts a lambda expression as its only parameter. But hang on a sec, let's forget the lambda expression for a second; this method isn't declared on the IDuckPond interface, and aren't we going to get a null reference exception anyway? Well, not if Stub is an extension method. Now it's certainly unclear to me how Ayende was going to do anything meaningful within the extension method given that you can't use the ref or out keywords on the target of an extension method, so you are still gonna be stuck with pond = null after the first Stub call returns, but never mind that. So the extension method is going to look something like this:

static class StubExtender
{
    public static void Stub(this IDuckPond pond, SOME_LAMBDA_THINGY)
    {}
}

Where SOME_LAMBDA_THINGY is obviously the lambda expression parameter bit. So what goes there? Well, let's take a look at the first lambda expression that gets passed to Stub:

x => x.GetDuckById

This certainly had me puzzled at first. He's not calling GetDuckById here, because there is no argument... so what's going on? Thinking about it a bit, I realised that x.GetDuckById is a delegate, which means that the parameter to the Stub method is a function that returns a function. Brain starting to hurt yet? If so, you must be a functional programming newbie like me. Having made this realization, it was pretty quick to get the code compiling. I defined delegates for the two methods on IDuckPond:

delegate Duck GetDuckByIdDelegate(int id);
delegate Duck GetDuckByNameAndQuackDelegate(string name, Quack quack);

And then two overloads for the Stub method:

public static void Stub(this IDuckPond pond, Func<IDuckPond, GetDuckByIdDelegate> func) { }
public static void Stub(this IDuckPond pond, Func<IDuckPond, GetDuckByNameAndQuackDelegate> func) { }

Okay, so the first overload takes a function that takes an IDuckPond and returns a function with GetDuckById's signature. The second overload does the same thing but its parameter is a function that returns a function with GetDuckByNameAndQuack's signature. I quickly realised that I don't actually need to explicitly declare the delegates, because I can just use the framework delegate Func, which I was already doing anyway. So the delegates got deleted and the overloads became this:

public static void Stub(this IDuckPond pond, Func<IDuckPond , Func<int, Duck>> func) { }
public static void Stub(this IDuckPond pond, Func<IDuckPond , Func<string, Quack, Duck>> func) { }

So this was better, but certainly not perfect. It was obvious that Ayende was looking for a general solution. The first generalization was simple: make the Stub method generic on the extension target:

public static void Stub<T>(this T obj, Func<T, Func<int, Duck>> func) { }
public static void Stub<T>(this T obj, Func<T, Func<string, Quack, Duck>> func) { }

The cool thing about this is that I didn't have to change the calling code at all. Specifically, I don't have to call pond.Stub&lt;IDuckPond&gt;(etc) - why? Because the compiler does generic parameter inference; it is able to determine that the type T is IDuckPond because we are passing in an IDuckPond as the first argument. But obviously this is still not ideal because as you add new methods to your interfaces, you will need to add the appropriate Stub overloads. You could probably code gen them, but code generating just for the sake of nice syntax seems to be taking things a bit far. I wanted to see if I could come up with anything nicer, and some of the comments in the thread gave me some ideas. This one is okay:

IDuckPond d = null;
d.Stub(x => (Func<int, Duck>)x.GetDuckById);
d.Stub(x => (Func<string, Quack, Duck>)x.GetDuckByNameAndQuack);
    
static class StubExtender
{
   public static void Stub<T, U>(this T obj, Func<T, U> func) { }
}

At least this way we can get by with just one definition for Stub, though at the cost of adding some casts. Now why did I have to add those awful casts? Well, I modified the Stub method to be generic, but the compiler can't infer the generic parameter U like it does with T. Why? Let's say that IDuckPond had a third method, an overload of GetDuckById:

interface IDuckPond
{        
    Duck GetDuckById(int id);
    Duck GetDuckById(object o);
    Duck GetDuckByNameAndQuack(string name, Quack quack);
}    

Now when we say x.GetDuckById, it's unclear which overload we are referring to, and it matters because the delegates would have different signatures. As a result, the compiler cannot infer the type, so you have to give it explicitly, by casting.

Some other neat alternatives were suggested by Jacob Carpenter; you can find those in the comment thread for Ayende's post if you are interested. He's doing something tricky with the default keyword, some investigation is in order methinks. Unfortunately there does not appear to be a perfect solution. You can't get away with the exact code that Ayende specified without writing or generating all the required overloads.

I really enjoyed this little challenge; it's reminded me that I'm not as good with C# as I would like to think, and instilled in me the urge to do more weird stuff. Expect more posts filled with angle brackets!
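
PS: Here is my untested guess at the default keyword trick Jacob was hinting at - invoke the method inside the lambda with default(...) placeholder arguments so the compiler picks the overload for you, and accept an expression tree so the chosen MethodInfo can be fished back out:

public static void Stub<T, U>(this T obj, Expression<Func<T, U>> expression) { }

d.Stub(x => x.GetDuckById(default(int)));
d.Stub(x => x.GetDuckByNameAndQuack(default(string), default(Quack)));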

Thursday 22 May 2008

The Mythical Man-Month

Phew, has it already been a month since my last post? There's no doubt about it, GTA IV is to blame. In any case, I haven't been slacking off completely. I just finished reading the rather short and positively ancient The Mythical Man-Month: essays on software engineering by Fred Brooks. This book is considered to be one of the classics. It appears in "must read" lists all the time, and is constantly referenced by software bloggers around the world.

Originally published in 1975, this book is older than me. I'm reading the 20th anniversary edition which has some added chapters, but the original contents are left as they were - and it sure does show. The book is more about software management than engineering, and it offers several valuable ideas, but it's also full of discussion that is simply not at all relevant to software engineering today.

The book makes the point that you can't rush software. Doubling your staff will not halve your schedule, and adding manpower to a late software project will probably just make it later. It tackles this concept that men and months are not interchangeable from a number of different angles, and while I consider this idea to be blindingly obvious today, I can appreciate it might not have been quite that way 33 years ago. The 20th anniversary edition contains the "No Silver Bullet" essay, which was written about a decade after the first printing of the book. This essay claims that it is unreasonable to expect any single technological or methodological advance to deliver an order of magnitude improvement to software productivity. Developing software is an inherently difficult, complex process. We can chip away at the difficulties from many different angles, and perhaps achieve order of magnitude improvements through a combination of many tools and techniques, but there is no discovery waiting around the corner that will improve our software development productivity by a factor of 10.

I think these ideas are worth reading about, and the book has certainly broadened my perspective and given me insight into how people thought about software development many years ago. However, that said, I am reluctant to recommend this book. MUCH of it talks about the practicalities of software development circa 1975 - so it's filled with arcane, meaningless terms from an era long gone. I couldn't help but laugh at the idea of embedding documentation into code having the downside that "you have to store more characters". I'm now wishing I had taken notes as I read because there were some real whoppers in there, stuff that really makes me appreciate how far we've come. But while some of it was funny, most of it was just snore worthy.

We need a modern replacement for this book. A text that communicates some of the old but worthwhile ideas along with some of the new, such as iterative development. A text that I could give to someone else and not have them hand it back upon the first mention of some obscure mainframe operating system that neither of us has heard of. And while I'm at it: a text that does not assume the existence of this abstract concept of "God" several times would also be nice - there is no place for that sort of thing in a software text. The man-month ain't the only concept mentioned in this book that I consider to be mythical.

Maybe such a book does exist, and I just don't know about it yet. If you think you've found it, be sure to let me know!

Tuesday 22 April 2008

Domain Driven Design (the book)

I just finished reading Domain Driven Design: tackling complexity in the heart of software by Eric Evans. This is a book that was recommended to me years ago but I was lazy and ignored the advice - and what good advice it was. Hopefully you are familiar with the concept of a domain model. Implementing a domain model is by far my preferred technique for dealing with business rules in software, probably because so much of it leverages the strengths of object oriented programming (and don't get me started about how much I love OO).

The thing about domain models, however, is that getting one up and running is only half the battle. I am a big fan of the Martin Fowler book Patterns of Enterprise Application Architecture and much of it deals with accomplishing this task. But the thing is, there are already plenty of object-relational mappers available, both commercial and free. Given this fact, it's probably incorrect of me to refer to the technical aspects of using a domain model as "half the battle" as they should make up a far smaller proportion. The real battle is putting a domain model to work, leveraging its strengths and guarding against anaemia. Designing a domain model of true value is no easy task and it is this particular problem that Evans focuses on in DDD.

I really enjoyed the first half of this book, being the two sections Putting the domain model to work and The building blocks of a model driven design, as I found them easy to follow and very relevant to the work I am doing. Some of the concepts described were familiar and worth reinforcement, but several were completely new.

The second half of the book consists of the sections Refactoring towards deeper insight and Strategic design. I found these sections a bit more uneven than the first, and particularly dry in places. Ever caught yourself and realised that your eyes scanned the last half a page without any of it actually sinking in? This happened to me more than a few times, but I think it's as much my own fault as it is Eric's. Putting aside these complaints, the second half of the book is still a very worthwhile read as it introduces many valuable concepts.

It's probably worth mentioning that there is not a great deal of code in this book as it is much more concerned with theory than with the hands-on practicalities of developing with a domain model. I can recall a few instances where the book describes a mechanism that made a lot of sense, but it was not entirely clear how I would actually go about implementing it. I am considering picking up a copy of Applying Domain-Driven Design and Patterns by Jimmy Nilsson in the hope it might help fill in some of the gaps.

It occurs to me that there might have been a better way to approach the topic of DDD and it might have made Evans a bit more $$. If you take the first two sections of the book and add a few choice selections from the remaining half, you've got an excellent text for introducing developers to domain modelling. You could hand a developer that wasn't experienced with using domain models a 200 page text and they would come back in a few days excited and with a firm knowledge foundation. The majority of the second half of the book consists of material that is more relevant to a developer in a leading or mentoring role and this would easily fit in a follow up, "advanced" book. But don't let my theories on how the material might have been better packaged distract you, because DDD is still a really good book.

A few years ago, when I was reading CLR via C# by Jeffrey Richter, I felt kind of embarrassed that I had been doing all this .NET development without much of a clue of what was going on. Reading DDD made me feel much the same way about building applications that use domain models.

Wednesday 16 April 2008

RSS

RSS stands for Really Simple Syndication and is basically about recording information about updates to a website in a structured format. What you actually need to know about RSS is that most websites these days have RSS feeds, and this means that you can do a lot of your web browsing from within an RSS reader. I use Google Reader because it works via a browser, so I can access it from home, work, or my mobile phone and they all stay in sync.

RSS is amazing. I thought it was cool when I first discovered it years ago because it meant I didn't actually have to go to my favourite sites to see if they had been updated. And yes, fundamentally, years later, that is still all I use it for. But the really important difference is that initially, I used it to make sure that I didn't miss a Penny Arcade comic, while now I use it to plug my brain into lots of really smart people's brains (and also never miss a PA comic).

Steve Yegge. Paul Graham. Reginald Braithwaite. These guys are brilliant. I just bought Reg a Darkhorse double espresso via his handy donation link and I would be doing the same for Steve if he made it similarly easy (Paul Graham doesn't need my money, his brilliance has already brought him fat wads of cash). I discovered these names via the ever growing torrent of information that hits me in the face every morning, noon and night (yep, my brain needs its breakfast, lunch and dinner the same way my body does). There are many other great bloggers that I subscribe to, but the above three are special because they are inspirational. Their writing has changed the way I think of myself, my work, and my future. I have struggled in the past to come up with a singular example of why Steve's writing is so great and this evening I went through the same process with Reg. These people are inspirational because they each have their own particular perspective on software development and these perspectives are refined, logical, and exciting. Furthermore, these perspectives are communicated through the aggregate of their writing, rather than any individual piece.

Much of my effort to generalize is RSS assisted. I have found the programming sub-reddit to be especially useful as it is focused on less common programming languages such as Ruby, Python, Lisp and Haskell rather than the heavyweights (.NET and Java). My RSS subscription count has probably doubled since I discovered Reddit - it's an excellent source of worthwhile blogs.

The world wide web has had an incredible influence on my life and RSS has significantly changed how I consume the web's content. Do you ever stop to wonder "gee, how on earth did we ever do anything without the internet"? Well now I'm wondering how I ever got by without RSS.

Monday 31 March 2008

Ruby on Rails, IDEs and autocompletion

Over the last month I've been tinkering with a web development framework called Ruby on Rails. It's a relatively new framework (only a few years old), quite popular and very different to ASP.NET. My progress with Rails so far has been pitifully slow, for a few different reasons: I've managed to make things harder for myself by trying to learn the newest version (released Dec 2007), which limits the availability of appropriate tutorials and textbooks; I don't have much of an attention span at home and so I'm doing various other things at the same time; and there is one other reason that I'll discuss in a moment.

My original plan was to play with Rails from within Heroku, a Rails web host currently in beta that offers a browser based integrated development environment, but I soon realised that it was not appropriate for a newbie learning the ropes. I already had Ubuntu running on my laptop so I went through the process of installing Rails, but then there was still the matter of which IDE to use.

It's funny - as a .NET developer I never really had to worry about IDEs. Visual Studio is fantastic and there is little in the way of realistic alternatives. I've never wished for a radically different IDE, just enhancements to VS. But step outside of the world of Microsoft based development and suddenly there are many alternatives and little in the way of consensus on what is 'best'. The majority of Rails tutorial videos are done on a Macintosh using TextMate, but that is not an option for me. When it comes to Rails development on Linux, there are several options and for now I've settled on RadRails, a plugin for Aptana Studio, which seems to be some sort of fork of Eclipse.

OK, so why all this harping on about IDEs? Well it turns out that the biggest lesson I've learned so far isn't about web development, but about how I learn frameworks and APIs in general:

AUTOCOMPLETION
Known as IntelliSense in the Microsoft world, autocompletion is my bread and butter. I will use the documentation to find the necessary namespace and then explore it with autocompletion. Only now am I truly aware of how much I rely on it. If you asked most developers about autocompletion, they'd say "oh yeah, it's great, it saves heaps of typing". This is certainly true, but saved keystrokes is not where the real benefit lies for me: it's a discovery mechanism. I found it difficult learning Ruby without autocompletion (and still do).

It's an unfortunate fact that autocompletion is harder to implement and in general less useful for dynamically typed languages. Autocompletion for Rails would be particularly difficult to implement because so much of Rails is metaprogramming magic. I expect to talk about this in more detail in some future posts. For me, Ruby's impressive support for metaprogramming is its most exciting feature. Much of what makes Rails worthwhile relies on this. But it is this very feature that makes meaningful autocompletion so difficult to achieve, which makes it a double edged sword.

It's obvious to me that I need to learn to get by without my crutch. And learn I shall, because while RadRails claims to support autocompletion, I've been hitting a whole lot of ctrl+space and so far I'm not impressed.

Monday 17 March 2008

The Pragmatic Programmer

I just finished reading The Pragmatic Programmer: from journeyman to master by Andrew Hunt and David Thomas. I'll be straight with you: I bought it because it is listed in Steve Yegge's list of great books. Okay, that's not the only reason; there are plenty of other bloggers that also have great things to say about it. I hoped that it might be particularly relevant to me because sometimes I am too much of a purist/perfectionist and not pragmatic enough. In this regard I misunderstood the title - I expected the subject matter would touch on the tension between perfection and pragmatism, but this is not the case. Calling the book "The Smart Programmer" would be more exact, though probably not as marketable. All programmers think they are smart, so who would buy a book with a title like that?

This book is a bit odd. It reads more like a series of blog posts about good software development practice than a textbook. Part of the reason I say this is the sections are quite short and diverse - it never goes into a great deal of depth. Topics covered include error handling, debugging, testing, refactoring, gathering requirements, development environments, algorithms, prototyping and many more. But through all these topics, I can't really point out anything the book discussed that I wasn't already familiar with. Hence my previous comment about the title. It's about being a better and faster programmer - I don't equate this with pragmatism.

The reason The Pragmatic Programmer had so little to offer me is that I've already read so many blogs and articles that touch on the same subject matter. Good software developers analyze their own actions and constantly seek to improve their processes and expand their knowledge. If you've already been doing this for a few years, then this book probably won't offer you much beyond some reinforcement. I can't help but wonder whether I will be saying the same thing about Code Complete by Steve McConnell once I finally get around to reading it. It's an extremely popular book, often listed as the number one mandatory read for developers. Apparently much of the book is 'common sense' and that rings very true of The Pragmatic Programmer also.

Next book on my list is Domain Driven Design by Eric Evans. I was told to read this years ago but I was stupid and ignored the advice and now I'm worse off for it. But, it's never too late to set things straight - I'm a few chapters in and it's proving to be excellent.

Monday 10 March 2008

Google Analytics and accidentally generating yourself some traffic

Google Analytics is this great service that gives you all sorts of stats on your website. It's easy to set up - all I had to do was paste a bit of javascript into my template. Here's an example screenshot I took a couple of weeks ago when I was first considering writing a post on Analytics: it shows me getting a couple of visits a day, with most coming from Australia.

A few days ago, something new happened. I got some random comments on my post about gmail addresses (I say random because they weren't from people that know me). This is what was waiting for me when I checked Analytics: Hmmm... what the hell? Where did all these people come from? Well, Analytics is more than happy to tell me: the referring page was the original post (on the official gmail blog) I linked to about the gmail addresses. It turns out that at the bottom of the post, there is a list of links, and my blog post is listed. All automatic of course, but phew, what a way to generate yourself some traffic!

Friday 7 March 2008

A minor correction

Yesterday I proposed signing up to stuff that might spam me using the email address paul.batum.spam@gmail.com so that I could use a gmail filter to label it easily. That was supposed to be paul.batum+spam@gmail.com. Whoops! There's something else I didn't think of: I'm not sure if the + symbol is generally accepted when validating email addresses.

Thursday 6 March 2008

My way cooler email addresses

So it turns out I have an extremely large number of cool email addresses that I didn't even know about. Here is an email I just sent: And wheeee, here it is! For the full details go read the post on the official gmail blog. While coming up with crazy variants on your email address and giving them to people is funny, the linked post points out the real advantage of this behaviour: you can set up filters to label and process your email based on the address it was sent to. I often end up setting filters to hide or simply label certain email that isn't ACTUALLY spam - but now instead of having multiple filters I can just use paul.batum+spam@gmail.com to register for stuff and use a single filter. So what are you waiting for, send me an email at paul.batum+gmail=happy.geek@gmail.com to celebrate this discovery!

Wednesday 5 March 2008

Two months

It's been two months since I started writing this blog and in general I'm pretty happy with the way it's going. I have successfully made a significant change to how my free time is spent - I'm playing less games, reading more and programming more. The act of maintaining this blog is sustaining this significant change by making me feel accountable. Basically, the plan is working.

That said, I was pretty slack over the last two weeks and didn't post much. I'm annoyed that I let that happen - somehow I was telling myself that I didn't have anything to post about, but that was just a pack of lies and an attempt to justify a spell of minor laziness. The interesting fact is that I find much of what I'm trying to learn challenging, but writing blog posts of a satisfying quality (satisfying for me - I make no promises about them being satisfying for you) is probably the hardest task of the lot. This tiny set of paragraphs has taken me literally hours to write and it is this very fact that makes it so important I keep posting consistently. If it takes me hours to write a few hundred words about nothing at all, then imagine how long it will take me to write blog posts where I actually have a point to make! Hopefully by the time I start having ideas of communicable merit my writing will be sufficiently practiced that I can count the hours I need to write about the idea on one hand. Wow, that previous sentence took me like 5 minutes and it still doesn't make any sense.

Those of you that are reading this do so because you know me (if somehow I'm wrong and you are a random visitor, stop wasting time here and go read something of value - don't worry, I'll still be here when you come back in a year or two, and hopefully by then I will also have something valuable to offer). Hopefully I am proving to be at least mildly entertaining / interesting / motivating for you guys, but I am painfully aware of just how much I'm relying on the whole its-worth-reading-because-its-someone-I-know factor. Oh well, practice makes perfect.

Monday 25 February 2008

More adventures in the land of free operating systems

I didn't get much programming done this weekend because I spent most of my time screwing around with my Linux installation, much of which was just fixing things I'd managed to break somehow. My two goals were to switch to KDE and get some sort of remote desktop solution configured so that I could program on my laptop's Linux environment via my Windows desktop. I'm pleased to announce success on both counts!

It appears that virtually everything is customizable when running Linux and the desktop environment is no exception. As far as I can tell, the two most popular are GNOME and KDE. Ubuntu comes with GNOME but I wanted to try KDE because I had heard that it was more windowsy and shiny. It turns out there is something called Kubuntu, which is Ubuntu with KDE instead of GNOME, and so installing the kubuntu-desktop package did the trick. Surprisingly, you can actually have both installed simultaneously and choose which one you want when logging in! So far I prefer KDE's style and plan to stick with it for the time being. The two products differ on many levels beyond the aesthetic but I'm too much of a noob to appreciate most of them.

One of the significant aspects of the desktop environments is the extra software they come with such as email client, news client, contacts management, etc. The thing is, I really don't care about this stuff because I use this thing called a "web browser". This magical device is slowly but surely making desktop applications irrelevant. Annoyingly, all these extra apps are somehow dependencies of the kubuntu-desktop package and so they will be staying installed for now. I am sure there is some way to get rid of them without blowing the whole thing up but it will no doubt involve editing obscure configuration files and executing cryptic commands. Hang on, if I don't like that stuff, why the hell did I install Linux again?? I was asking myself this very question today when I was summarily dumped into a command line after installing the kubuntu-desktop package and then rebooting. So I rebooted again and was greeted by the command line again. Wait a minute, I was supposed to have TWO desktop environments, not zero! Various googling and typing of commands happened and somehow it got fixed - I forget the details.

So anyway, the other thing I managed to do this weekend was use some software called FreeNX to get a remote desktop solution up and running. I had to research for quite a while before I found a guide that was appropriate for my setup and not horribly out of date. Following the guide carefully, it all went very smoothly. I am happy to report that the Windows NX client operates very similarly to good 'ole mstsc.exe in terms of performance. Now I can do all my ruby/rails/android/etc development in Linux and do so from either machine. All this tinkering is time consuming but not without value. I am learning a lot and that is the primary goal, after all.

Sunday 24 February 2008

Status update: compilers and ruby on rails

So you probably noticed I haven't been posting much about the parsing stuff lately, and that's because I stopped working on it. I got to the point where Wikipedia + lecture notes wasn't quite cutting it, and I've been unable to find a resource that tackles the subject matter in a way I am comfortable with. A good textbook could make a big difference and there are a few out there that look promising, but my heart is not in it at the moment. I'm somewhat torn between wanting to finish the projects I start and making sure that I am enjoying the projects. However, the simple fact is that this plan of less video games and more programming/reading/learning is only sustainable if I am really enjoying the work. My intention is to come back to the compilers stuff in a few months and spend time looking at some of the programs out there that make compiler/interpreter development easier. Previously I avoided these because I wanted to learn the fundamentals, and I found this to be worthwhile. It was never my intention to implement an actual compiler or interpreter completely from scratch, although I had hoped to get somewhat further than I did.

In any case, the topic that holds my interest at the moment is a web development framework called Ruby on Rails. Rails interests me for a number of reasons, but I think the primary driver is frustration with ASP.NET at work. I have read about the model-view-controller pattern but never actually worked with it first hand, so I am curious about how it compares to developing with web forms. Rather than downloading and configuring Rails locally I am making use of Heroku - they provide free Rails hosting with an integrated development environment. I'm sure you will hear how I'm finding it soon enough.

Monday 18 February 2008

I installed Ubuntu on my laptop

If you read the first two posts on this blog you might notice a discrepancy. In the first post I identified that I only know how to use Windows, but if you take a look at the second post, I did not propose a remedy. I'll be honest - it's because I really did not want to learn Linux. I am unable to give a good reason for this, so how about a bad one: fear of change. The thing is, this entire project is about moving outside of my comfort zone, and learning Linux is no different, so I bit the bullet on Friday night and burned the ISO for Ubuntu 7.10. (Random aside - it was a CD ISO but I tried burning it onto a DVD, and surprisingly it worked! I was expecting a coaster.)

I was immediately impressed. I had no idea that you could boot and run some Linux distributions directly from the disc (perhaps some of you are rolling your eyes at my ignorance? If so, get used to it - there is plenty more where that came from!). The ability to preview an operating system without installing it certainly has some merits. I was able to establish that my wireless would work fine and investigate the options the installer gave me, all without making a commitment. I played around for an hour or so and then proceeded with the installation. I had the option to do a "guided" install which would resize some of my partitions, but it failed to inspire much confidence, so instead I did some reading and set the partitions up manually. Configuring dual boot required zero intervention on my part as it was all automatic, and best of all it worked first go.

As a basic user, I would say that the two most confusing things about Linux are user permissions and the file system. If I want to copy things to certain directories it seems I have to use a command line and execute the command as the root user. But what if I want to use the nice explorer-style GUI for doing this stuff? Figuring out which directories to actually put stuff in is also difficult, as there is no obvious Program Files equivalent. There are definitely things to like as well. For example, I like being able to install applications via a single command such as "sudo apt-get install sun-java6-jdk" (now I finally understand this xkcd, yay!). It's quite convenient! The small memory and disk space footprint is obviously great. I only set aside 15 GB for the Linux partition and it looks like it will be more than enough. There are some random quirks that I sort of expected when using a laptop without all the custom driver stuff. Last night I discovered that my brightness-up key doesn't work but the brightness-down key does - I ended up having to reboot the darn thing because my random mashing had made the screen so dark. I've done a bit of googling and it looks like there are pages out there with specific instructions for running Ubuntu on an A8Js, so I expect with time I'll be able to iron out most of the quirks.

In summary, this experience has been remarkably positive. It's great how easy it is to get a taste of the "other side" without making a commitment. If you are at all curious, why not just burn an ISO and pop it in your drive? I should have done this sooner!

Thursday 14 February 2008

Fixing the grammar

The test that was failing was this one:

def test_left_compound # "((500*20)+16)"
  tokens = [LeftParenToken.new(), LeftParenToken.new(), IntegerToken.new(500), MultiplicationToken.new(), IntegerToken.new(20), RightParenToken.new(), AdditionToken.new(), IntegerToken.new(16), RightParenToken.new()]
  assert(tokens_are_valid(tokens))
end
Why? Well, do what I did - try to build the expression ((500*20)+16) by hand using the rules. Here they are again:
E => (E) | nT
T => +E | -E | *E | /E | ε
Give up yet? It can't be done! Also, for the same reason, this grammar cannot generate (1)+2. The simple fact is that I botched the job of making the grammar LL(1) compatible. So how to fix it? Well, it's simple really, especially when you consider the (1)+2 example. It seems to me that a slight tweak to the production rules for E is necessary:
E => (E)T | nT
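To sanity-check the change, here is one derivation of (1)+2 that was impossible before - the trailing T is what lets a parenthesised expression be followed by an operator:

E => (E)T => (nT)T => (n)T => (n)+E => (n)+nT => (n)+n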
Great! Let's modify the lookup table accordingly:
lookup[:E][LeftParenToken] = [:lpar, :E, :rpar, :T]
The tests all pass! Now I can turn my attention to using the same algorithm for building the tree. However, I have some more reading to do before I can proceed. Come on CS164 lecture notes, don't let me down!

Monday 11 February 2008

Syntax checking with an LL(1) parser

After reading about some different types of parsers, I decided to have a go at LL(1). This means the parser works Left-to-right, generates the Leftmost derivation and looks one token ahead. The basic idea of LL(1) is to build a stack of symbols that represent the derived expression by using a lookup table that tells you which rule to apply based on the current symbol and current token. Not all context-free grammars can be parsed using LL(1), but I think the calculator's grammar can, with some modification. Let's break the work into some smaller steps:

  1. Modify the grammar I described last week to support LL(1) parsing.
  2. Modify the tokens_are_valid method to use LL(1) to verify whether the token stream is well formed. Given that I already have tests for tokens_are_valid, I will know I am on the right track once the existing tests pass with the new implementation.
  3. Add syntax tests for parentheses. Watch them fail.
  4. Modify the tokens_are_valid implementation to make the new tests pass. Once this is happening we are done with using LL(1) for syntax verification and can move on to using LL(1) to actually do the parsing.
  5. Modify the build_tree method to use LL(1) and watch the operator precedence tests fail because my grammar doesn't deal with the precedence issue.
  6. Fix the grammar to support operator precedence, make the corresponding changes to the lookup table and watch the tests pass.
  7. Add evaluation tests for parentheses. I think these babies should automatically work if I've done the other steps right.
In this post I will tackle steps 1, 2 and 3. Here is the grammar I described last week:
E => n | E+E | E-E | E*E | E/E | (E)
As it stands, this grammar will not work with LL(1) because to figure out which rule to apply, I would have to look at least 2 tokens ahead. For example, starting with E and given two different inputs "1" and "1+1", it's obvious that I need to use a different rule for each input, but both of them start with the token "1". This problem can be fixed by modifying the grammar. Here goes:
E => (E) | nT
T => +E | -E | *E | /E | ε
The symbol 'ε' seems to represent null or empty. Basically it means that the symbol T can just disappear without producing anything. Resuming the above example, now there is only one rule to apply: E => nT. In the "1" case, the T disappears. In the "1+1" case the T is expanded to +E.
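To make that concrete, here are the full derivations for both inputs under the modified grammar:

"1":   E => nT => n
"1+1": E => nT => n+E => n+nT => n+n

Here is the lookup table for this grammar: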
      n     +     -     *     /     (     )     $
E     nT                          (E)
T           +E    -E    *E    /E          ε     ε
The '$' represents the end of the input string. The gaps in the table represent syntax errors.

Writing the code

There are a few issues that need addressing when it comes to implementation. The most obvious one is that I need some way to represent the symbols in the above table. Well, it turns out Ruby has a Symbol class, though it was not designed to represent grammar symbols - Ruby symbols are just named constants that have no intrinsic value (as far as I can tell). They use the :symbol_name syntax, so for example E and T are now represented as :E and :T. I amended the existing token classes to expose their corresponding symbol:
class AdditionToken < OperatorToken
  def symbol
    return :plus
  end
end
I need a way to determine if a symbol is terminal or not, and there is a rather nice way to do this - simply defining an extra method on the Symbol class:
class Symbol
  def non_terminal?
    return (self == :T || self == :E)
  end
end
Just to be clear, Symbol is not my class, I'm just adding an extra method to an existing class. Executing :T.non_terminal? will return true. Next on the agenda is the end token. We don't strictly need an end token but it does make the algorithm simpler.
class EndToken
  def symbol
    return :end
  end
end
The lookup table can be done as a 2-dimensional hash:
def create_lookup_table()
  lookup = {}
  lookup[:E] = {}
  lookup[:E][IntegerToken] = [:int, :T]
  lookup[:E][LeftParenToken] = [:lpar, :E, :rpar]
  lookup[:T] = {}
  lookup[:T][AdditionToken] = [:plus, :E]
  lookup[:T][SubtractionToken] = [:minus, :E]
  lookup[:T][MultiplicationToken] = [:multiply, :E]
  lookup[:T][DivisionToken] = [:divide, :E]
  lookup[:T][RightParenToken] = []
  lookup[:T][EndToken] = []
  return lookup
end
As you can see, the table is first indexed with the non-terminal (:E or :T) and then with a token type. The result is an array of symbols, which may be empty. If an attempt is made to look up an undefined entry then nil is returned, and this is interpreted as a syntax error.
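A couple of concrete lookups might help (the comments show what irb reports back):

lookup = create_lookup_table()
lookup[:E][IntegerToken]   # => [:int, :T]  - expand E using the rule E => nT
lookup[:E][AdditionToken]  # => nil         - no rule: a '+' cannot start an expression

Right, with that out of the way, I can begin work on tokens_are_valid. The goal is to replace the old implementation with an LL(1) algorithm and hopefully all the existing tests will pass: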
def tokens_are_valid(tokens)
  tokens = tokens.dup
  symbols = [:E]
  lookup = create_lookup_table()

  # The symbol and token arrays can be used as stacks if the contents is first reversed.
  # An end token/symbol is appended to each before they are reversed.
  tokens.push(EndToken.new())
  tokens.reverse!
  symbols.push(:end)
  symbols.reverse!

  while(tokens.size > 0)
    current_symbol = symbols.pop() # The current symbol will always be consumed. Safe to pop.
    current_token = tokens.last    # The current token won't necessarily be consumed, cannot pop it yet.
    if(current_symbol.non_terminal?)
      result = lookup[current_symbol][current_token.class]
      return false if result.nil?
      # Push a reversed copy of the rule's symbols, then flatten the nested array away.
      symbols.push(result.reverse)
      symbols.flatten!
    else
      return false if current_symbol != current_token.symbol
      tokens.pop()
    end
  end

  return true
end
Hopefully this is relatively self-explanatory. The bit that may need some explanation is the push-then-flatten! pair at the end of the non-terminal branch. See, I am using arrays as stacks, but to do so I need to reverse the contents - I do this once at the beginning for the tokens and the symbols. However, it's also necessary to reverse the result of the table lookup for the same reason. Yes, I could have coded the table with the symbols backwards, but that is more confusing. Easiest just to push a reversed copy of the result onto the symbol stack. The only problem with doing this is that you end up with an array inside an array, which is why I call the flatten! method. In case you were wondering, Ruby has a convention of using the exclamation mark for methods that modify the instance you called the method on, rather than just returning a modified object. The following should demonstrate:
irb(main):001:0> x = [1,2,3]
=> [1, 2, 3]
irb(main):002:0> y = x.reverse
=> [3, 2, 1]
irb(main):003:0> x
=> [1, 2, 3]
irb(main):004:0> x.reverse!
=> [3, 2, 1]
irb(main):005:0> x
=> [3, 2, 1]
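While we are in irb, here is the push-then-flatten! trick from tokens_are_valid in isolation:

irb(main):001:0> symbols = [:end]
=> [:end]
irb(main):002:0> symbols.push([:lpar, :E, :rpar].reverse)
=> [:end, [:rpar, :E, :lpar]]
irb(main):003:0> symbols.flatten!
=> [:end, :rpar, :E, :lpar]

Note that :lpar ends up on top of the stack (the end of the array), ready to be popped and matched first.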
Well, my proposed change to tokens_are_valid works fine - all my tests are still passing! The next step is to write some tests that include parentheses. Here are a few (no need to bore you with the full set):
class ParenthesesSyntaxTests < Test::Unit::TestCase
  def test_surround_one_number # "(1)"
    tokens = [LeftParenToken.new(), IntegerToken.new(1), RightParenToken.new()]
    assert(tokens_are_valid(tokens))
  end

  def test_surround_operation # "(1+2)"
    tokens = [LeftParenToken.new(), IntegerToken.new(1), AdditionToken.new(), IntegerToken.new(2), RightParenToken.new()]
    assert(tokens_are_valid(tokens))
  end

  def test_left_compound # "((500*20)+16)"
    tokens = [LeftParenToken.new(), LeftParenToken.new(), IntegerToken.new(500), MultiplicationToken.new(), IntegerToken.new(20), RightParenToken.new(), AdditionToken.new(), IntegerToken.new(16), RightParenToken.new()]
    assert(tokens_are_valid(tokens))
  end

  def test_right_compound # "(16+(500*20))"
    tokens = [LeftParenToken.new(), IntegerToken.new(16), AdditionToken.new(), LeftParenToken.new(), IntegerToken.new(500), MultiplicationToken.new(), IntegerToken.new(20), RightParenToken.new(), RightParenToken.new()]
    assert(tokens_are_valid(tokens))
  end

  def test_empty_parens_error # "()"
    tokens = [LeftParenToken.new(), RightParenToken.new()]
    assert(!tokens_are_valid(tokens))
  end

  def test_wrong_order_error # ")1("
    tokens = [RightParenToken.new(), IntegerToken.new(1), LeftParenToken.new()]
    assert(!tokens_are_valid(tokens))
  end

  def test_no_operator_adjacent_to_paren_error # "1+2(3+4)"
    tokens = [IntegerToken.new(1), AdditionToken.new(), IntegerToken.new(2), LeftParenToken.new(), IntegerToken.new(3), AdditionToken.new(), IntegerToken.new(4), RightParenToken.new()]
    assert(!tokens_are_valid(tokens))
  end
end
All but one pass! Can you guess which one? It turns out this is a problem with my grammar, not the implementation. Next post I will discuss the failing test and what I have to do to fix it.

Saturday 9 February 2008

More on context free grammars

I started writing a post on the work I've done this week but soon realised that it would be useful to explain grammars a bit more before continuing. After all, it is possible that some of you went to the Wikipedia page for context-free grammars and were (like me) unable to curb the involuntary reflex to close the tab as soon as that stack of ugly math symbols came into view. Anyway, let's get to it. To make a grammar you need the following:

  1. One or more terminal symbols.
  2. One or more non-terminal symbols.
  3. A start symbol (must be non-terminal).
  4. Production rules that relate a non-terminal symbol to a set of symbols.
Terminal symbols are the output of the grammar and are usually written in lower case. Non-terminal symbols are temporary - they cannot appear in a generated string and are designed solely for consumption via production rules. They are usually written using upper case. Ok, so let's define a grammar:

Terminal(s): a, b
Non-terminal(s): S
Start symbol: S

The only thing left is the production rule(s):
S => aS | b
This rule says that the non-terminal S can be replaced with either aS or b. Given this fact, it should be apparent that this grammar produces strings that consist of zero or more a's and end with a b. Just to be clear, let me show you the 4 smallest strings that it can produce:
b, ab, aab, aaab
And now some strings that this grammar cannot produce:
a, ba, baab, abaab
So now that we have that clear, the question is: how would you write a program that returns true for the top 4 and false for the bottom 4? Ok, that's not the question - it's trivial. But given an arbitrary set of production rules... now that's much more tricky. In my previous post I proposed a grammar for my calculator and discussed some valid and invalid strings. From what I've read so far, it looks like getting an implementation that will return true for the valid strings and false for the invalid ones will actually take me 90% of the way to having an implementation for building a parse tree based on the grammar.
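Just to show what I mean by trivial, here is a throwaway sketch of a checker for this particular grammar (matches_grammar? is a hypothetical helper, not part of the calculator code):

def matches_grammar?(string)
  # Valid strings are zero or more a's followed by exactly one b.
  chars = string.split("")
  chars.pop == "b" && chars.all? { |c| c == "a" }
end

matches_grammar?("aaab") # => true
matches_grammar?("baab") # => false

Of course, the point of all the parsing machinery is to answer the same question for any set of production rules, not just this one.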