## Monday, 28 January 2008

### Evaluating the expression tree

Lets start off with some tests:

``````class EvaluateTests < Test::Unit::TestCase
def test_1
assert_equal(1, calculate("1"))
end

assert_equal(3, calculate("1+2"))
end

def test_subtract
assert_equal(3, calculate("7-4"))
end

def test_multiply
assert_equal(6, calculate("3*2"))
end

def test_divide
assert_equal(3, calculate("6/2"))
end

def test_negative_result
assert_equal(-1, calculate("1-2"))
end

assert_equal(14, calculate("4*3+2"))
assert_equal(14, calculate("2+3*4"))
assert_equal(10, calculate("4+3*2"))
end

def test_left_to_right
assert_equal(8, calculate("12-6+2"))
assert_equal(4, calculate("12/6*2"))
end
end
``````
Not exactly comprehensive, but its a start. I need to write a calculate method to tie together the work I've done over the last few weeks:
``````def calculate(expression)
tokens = tokenize(expression)
root_node = build_tree(tokens)
return root_node.evaluate()
end
``````
Evaluating the tree is obviously a recursive function. Here is the first bit:
``````class BinaryTreeNode
def evaluate()
return @token.evaluate(self)
end
end
``````
I decided the easiest way to do this is have the evaluate method take an argument: the tree node that contains the token. When evaluating an IntegerToken the value is simply returned, no need to use the argument:
``````class IntegerToken < Token
def evaluate(containerNode)
return @value
end
end
``````
For the OperatorToken, we want: evaluate(left child) [operator] evaluate (right child). There are lots of different ways to do this - a case statement is the simplest:
``````class OperatorToken
def evaluate(containerNode)
case @value
when "+"
return containerNode.lhs.evaluate() + containerNode.rhs.evaluate()
when "-"
return containerNode.lhs.evaluate() - containerNode.rhs.evaluate()
when "*"
return containerNode.lhs.evaluate() * containerNode.rhs.evaluate()
when "/"
return containerNode.lhs.evaluate() / containerNode.rhs.evaluate()
end
end
end
end
``````
My my isn't that boring. Another option would be to implement subclasses for OperatorToken - I wouldn't learn much from doing it though. Lets try something different instead:
``````class OperatorToken < Token

def initialize(value)
@value = value
@eval_method = Fixnum.instance_method(value)
case @value
when "+","-"
@major = false
when "*","/"
@major = true
end
end

def evaluate(containerNode)
bound_method = @eval_method.bind(containerNode.lhs.evaluate())
return bound_method.call(containerNode.rhs.evaluate())
end
end
``````
The three relevant lines of code are highlighted. But first, some background. Operators in Ruby are actually just methods. When I execute the line x = 2 + 3 I am calling the '+' method on the object '2' which is an instance of the class Fixnum. So what I am actually doing with that line is this: x = 2.+(3) . Classes in Ruby are just like normal objects, so you can call methods on them as I am doing in the first highlighted line. If the OperatorToken is constructed with "+", then I acquire an instance of the '+' method defined on the class Fixnum and store it as @eval_method. What I actually store is an instance of UnboundMethod - if you are a .NET developer, it might help to think of it as similar to MethodInfo as you can use it to invoke the method, but you will need to specify the instance you wish to invoke the method on. The second highlighted line does this - the left node is evaluated and the result is our bind target. The third line invokes the bound method and passes the result of evaluating the right node. The result is what we are after and is returned. Yes, this is much more complicated than a switch statement, and surely much slower. But I learned something, and hopefully so did you. Lets run my unit tests:
``````>ruby calculator.rb
Started
........................
Finished in 0.022 seconds.

24 tests, 37 assertions, 0 failures, 0 errors
>Exit code: 0
``````
I think its worth mentioning that I have sort of covered up some mistakes I made while implementing the tree building. I didn't initially write the left to right unit tests and so for a while my calculator could do "2+9/3*4" but not "9/3*4" (it returned zero rather than 12). I didn't notice this problem until I ran the stress tester:
``````def stress_test()
expressions = generate_expressions(100)
rejects = []
expressions.each do |exp|
rejects << exp if (eval(exp) != calculate(exp))
end
return rejects
end

def generate_expressions(count)
expressions = []
operators = ["+","-","*","/"]
count.times do
exp = (rand(100) + 1).to_s
rand(10).times do
exp << operators[rand(4)]
exp << (rand(100) + 1).to_s
end
expressions << exp
end
return expressions
end
``````
This baby will generate 100 expressions, each containing up to 10 terms between 1 and 100. It does not generate zeros because I would then have to deal with division by zero problems (I'll let that wait for version two). The eval method is not mine - it is a built in method that uses Ruby's own expression evaluation engine. This is really useful because I can pass an expression string into my program and compare the result with what Ruby came up with. If there is a discrepancy then I add it to the array of rejects and return them when done. The stress test revealed the aforementioned problem with left to right evaluation which I fixed, but it also raised something else. Consider this test:
``````def test_really_big_number
assert_equal(1000000000000000000001, calculate("1000000000000000000000+1"))
end
``````
Unfortunately, it doesn't pass:
``````>ruby calculator.rb
Started
........E................
Finished in 0.02 seconds.

1) Error:
test_really_big_number(EvaluateTests):
TypeError: bind argument must be an instance of Fixnum
calculator.rb:55:in `bind'
calculator.rb:55:in `evaluate'
calculator.rb:12:in `evaluate'
calculator.rb:118:in `calculate'
calculator.rb:209:in `test_really_big_number'

25 tests, 37 assertions, 0 failures, 1 errors
>Exit code: 1
``````
When a Fixnum gets larger than a certain size it changes into a Bignum. Unfortunately, this wreaks havoc with my evaluation code because my unbound method will only bind to a Fixnum, not a Bignum. There are some workarounds, a relatively simple one already springs to mind. However I don't think they are worth entertaining - going back to the case statement or something similar seems to be a better idea. Next post I clean up this issue and the code in general.