Sunday, 20 January 2008

Discovering compilers

My first exercise for learning how compilers work is to write a program that takes a string containing a mathematical expression and evaluates the result e.g. "2+3*7" => 23. I will be implementing it using Ruby and will write unit tests as I go. Since my goal is to learn the underlying concepts of compiler construction, I am avoiding any libraries/functionality provided by Ruby that would make it too easy. The first implementation will ignore whitespace and accept strings containing integers and the 4 standard operators (plus, minus, multiply, divide). The second implementation will report syntax errors and accept decimal numbers and parentheses. According to my research on Wikipedia, this exercise can be broken into the following steps:

  1. lexical analysis (identifying the tokens that appear in the string)
  2. syntactic analysis (assembling the tokens into a tree)
  3. semantic analysis (evaluating the tree for a result)
I will implement the lexical analysis step using regular expressions. They are well suited for the task and I need the practice with them. The syntactic analysis is the tricky bit because I have to follow order of operations. The semantic analysis will be trivial as it simply requires a recursive algorithm executed on the root node of the tree. Next up: Implementing lexical analysis.

No comments:

Post a comment