# Implement an Interpreter using Clojure Instaparse

05 Jul 2024

This is a small example of how to implement an interpreter using Clojure and the Instaparse library.

#### Dependencies

First we create a *deps.edn* file to get Rich Hickey’s Clojure, Mark Engelberg’s Instaparse, the Midje test suite by Brian Marick, and my modified version of Max Miorim’s midje-runner:
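A minimal sketch of what such a *deps.edn* could look like. The version numbers and the `:paths` entry are assumptions chosen for illustration, and the modified midje-runner is left out because its coordinates are not given here:

```clojure
;; deps.edn (sketch; versions and paths are assumptions, adjust as needed)
{:paths ["src" "resources"]
 :deps  {org.clojure/clojure   {:mvn/version "1.11.3"}
         instaparse/instaparse {:mvn/version "1.4.12"}
         midje/midje           {:mvn/version "1.10.10"}}}
;; The midje-runner dependency from the original setup is omitted here.
```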

#### Initial setup

Next we create a test suite with an initial test in *test/clj_calculator/t_core.clj*:

You can run the test suite as follows, which should give an error because the parser does not exist yet.

Next we create a module with the parser in *src/clj_calculator/core.clj*:

We also need to create an initial grammar in *resources/clj_calculator/calculator.bnf* defining a minimal grammar and a regular expression for parsing an integer:
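The following is a minimal sketch of such a parser. To keep it pasteable into a REPL, the grammar is inlined as a string instead of being loaded from the resource file, and the name `calc-parser` is an assumption rather than necessarily the name used in the repository:

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical name; the grammar is inlined here instead of being
;; read from resources/clj_calculator/calculator.bnf.
(def calc-parser
  (insta/parser
    "START  = NUMBER
     NUMBER = #'[-+]?[0-9]+'"))

;; The first rule is the start rule; the result is a Hiccup-style vector.
(calc-parser "-42")
;; => [:START [:NUMBER "-42"]]
```

With the grammar in a separate file, `insta/parser` can be given the result of `clojure.java.io/resource` instead of a string.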

At this point the first test should pass.

#### Ignoring whitespace

Next we add a test to ignore whitespace.

The test should fail with unexpected input. The grammar needs to be modified to pass this test:

Note the use of ‘<’ and ‘>’ to omit the parsed whitespace from the parse tree.
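As a sketch of how the hiding works, the grammar below wraps an optional `WHITESPACE` nonterminal in angle brackets on the right-hand side of the rule. The rule name `WHITESPACE`, its regular expression, and the parser name are assumptions made for this illustration:

```clojure
(require '[instaparse.core :as insta])

;; <...> on the right-hand side drops the matched text from the tree.
(def ws-parser
  (insta/parser
    "START      = <WHITESPACE?> NUMBER <WHITESPACE?>
     NUMBER     = #'[-+]?[0-9]+'
     WHITESPACE = #'\\s+'"))

(ws-parser " -42 ")
;; => [:START [:NUMBER "-42"]]
```

Without the angle brackets the whitespace would appear as extra `[:WHITESPACE " "]` nodes in the tree.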

#### Parsing expressions

Next we can add tests for sum, difference, or product of two numbers:

The grammar now becomes:
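One way such a grammar might look, again inlined as a string. The nonterminal names `SUM`, `DIFF`, and `PROD` and the hiding of the operator characters are assumptions of this sketch:

```clojure
(require '[instaparse.core :as insta])

;; Operators and whitespace are hidden, so each operation node
;; contains just its two NUMBER children.
(def expr-parser
  (insta/parser
    "START      = <WHITESPACE?> (NUMBER | SUM | DIFF | PROD) <WHITESPACE?>
     SUM        = NUMBER <WHITESPACE?> <'+'> <WHITESPACE?> NUMBER
     DIFF       = NUMBER <WHITESPACE?> <'-'> <WHITESPACE?> NUMBER
     PROD       = NUMBER <WHITESPACE?> <'*'> <WHITESPACE?> NUMBER
     NUMBER     = #'[-+]?[0-9]+'
     WHITESPACE = #'\\s+'"))

(expr-parser "1 + 2")
;; => [:START [:SUM [:NUMBER "1"] [:NUMBER "2"]]]
```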

#### Transforming syntax trees

Instaparse comes with a useful transformation function for recursively transforming the abstract syntax tree we obtained from parsing. First we write and run a failing test for transforming a string to an integer:

To pass the test we implement a calculator function which transforms the syntax tree. Initially it only needs to deal with the nonterminal symbols START and NUMBER:
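A minimal sketch of this step using `insta/transform`, which takes a map from node tags to functions and applies them bottom-up. The function name `calculate` and the number-only grammar are assumptions made to keep the example self-contained:

```clojure
(require '[instaparse.core :as insta])

(def number-parser
  (insta/parser
    "START      = <WHITESPACE?> NUMBER <WHITESPACE?>
     NUMBER     = #'[-+]?[0-9]+'
     WHITESPACE = #'\\s+'"))

;; Hypothetical function name. NUMBER nodes become integers,
;; START just passes its single child through.
(defn calculate [s]
  (insta/transform
    {:START  identity
     :NUMBER #(Long/parseLong %)}
    (number-parser s)))

(calculate " -42 ")
;; => -42
```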

#### Performing calculations

Obviously we can use the transformation function to also perform the calculations. Here are the tests for the three possible operations of the parse tree.

The implementation using the Instaparse transformation function is quite elegant:
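A sketch of how this might look: because the operator characters are hidden in the grammar, each operation node has exactly two numeric children after the inner nodes are transformed, so Clojure's `+`, `-`, and `*` can be used directly. Grammar and names are the same assumptions as before:

```clojure
(require '[instaparse.core :as insta])

(def op-parser
  (insta/parser
    "START      = <WHITESPACE?> (NUMBER | SUM | DIFF | PROD) <WHITESPACE?>
     SUM        = NUMBER <WHITESPACE?> <'+'> <WHITESPACE?> NUMBER
     DIFF       = NUMBER <WHITESPACE?> <'-'> <WHITESPACE?> NUMBER
     PROD       = NUMBER <WHITESPACE?> <'*'> <WHITESPACE?> NUMBER
     NUMBER     = #'[-+]?[0-9]+'
     WHITESPACE = #'\\s+'"))

;; Each tag maps to the function that is applied to its children.
(defn calculate [s]
  (insta/transform
    {:START  identity
     :NUMBER #(Long/parseLong %)
     :SUM    +
     :DIFF   -
     :PROD   *}
    (op-parser s)))

(calculate "2 * 3")
;; => 6
```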

#### Recursive Grammar

The next test is about implementing an expression with two operations.

A naive implementation using a blind EXPR nonterminal symbol passes the test:

However, there is a problem with this grammar: it is ambiguous. The following failing test shows that the parser can generate two different parse trees:
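The ambiguity can be observed with `insta/parses`, which returns a lazy sequence of all parse trees for an input. The grammar below is an assumed reconstruction of the naive approach, with a hidden `<EXPR>` nonterminal on both sides of each operator:

```clojure
(require '[instaparse.core :as insta])

;; Naive recursive grammar: EXPR may appear left and right of an operator.
;; <EXPR> on the left-hand side hides the EXPR tag in the parse tree.
(def ambiguous-parser
  (insta/parser
    "START      = <WHITESPACE?> EXPR <WHITESPACE?>
     <EXPR>     = SUM | DIFF | PROD | NUMBER
     SUM        = EXPR <WHITESPACE?> <'+'> <WHITESPACE?> EXPR
     DIFF       = EXPR <WHITESPACE?> <'-'> <WHITESPACE?> EXPR
     PROD       = EXPR <WHITESPACE?> <'*'> <WHITESPACE?> EXPR
     NUMBER     = #'[-+]?[0-9]+'
     WHITESPACE = #'\\s+'"))

;; More than one tree comes back, because the input can be read
;; as (1 - 2) - 3 or as 1 - (2 - 3).
(def parse-count
  (count (insta/parses ambiguous-parser "1 - 2 - 3")))
```

The two readings evaluate to different results (-4 and 2), so the ambiguity is not just a performance concern.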

When parsing small strings, this might not be a problem. However, if you use an ambiguous grammar to parse a large file with a syntax error near the end, the resulting combinatorial explosion leads to a long processing time before the parser can report the syntax error. Fortunately, Instaparse uses the GLL parsing algorithm, which can handle left-recursive rules, so we can resolve the ambiguity with a left-recursive grammar:
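A sketch of a left-recursive variant, assuming the same rule names as before: only the left operand of each operation is an `EXPR`, while the right operand must be a plain `NUMBER`, so operations associate to the left.

```clojure
(require '[instaparse.core :as insta])

;; Left-recursive grammar: recursion only on the left operand.
(def calc-parser
  (insta/parser
    "START      = <WHITESPACE?> EXPR <WHITESPACE?>
     <EXPR>     = SUM | DIFF | PROD | NUMBER
     SUM        = EXPR <WHITESPACE?> <'+'> <WHITESPACE?> NUMBER
     DIFF       = EXPR <WHITESPACE?> <'-'> <WHITESPACE?> NUMBER
     PROD       = EXPR <WHITESPACE?> <'*'> <WHITESPACE?> NUMBER
     NUMBER     = #'[-+]?[0-9]+'
     WHITESPACE = #'\\s+'"))

(defn calculate [s]
  (insta/transform
    {:START identity :NUMBER #(Long/parseLong %) :SUM + :DIFF - :PROD *}
    (calc-parser s)))

(count (insta/parses calc-parser "1 - 2 - 3"))
;; => 1
(calculate "1 - 2 - 3")
;; => -4, i.e. (1 - 2) - 3
```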

This grammar is no longer ambiguous and passes the test above.

#### Grouping using brackets

We might want to use brackets to group expressions and control the order in which operations are applied:

The following grammar implements this:
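One possible shape for such a grammar, as a sketch: a hidden `<GROUP>` nonterminal (a name assumed here) is either a number or a bracketed expression, and it takes the place of `NUMBER` as the right operand.

```clojure
(require '[instaparse.core :as insta])

;; GROUP is a NUMBER or a bracketed EXPR; brackets are hidden in the tree.
(def calc-parser
  (insta/parser
    "START      = <WHITESPACE?> EXPR <WHITESPACE?>
     <EXPR>     = SUM | DIFF | PROD | GROUP
     <GROUP>    = NUMBER | <'('> <WHITESPACE?> EXPR <WHITESPACE?> <')'>
     SUM        = EXPR <WHITESPACE?> <'+'> <WHITESPACE?> GROUP
     DIFF       = EXPR <WHITESPACE?> <'-'> <WHITESPACE?> GROUP
     PROD       = EXPR <WHITESPACE?> <'*'> <WHITESPACE?> GROUP
     NUMBER     = #'[-+]?[0-9]+'
     WHITESPACE = #'\\s+'"))

(defn calculate [s]
  (insta/transform
    {:START identity :NUMBER #(Long/parseLong %) :SUM + :DIFF - :PROD *}
    (calc-parser s)))

(calculate "1 - (2 - 3)")
;; => 2
(calculate "(1 + 2) * 3")
;; => 9
```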

A final consideration is operator precedence of multiplication over addition and subtraction. I leave this as an exercise for the interested reader ;)

#### Main function

Now we only need a main function to be able to use the calculator program.
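A minimal sketch of such a main function, assuming a read-evaluate-print loop over standard input. `read-line` returns `nil` at end of input, which ends the loop. The grammar and the `calculate` function are repeated from the assumptions above to keep the block self-contained:

```clojure
(require '[instaparse.core :as insta])

(def calc-parser
  (insta/parser
    "START      = <WHITESPACE?> EXPR <WHITESPACE?>
     <EXPR>     = SUM | DIFF | PROD | GROUP
     <GROUP>    = NUMBER | <'('> <WHITESPACE?> EXPR <WHITESPACE?> <')'>
     SUM        = EXPR <WHITESPACE?> <'+'> <WHITESPACE?> GROUP
     DIFF       = EXPR <WHITESPACE?> <'-'> <WHITESPACE?> GROUP
     PROD       = EXPR <WHITESPACE?> <'*'> <WHITESPACE?> GROUP
     NUMBER     = #'[-+]?[0-9]+'
     WHITESPACE = #'\\s+'"))

(defn calculate [s]
  (insta/transform
    {:START identity :NUMBER #(Long/parseLong %) :SUM + :DIFF - :PROD *}
    (calc-parser s)))

;; Read lines until end of input, printing each result.
;; For invalid input, the Instaparse failure object is printed instead.
(defn -main [& _args]
  (loop []
    (when-let [line (read-line)]
      (println (calculate line))
      (recur))))
```

For a quick check without a terminal, the loop can be driven with `with-in-str`, for example `(with-in-str "1 + 2\n" (-main))`.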

Now one can run the program as follows:

To exit the calculator, simply press CTRL+D.

See github.com/wedesoft/clj-calculator for source code.

Enjoy!