A Decimal to Roman Numeral converter in just a few lines

I recently rolled my own Decimal to Roman Numeral converter in Clojure. I was inspired by some tweets about a roman numeral code kata.

The Decimal to Roman Numeral converter takes a decimal number (like 2012) and returns a string of roman numerals (like "MMXII"). The code that I came up with is around ten lines that look like this:

(def mapping (sorted-map-by >  1000 "M" 900 "CM" 500 "D" 400 "CD"
                                100 "C"  90 "XC"  50 "L"  40 "XL"
                                 10 "X"   9 "IX"   5 "V"   4 "IV"
                                  1 "I"))

(defn roman-numeral [i]
  (loop [i i s ""]
    (if (zero? i)
      s
      (let [[[n c]] (filter #(>= i (first %)) mapping)]
        (recur (- i n) (str s c))))))

Let me try to explain it.

Overall, it’s pretty easy. There’s a loop in which we check whether the number i is zero or not. If it is, the current value of the string s is returned. If the number i is not zero, let something something is being done, and we take another whirl in the loop using recur, with the number i decremented by some n and the string s appended with some c.

There are a few things going on:

  • mapping
  • filter
  • destructuring
  • loop/recur

Let’s go through them in turn.

mapping

I need the mapping data structure to be a sorted map. Clojure has a literal syntax for hash maps, using curly braces (the => indicates what the code evaluates to):

{10 "X" 9 "IX" 5 "V" 4 "IV" 1 "I"}
=> {1 "I", 10 "X", 4 "IV", 5 "V", 9 "IX"}

Hash maps are not sorted, so obviously that won’t do. How about sorted-map? Then it will be sorted by key, ascending:

(sorted-map 10 "X" 9 "IX" 5 "V" 4 "IV" 1 "I")
=> {1 "I", 4 "IV", 5 "V", 9 "IX", 10 "X"}

I want it to be sorted descending, so I use sorted-map-by and pass in a Comparator. The Clojure predicate > (greater-than) implements java.util.Comparator, so I can just pass that:

(sorted-map-by > 10 "X" 9 "IX" 5 "V" 4 "IV" 1 "I")
=> {10 "X", 9 "IX", 5 "V", 4 "IV", 1 "I"}

filter

The filter function returns a sequence containing those elements that fulfill the given predicate. Here is the code that performs the filtering:

(filter #(>= i (first %)) mapping)

There are a few funny things going on here. Firstly, the predicate function to filter looks weird. Normally, anonymous functions in Clojure are written like this:

(fn [x]
  (>= i (first x)))

However, there is a shorter syntax that skips the (fn ...) and the named argument, using % (or %1, %2 if more than one) as the argument placeholder. This is equivalent to the above, only shorter:

#(>= i (first %))

The other thing that might look weird is the call to first. Wasn’t mapping a map? What is passed in to the predicate, really? Well, it turns out that it’s quite convenient to be able to treat a map as a sequence. So, Clojure has a literal syntax for a map as a logical sequence. We can see that if we call seq on it:

(seq (sorted-map-by > 10 "X" 9 "IX" 5 "V" 4 "IV" 1 "I"))
=> ([10 "X"] [9 "IX"] [5 "V"] [4 "IV"] [1 "I"])

Here we see that a map can logically be seen as a sequence of vectors containing key-value pairs. The filter predicate will be passed a vector containing a key-value pair. Therefore, first in the predicate will return the key. So, the filter will return all key-value pairs that have a key that is less than or equal to the number i. For example:

(filter #(>= 6 (first %)) mapping)
=> ([5 "V"] [4 "IV"] [1 "I"])

(filter #(>= 4 (first %)) mapping)
=> ([4 "IV"] [1 "I"])

destructuring

Clojure supports a convenient form of picking apart a data structure and naming its parts, where you would normally bind the whole data structure to a single variable. Suppose we know that our call to filter returns a sequence of numbers:

(filter odd? [1 2 3 4 5])
=> (1 3 5)

Say we’re only interested in the first and the second elements. Without destructuring, we would manually retrieve them like this:

(let [lst (filter odd? [1 2 3 4 5])
      fst (first lst)
      snd (second lst)]
  (println fst snd))
=> 1 3

With destructuring, we can directly get and name those parts in the let binding by placing the names in a vector, in which place we previously had just the variable name lst:

(let [ [fst snd] (filter odd? [1 2 3 4 5])]
  (println fst snd))
=> 1 3

Suppose now that the filter function instead returns a sequence of key-value pairs, as in our roman-numeral:

(filter #(>= 4 (first %)) mapping)
=> ([4 "IV"] [1 "I"])

Destructuring can be done with arbitrary depth, so if we want to: a) get the first element of a list of key-value pairs, and then: b) break that element apart and name the key n and the value c, we’ll write:

(let [ [ [n c] ] (filter ...)]
  (recur ...))

So, in just seven characters, [[n c]], we have taken the first element of the filter result, and also named that element’s parts n and c respectively. Thanks to the sorted map, the first element will be the highest matching number and its corresponding roman numeral.

loop/recur

Without going into too much details, loop and recur is Clojure’s way of working around the lack of tail call optimization in the JVM. We’ll use recur to perform a recursive call without any risk of emptying the stack. We’ll use loop to explicitly provide a recursion point with the number of variables we need, and to initialize those variables.

(defn roman-numeral [i]
  (loop [i i s ""]
    ...)

Here we say that we want two loop variables, i and s, where i will be initialized to the value of the parameter i, and s will be initialized to the empty string. Variables have lexical scope, so there’s no problem shadowing the parameter i with the looping variable i. Inside loop, i will always refer to the looping variable i. I could have used another looping variable, like j or k, but I didn’t.

(recur (- i n) (str s c))

Here we say that we want to call the recursion point with new values of i and s, where i will be the current value of i minus the n we got from destructuring the key-value pair from filter, and s will be the value of the current s concatenated with the c that we got from the destructuring.

I still don’t get the loop/recur thing…

Perhaps it’ll be more clear if I begin by defining a separate private helper function for the recursion, and just call it with some initial values from the roman-numeral function:

(defn- roman-helper [i s]
  (if (zero? i)
    s
    (let [[[n c]] (filter #(>= i (first %)) mapping)]
      (roman-helper (- i n) (str s c)))))

(defn roman-numeral [i]
  (roman-helper i ""))

The recursive call from roman-helper to itself will require stack space. We could change the recursive call to a recur instead, thus eliminating the risk for stack overflow. This works since recur will use the top of the function if there is no corresponding loop. Note that roman-helper has been replaced with recur:

(defn- roman-helper [i s]
  (if (zero? i)
    s
    (let [[[n c]] (filter #(>= i (first %)) mapping)]
      (recur (- i n) (str s c)))))

(defn roman-numeral [i]
  (roman-helper i ""))

We can place the helper function inside roman-numeral, as a local anonymous function that we’ll bind to a variable name in a let:

(defn roman-numeral [i]
  (let [roman-helper (fn [i s]
                       (if (zero? i)
                         s
                         (let [[[n c]] (filter #(>= i (first %)) mapping)]
                           (recur (- i n) (str s c)))))]
    (roman-helper i "")))

Remember that loop not only defines an explicit recursion point, but also initializes any loop variables. From here, it’s a small step to replace the helper function and the call to it with just a loop that initializes i and s:

(defn roman-numeral [i]
  (loop [i i s ""]
    (if (zero? i)
      s
      (let [[[n c]] (filter #(>= i (first %)) mapping)]
        (recur (- i n) (str s c))))))

Example: (roman-numeral 6)

Let’s run through an example where we call roman-numeral with 6.

The first round, i will be initialized to 6, and s will be initialized to "". Of course, i is not zero, so filter will be called and return ([5 “V”] [4 “IV”] [1 “I”]) . The let will destructure the first element into n=5 and c="V". Finally, recur will be called with (- 6 5)  and (str “” “V”) .

The next round in the loop will have i=1 and s="V". The call to filter will return ([1 “I”]) , recur will then be called with (- 1 1)  and (str “V” “I”) .

In the final round in the loop, i will be zero, and s will be returned, resulting in the string "VI", which is the correct roman numeral for 6.

This Post Has 7 Comments

  1. Steve Miner

    Just for reference, there’s also the built-in way:

    (defn roman-cl [n]
    (clojure.pprint/cl-format nil “~@R” n))

  2. Ulrik Sandberg

    Thanks for sharing your versions. The Common Lisp format replica looks pretty nice.

    Your optimized version seems to be around five times faster than both my version and cl-format.

  3. kawas

    Hi Ulrik,

    I like your very elegant solution.

    Speaking of optimization, I see couple of tricks that speed things up 100 fold.

    – Since you’re only using the first value of filter you may use ‘some’ and thus less destructuring :
    (let [[n c] (some #(when (>= i (first %)) %) mapping)] …)

    – Since your accumulating strings you may concat them all at the end using a vector for ‘s’, using ‘conj’ in ‘recur’ and applying ‘str’ to get the result.

    Then you got something still elegant and quick enough for the job

    (defn roman-numeral2 [i]
    (loop [i i s []]
    (if (zero? i)
    (apply str s)
    (let [[n c]
    (some #(when (>= i (first %)) %) mapping)]
    (recur (- i n) (conj s c))))))

    user=> (time (count (roman-numeral 98765432)))
    “Elapsed time: 12094.501546 msecs”
    98772
    user=> (time (count (roman-numeral2 98765432)))
    “Elapsed time: 124.735959 msecs”
    98772

    Cheers

Leave a Reply