A Decimal to Roman Numeral converter in just a few lines

I recently rolled my own Decimal to Roman Numeral converter in Clojure. I was inspired by some tweets about a roman numeral code kata.

The Decimal to Roman Numeral converter takes a decimal number (like 2012) and returns a string of roman numerals (like "MMXII"). The code that I came up with is around ten lines that look like this:

(def mapping (sorted-map-by >  1000 "M" 900 "CM" 500 "D" 400 "CD"
                                100 "C"  90 "XC"  50 "L"  40 "XL"
                                 10 "X"   9 "IX"   5 "V"   4 "IV"
                                  1 "I"))

(defn roman-numeral [i]
  (loop [i i s ""]
    (if (zero? i)
      s
      (let [[[n c]] (filter #(>= i (first %)) mapping)]
        (recur (- i n) (str s c))))))

Let me try to explain it.

Overall, it’s pretty easy. There’s a loop in which we check whether the number i is zero or not. If it is, the current value of the string s is returned. If the number i is not zero, let something something is being done, and we take another whirl in the loop using recur, with the number i decremented by some n and the string s appended with some c.

There are a few things going on:

  • mapping
  • filter
  • destructuring
  • loop/recur

Let’s go through them in turn.

mapping

I need the mapping data structure to be a sorted map. Clojure has a literal syntax for hash maps, using curly braces (the => indicates what the code evaluates to):

{10 "X" 9 "IX" 5 "V" 4 "IV" 1 "I"}
=> {1 "I", 10 "X", 4 "IV", 5 "V", 9 "IX"}

Hash maps are not sorted, so obviously that won’t do. How about sorted-map? Then it will be sorted by key, ascending:

(sorted-map 10 "X" 9 "IX" 5 "V" 4 "IV" 1 "I")
=> {1 "I", 4 "IV", 5 "V", 9 "IX", 10 "X"}

I want it to be sorted descending, so I use sorted-map-by and pass in a Comparator. The Clojure predicate > (greater-than) implements java.util.Comparator, so I can just pass that:

(sorted-map-by > 10 "X" 9 "IX" 5 "V" 4 "IV" 1 "I")
=> {10 "X", 9 "IX", 5 "V", 4 "IV", 1 "I"}

filter

The filter function returns a sequence containing those elements that fulfill the given predicate. Here is the code that performs the filtering:

(filter #(>= i (first %)) mapping)

There are a few funny things going on here. Firstly, the predicate function to filter looks weird. Normally, anonymous functions in Clojure are written like this:

(fn [x]
  (>= i (first x)))

However, there is a shorter syntax that skips the (fn ...) and the named argument, using % (or %1, %2 if more than one) as the argument placeholder. This is equivalent to the above, only shorter:

#(>= i (first %))

The other thing that might look weird is the call to first. Wasn’t mapping a map? What is passed in to the predicate, really? Well, it turns out that it’s quite convenient to be able to treat a map as a sequence. So, Clojure has a literal syntax for a map as a logical sequence. We can see that if we call seq on it:

(seq (sorted-map-by > 10 "X" 9 "IX" 5 "V" 4 "IV" 1 "I"))
=> ([10 "X"] [9 "IX"] [5 "V"] [4 "IV"] [1 "I"])

Here we see that a map can logically be seen as a sequence of vectors containing key-value pairs. The filter predicate will be passed a vector containing a key-value pair. Therefore, first in the predicate will return the key. So, the filter will return all key-value pairs that have a key that is less than or equal to the number i. For example:

(filter #(>= 6 (first %)) mapping)
=> ([5 "V"] [4 "IV"] [1 "I"])

(filter #(>= 4 (first %)) mapping)
=> ([4 "IV"] [1 "I"])

destructuring

Clojure supports a convenient form of picking apart a data structure and naming its parts, where you would normally bind the whole data structure to a single variable. Suppose we know that our call to filter returns a sequence of numbers:

(filter odd? [1 2 3 4 5])
=> (1 3 5)

Say we’re only interested in the first and the second elements. Without destructuring, we would manually retrieve them like this:

(let [lst (filter odd? [1 2 3 4 5])
      fst (first lst)
      snd (second lst)]
  (println fst snd))
=> 1 3

With destructuring, we can directly get and name those parts in the let binding by placing the names in a vector, in which place we previously had just the variable name lst:

(let [ [fst snd] (filter odd? [1 2 3 4 5])]
  (println fst snd))
=> 1 3

Suppose now that the filter function instead returns a sequence of key-value pairs, as in our roman-numeral:

(filter #(>= 4 (first %)) mapping)
=> ([4 "IV"] [1 "I"])

Destructuring can be done with arbitrary depth, so if we want to: a) get the first element of a list of key-value pairs, and then: b) break that element apart and name the key n and the value c, we’ll write:

(let [ [ [n c] ] (filter ...)]
  (recur ...))

So, in just seven characters, [[n c]], we have taken the first element of the filter result, and also named that element’s parts n and c respectively. Thanks to the sorted map, the first element will be the highest matching number and its corresponding roman numeral.

loop/recur

Without going into too much details, loop and recur is Clojure’s way of working around the lack of tail call optimization in the JVM. We’ll use recur to perform a recursive call without any risk of emptying the stack. We’ll use loop to explicitly provide a recursion point with the number of variables we need, and to initialize those variables.

(defn roman-numeral [i]
  (loop [i i s ""]
    ...)

Here we say that we want two loop variables, i and s, where i will be initialized to the value of the parameter i, and s will be initialized to the empty string. Variables have lexical scope, so there’s no problem shadowing the parameter i with the looping variable i. Inside loop, i will always refer to the looping variable i. I could have used another looping variable, like j or k, but I didn’t.

(recur (- i n) (str s c))

Here we say that we want to call the recursion point with new values of i and s, where i will be the current value of i minus the n we got from destructuring the key-value pair from filter, and s will be the value of the current s concatenated with the c that we got from the destructuring.

I still don’t get the loop/recur thing…

Perhaps it’ll be more clear if I begin by defining a separate private helper function for the recursion, and just call it with some initial values from the roman-numeral function:

(defn- roman-helper [i s]
  (if (zero? i)
    s
    (let [[[n c]] (filter #(>= i (first %)) mapping)]
      (roman-helper (- i n) (str s c)))))

(defn roman-numeral [i]
  (roman-helper i ""))

The recursive call from roman-helper to itself will require stack space. We could change the recursive call to a recur instead, thus eliminating the risk for stack overflow. This works since recur will use the top of the function if there is no corresponding loop. Note that roman-helper has been replaced with recur:

(defn- roman-helper [i s]
  (if (zero? i)
    s
    (let [[[n c]] (filter #(>= i (first %)) mapping)]
      (recur (- i n) (str s c)))))

(defn roman-numeral [i]
  (roman-helper i ""))

We can place the helper function inside roman-numeral, as a local anonymous function that we’ll bind to a variable name in a let:

(defn roman-numeral [i]
  (let [roman-helper (fn [i s]
                       (if (zero? i)
                         s
                         (let [[[n c]] (filter #(>= i (first %)) mapping)]
                           (recur (- i n) (str s c)))))]
    (roman-helper i "")))

Remember that loop not only defines an explicit recursion point, but also initializes any loop variables. From here, it’s a small step to replace the helper function and the call to it with just a loop that initializes i and s:

(defn roman-numeral [i]
  (loop [i i s ""]
    (if (zero? i)
      s
      (let [[[n c]] (filter #(>= i (first %)) mapping)]
        (recur (- i n) (str s c))))))

Example: (roman-numeral 6)

Let’s run through an example where we call roman-numeral with 6.

The first round, i will be initialized to 6, and s will be initialized to "". Of course, i is not zero, so filter will be called and return ([5 “V”] [4 “IV”] [1 “I”]) . The let will destructure the first element into n=5 and c="V". Finally, recur will be called with (- 6 5)  and (str “” “V”) .

The next round in the loop will have i=1 and s="V". The call to filter will return ([1 “I”]) , recur will then be called with (- 1 1)  and (str “V” “I”) .

In the final round in the loop, i will be zero, and s will be returned, resulting in the string "VI", which is the correct roman numeral for 6.

This Post Has 7 Comments

  1. Just for reference, there’s also the built-in way:

    (defn roman-cl [n]
    (clojure.pprint/cl-format nil “~@R” n))

  2. Thanks for sharing your versions. The Common Lisp format replica looks pretty nice.

    Your optimized version seems to be around five times faster than both my version and cl-format.

  3. Hi Ulrik,

    I like your very elegant solution.

    Speaking of optimization, I see couple of tricks that speed things up 100 fold.

    – Since you’re only using the first value of filter you may use ‘some’ and thus less destructuring :
    (let [[n c] (some #(when (>= i (first %)) %) mapping)] …)

    – Since your accumulating strings you may concat them all at the end using a vector for ‘s’, using ‘conj’ in ‘recur’ and applying ‘str’ to get the result.

    Then you got something still elegant and quick enough for the job

    (defn roman-numeral2 [i]
    (loop [i i s []]
    (if (zero? i)
    (apply str s)
    (let [[n c]
    (some #(when (>= i (first %)) %) mapping)]
    (recur (- i n) (conj s c))))))

    user=> (time (count (roman-numeral 98765432)))
    “Elapsed time: 12094.501546 msecs”
    98772
    user=> (time (count (roman-numeral2 98765432)))
    “Elapsed time: 124.735959 msecs”
    98772

    Cheers

Leave a Reply

Close Menu