The Fantastic SUBST Function (part 3)

This post is part of a series of articles about LISP, a function called SUBST, Clojure, and other interesting stuff. You may want to read the previous posts before continuing with this post.

Part 1: Introducing LISP
Part 2: Walking through SUBST

In part 1 of this series, we learned about the nine special forms of LISP, the building blocks that make up the language. In part 2, we went through how the SUBST function works on a small tree. In this part, we will show how SUBST can be written in a modern LISP: Clojure.

John McCarthy wrote in 1960 a paper on the programming language LISP. In the paper, he used the SUBST function to illustrate what you could do with the language. The function could go through any tree and replace all leaves that had one value with another value. Pretty-printed and with better argument names, it looked something like this:

(label subst
       (lambda (to from tree)
               (cond ((atom? tree) (cond ((eq? tree from) to) ('t tree)))
                     ('t (cons (subst to from (car tree))
                               (subst to from (cdr tree)))))))

If we want to replace 2 with 5 in a tree, we call SUBST with 2, 5, and the tree as a quoted list:

(subst 2 5 '(1 2 (1 2 3) 4))
=> (1 5 (1 5 3) 4)

Rewriting SUBST in Clojure

We will rewrite this function step by step in Clojure, which is a modern LISP that, among other things:

  • is dynamically typed
  • runs on the JVM
  • compiles on the fly to Java byte code
  • has literal support for lists, vectors, maps, and sets
  • has strong support for multithreading

Here are a few differences between Clojure and traditional LISP that we will take advantage of.

Better names

Clojure has changed some of the names (to the better, in my opinion), such as fn instead of lambda, def instead of label, first instead of car, and next instead of cdr.

Vectors instead of lists

The syntax for a literal vector in Clojure uses square brackets, []. A vector of three numbers looks like this:

[1 2 3]

In contrast with the list, which is treated as a call to the first element unless it’s quoted, a vector doesn’t need to be quoted. It always evaluates to itself:

[1 2 3]
=> [1 2 3]

The elements in a vector can be accessed by index in constant time. For these reasons, and also because square brackets visually stand out a little from all the parentheses in LISP, vectors are sometimes used instead of lists in Clojure. One such example is the argument list to a function:

(fn [x] (do stuff to x))

No pair lists in COND

The cond form in Clojure doesn’t need the conditional and the consequence paired up in lists. Clojure will partition these from a single sequence of conditional, consequence, conditional, consequence, etc. This means fewer parentheses in Clojure.


(cond (conditional1 consequence1)
      (conditional2 consequence2))

In Clojure:

(cond conditional1 consequence1
      conditional2 consequence2)

Keywords evaluate to “not false”

For the fallback conditional, LISP used 't. In Clojure, we can use anything that doesn’t evaluate to nil or false. A keyword is like a symbol, but it always evaluates to itself:

=> :foo

They are useful as keys in maps, for example, or to symbolize arbitrary field names such as :firstname, :species, :age, etc. And since they always evaluate to themselves, they certainly don’t evaluate to nil or false. Thus, the convention is to use the keyword :else as the fallback conditional in cond forms:

(cond conditional1 consequence1
      conditional2 consequence2
      :else fallback-consequence)

First Clojure version (not working)

Armed with this information, let’s take the first step towards a Clojure version of SUBST:

(def subst
     (fn [to from tree]
         (cond (atom? tree) (cond (eq? tree from) to :else tree)
               :else (cons (subst to from (first tree))
                           (subst to from (next tree))))))

We have two problems, though. One: there is no eq? form in Clojure, and two: there is no atom? form.

Second Clojure version (working)

Instead of the eq? form, Clojure provides the = function.

In the original LISP, there were just atoms and lists. Clojure has several collection types and the atom? form has lost its importance. Instead, there is a coll? form that returns true if the argument is some type of collection. Sort of the inverse of the atom? form. That’s good enough for us, but we must remember to enclose it in not.

We end up with this working Clojure version of SUBST:

(def subst
     (fn [to from tree]
         (cond (not (coll? tree)) (cond (= tree from) to :else tree)
               :else (cons (subst to from (first tree))
                           (subst to from (next tree))))))

We can actually paste this into a Clojure REPL, evaluate it, and run it. Check the Clojure Getting Started guide for more information. We will assume here that you have Leiningen installed. It’s the quickest way of getting a REPL running:

% lein repl
REPL started; server listening on localhost port 60145

It started a REPL and placed us in the default namespace, user. We can play with it a little. Let’s see if it can add. Type this in: (+ 1 2 3)

user=> (+ 1 2 3)

OK, it works. Now we paste the SUBST function definition into the REPL:

user=> (def subst
     (fn [to from tree]
         (cond (not (coll? tree)) (cond (= tree from) to :else tree)
               :else (cons (subst to from (first tree))
                           (subst to from (next tree))))))

OK, it evaluated the code in the current namespace, and returned the resulting reference to it: #'user/subst. Let’s try the function:

user=> (subst 5 2 '(1 2 (1 2 3) 4))
(1 5 (1 5 3) 4)

Third Clojure version

We can go further, though. Clojure has a defn macro that simplifies defining functions. It combines both the def and the fn into a single step:

(defn subst [to from tree] ...)

There is also an if form in Clojure. It’s a little simpler than cond, in that it only handles a single then-expression and a single (optional) else-expression:

(if conditional

We get this third Clojure version of SUBST:

(defn subst [to from tree]
      (if (not (coll? tree))
          (if (= tree from) to tree)
          (cons (subst to from (first tree))
                (subst to from (next tree)))))

It certainly gets better and better, don’t you think? Paste that into the REPL and try it.

Can we really improve this any further? Well, there is still one thing we can do, but I’ll save that for the next part.

This Post Has 5 Comments

  1. Kris

    One simple thing I’d do (in any language) is consider replacing:

    (if (not x)

    …with the positive form:

    (if x

    Positive assertions are generally shorter & easier to read. :-)

  2. Ulrik Sandberg

    I’m glad you noticed that. I actually refrained from doing this in order to keep the original structure, but I agree that it looks better.

    (defn subst [to from tree]
    (if (coll? tree)
    (cons (subst to from (first tree))
    (subst to from (next tree)))
    (if (= tree from) to tree)))

  3. Matheus

    There is no dedicated mliaing list yet, but general questions can be posted to the Clojure Google group or the #clojure irc channel, and of course I’m happy to answer questions by email (liebke at gmail).David

Leave a Reply