The Fantastic SUBST Function (part 2)

This post is part of a series of articles about LISP, a function called SUBST, Clojure, and other interesting stuff. You may want to read the previous post before continuing with this post.

Part 1: Introducing LISP

In part 1 of this series, we learned about the nine special forms of LISP, the building blocks that make up the language. In this part we will go through how the SUBST function works on a small tree.

When John McCarthy in 1960 wrote his famous paper on the programming language LISP, he used a particular function to illustrate what you could do with the language. The function was called SUBST, and it could go through any tree and replace all leaves that had one value with another value. Pretty-printed, it looked something like this:

(label subst
       (lambda (x y z)
               (cond ((atom? z) (cond ((eq? z y) x) ((quote t) z)))
                     ((quote t) (cons (subst x y (car z))
                                      (subst x y (cdr z)))))))

The function takes three arguments: x, y, and z. The x is the new value we want, the y is the value we want to replace, and the z is the tree. Let’s rewrite the function using better argument names. We’ll also use the short form for quote, which is the single quote character: '. Note that 't is how true was written in LISP in those days.

(label subst
       (lambda (to from tree)
               (cond ((atom? tree) (cond ((eq? tree from) to) ('t tree)))
                     ('t (cons (subst to from (car tree))
                               (subst to from (cdr tree)))))))

That’s better. Now, let’s test the SUBST function with the following tree:

We can describe the above tree as a list:

(1 2 (1 2 3) 4)

Now, let’s say that we want to replace 2 with 5 in the tree. We simply call SUBST with 2, 5, and the list above (quoted, of course):

(subst 2 5 '(1 2 (1 2 3) 4))
=> (1 5 (1 5 3) 4)

The resulting tree is:

In order to see what happened here, we need to look closer at the body of the SUBST function:

               (cond ((atom? tree) (cond ((eq? tree from) to) ('t tree)))
                     ('t (cons (subst to from (car tree))
                               (subst to from (cdr tree)))))))

As you might remember from part 1, the cond special form has a number of conditional/consequence pairs:

(cond (conditional1 consequence1)
      (conditional2 consequence2)
      ...
      (true fallback-consequence))

On line 3, we see the first conditional expression: (atom? tree), and on line 4 we see the fallback conditional expression: 't. If the input argument tree actually is an atom, the first conditional expression will evaluate to true, and the consequence expression, which happens to be a second cond, is evaluated. If the tree is not an atom, some cons expression is evaluated. Let’s focus on the fallback case first. After all, the initial call to subst is with a list.

(atom? z) is false

If tree is not an atom, then this expression will be evaluated:

(cons (subst to from (car tree))
      (subst to from (cdr tree)))

So what does this do? We learned in part 1 that the cons special form prepends its first argument to its second argument. We also know that the car special form takes the first element from its list argument, which is our tree. And we know that cdr takes the rest of the list. The first time, tree will be:

(1 2 (1 2 3) 4)

We can then rewrite the cons expression for the first round as:

(cons (subst to from 1)
      (subst to from '(2 (1 2 3) 4)))

Interesting. Now we have two calls to subst, where one actually is with an atom. So let’s look at that case.

(atom? z) is true

If tree is an atom, then this expression will be evaluated:

(cond ((eq? tree from) to) ('t tree)))

OK, so first cond will check whether tree is equal to from. If it is, to is returned. Otherwise, the fallback consequence is evaluated, which returns tree. In other words, if there is a match, we return the replacement, otherwise we return the original. For the first case in our cons expression earlier, tree will be the number 1. It’s not equal to our from argument, which is 2, so the 1 is returned unchanged. The cons expression can thus be simplified to:

(cons 1
      (subst to from '(2 (1 2 3) 4)))

Going back and forth

The second call to subst in our cons expression earlier looked like this:

(subst to from '(2 (1 2 3) 4))

Again, tree is not an atom, so it will go to the fallback branch, which will evaluate to:

(cons (subst to from 2)
      (subst to from '((1 2 3) 4)))

The first call to subst will enter the replacement algorithm, which we now know will try to replace the number 2 with the number 5:

(cons 5
      (subst to from '((1 2 3) 4)))

Together with the first cons, we now have:

(cons 1
      (cons 5
            (subst to from '((1 2 3) 4))))

Sub trees

One interesting case remains: How do we handle a sub-tree? If we continue our evaluation, we get:

(cons 1
      (cons 5
            (cons (subst to from '(1 2 3))
                  (subst to from '(4)))))

Here comes the sub-tree case:

(cons 1
      (cons 5
            (cons (cons (subst to from 1)
                        (subst to from '(2 3)))
                  (cons (subst to from 4)
                        (subst to from nil)))))
(cons 1
      (cons 5
            (cons (cons 1
                        (cons (subst to from 2)
                              (subst to from '(3))))
                  (cons 4
                        nil))))
(cons 1
      (cons 5
            (cons (cons 1
                        (cons 5
                              (cons (subst to from 3)
                                    (subst to from nil))))
                  (cons 4
                        nil))))
(cons 1
      (cons 5
            (cons (cons 1
                        (cons 5
                              (cons 3
                                    nil)))
                  (cons 4
                        nil))))

Which is the same as:

(1 5 (1 5 3) 4)

Recursion

As you can probably see by now, subst will successively chew its way through the original list, chop off the first element, send it to itself, check whether it’s an atom or not, and do the replacement if the atom matches the from argument. Each element will be cons’ed onto the ones that follow.

This concludes the second part of the series. It took more space than I anticipated, so the Clojure implementation of SUBST has to wait for the next part.

This Post Has One Comment

Leave a Reply