This post is part of a series of articles about LISP, a function called SUBST, Clojure, and other interesting stuff. You may want to read the previous post before continuing with this post.

In part 1 of this series, we learned about the nine special forms of LISP, the building blocks that make up the language. In this part we will go through how the SUBST function works on a small tree.

When John McCarthy in 1960 wrote his famous paper on the programming language LISP, he used a particular function to illustrate what you could do with the language. The function was called SUBST, and it could go through any tree and replace all leaves that had one value with another value. Pretty-printed, it looked something like this:

(label subst (lambda (x y z) (cond ((atom? z) (cond ((eq? z y) x) ((quote t) z))) ((quote t) (cons (subst x y (car z)) (subst x y (cdr z)))))))

The function takes three arguments: `x`

, `y`

, and `z`

. The `x`

is the new value we want, the `y`

is the value we want to replace, and the `z`

is the tree. Let’s rewrite the function using better argument names. We’ll also use the short form for `quote`

, which is the single quote character: `'`

. Note that `'t`

is how `true`

was written in LISP in those days.

(label subst (lambda (to from tree) (cond ((atom? tree) (cond ((eq? tree from) to) ('t tree))) ('t (cons (subst to from (car tree)) (subst to from (cdr tree)))))))

That’s better. Now, let’s test the SUBST function with the following tree:

We can describe the above tree as a list:

(1 2 (1 2 3) 4)

Now, let’s say that we want to replace 2 with 5 in the tree. We simply call SUBST with 2, 5, and the list above (quoted, of course):

(subst 2 5 '(1 2 (1 2 3) 4)) => (1 5 (1 5 3) 4)

In order to see what happened here, we need to look closer at the body of the SUBST function:

(cond ((atom? tree) (cond ((eq? tree from) to) ('t tree))) ('t (cons (subst to from (car tree)) (subst to from (cdr tree)))))))

As you might remember from part 1, the `cond`

special form has a number of conditional/consequence pairs:

(cond (conditional1 consequence1) (conditional2 consequence2) ... (true fallback-consequence))

On line 3, we see the first conditional expression: `(atom? tree)`

, and on line 4 we see the fallback conditional expression: `'t`

. If the input argument `tree`

actually is an atom, the first conditional expression will evaluate to `true`

, and the consequence expression, which happens to be a second `cond`

, is evaluated. If the tree is not an atom, some `cons`

expression is evaluated. Let’s focus on the fallback case first. After all, the initial call to `subst`

is with a list.

## (atom? z) is false

If `tree`

is not an atom, then this expression will be evaluated:

(cons (subst to from (car tree)) (subst to from (cdr tree)))

So what does this do? We learned in part 1 that the `cons`

special form prepends its first argument to its second argument. We also know that the `car`

special form takes the first element from its list argument, which is our `tree`

. And we know that `cdr`

takes the rest of the list. The first time, `tree`

will be:

(1 2 (1 2 3) 4)

We can then rewrite the `cons`

expression for the first round as:

(cons (subst to from 1) (subst to from '(2 (1 2 3) 4)))

Interesting. Now we have two calls to `subst`

, where one actually is with an atom. So let’s look at that case.

## (atom? z) is true

If `tree`

is an atom, then this expression will be evaluated:

(cond ((eq? tree from) to) ('t tree)))

OK, so first `cond`

will check whether `tree`

is equal to `from`

. If it is, `to`

is returned. Otherwise, the fallback consequence is evaluated, which returns `tree`

. In other words, if there is a match, we return the replacement, otherwise we return the original. For the first case in our cons expression earlier, `tree`

will be the number 1. It’s not equal to our `from`

argument, which is 2, so the 1 is returned unchanged. The `cons`

expression can thus be simplified to:

(cons 1 (subst to from '(2 (1 2 3) 4)))

## Going back and forth

The second call to `subst`

in our `cons`

expression earlier looked like this:

(subst to from '(2 (1 2 3) 4))

Again, `tree`

is not an atom, so it will go to the fallback branch, which will evaluate to:

(cons (subst to from 2) (subst to from '((1 2 3) 4)))

The first call to subst will enter the replacement algorithm, which we now know will try to replace the number 2 with the number 5:

(cons 5 (subst to from '((1 2 3) 4)))

Together with the first `cons`

, we now have:

(cons 1 (cons 5 (subst to from '((1 2 3) 4))))

## Sub trees

One interesting case remains: How do we handle a sub-tree? If we continue our evaluation, we get:

(cons 1 (cons 5 (cons (subst to from '(1 2 3)) (subst to from '(4)))))

Here comes the sub-tree case:

(cons 1 (cons 5 (cons (cons (subst to from 1) (subst to from '(2 3))) (cons (subst to from 4) (subst to from nil)))))

(cons 1 (cons 5 (cons (cons 1 (cons (subst to from 2) (subst to from '(3)))) (cons 4 nil))))

(cons 1 (cons 5 (cons (cons 1 (cons 5 (cons (subst to from 3) (subst to from nil)))) (cons 4 nil))))

(cons 1 (cons 5 (cons (cons 1 (cons 5 (cons 3 nil))) (cons 4 nil))))

Which is the same as:

(1 5 (1 5 3) 4)

## Recursion

As you can probably see by now, `subst`

will successively chew its way through the original list, chop off the first element, send it to itself, check whether it’s an atom or not, and do the replacement if the atom matches the from argument. Each element will be cons’ed onto the ones that follow.

This concludes the second part of the series. It took more space than I anticipated, so the Clojure implementation of SUBST has to wait for the next part.

Pingback: The Fantastic SUBST Function (part 1) | Jayway Team Blog - Sharing Experience