Playing With Syntax

Datetime:2016-08-23 04:41:06          Topic:          Share

One of the things I love about Lisp is that it gives you the ability to change and mold the syntax of the language to what you need. In this post I want to look at the evolution of a little macro I’ve been playing around with for a while now.

Mutation is a fundamental concept in most programming languages. Functional programming may be beautiful, but mutation is still useful (and fast). In most languages assignment is done with an = or := operator:

x = 10

In Common Lisp the operator is named setf instead of = for historical reasons, and of course it’s used in prefix notation:

(setf x 10)

Aside from the prefix ordering, Common Lisp’s syntax is already a bit more elegant because you can set arbitrarily many variables without repeating the assignment operator over and over again:

; Have to type out `=` three times
x = 10
y = 20
z = 30

; But only one `setf` required
(setf x 10
      y 20
      z 30)

setf assigns values one-by-one, in-order. Common Lisp also has psetf which is “parallel setf “. This evaluates all the values first, then sets all the variables at once:

(let ((a 10) (b 20))
  (setf a 50
        b (+ 1 a))   ; => (50 51)

  (psetf a 90
         b (+ 1 a))) ; => (90 52)

Assignment is nice, but often you want to change the value of a variable by taking its current value and computing something with it. If you’re making a ball that travels across the screen you might have something like this:

def move_ball(ball, distance):
    ball.x = ball.x + distance

(defun move-ball (ball distance)
  (setf (slot-value ball 'x)
        (+ (slot-value ball 'x)
            distance)))

If you find slot-value too wordy to type, you could make a macro to get rid of it (and save yourself a quote too). . is already used for something else so you’d need to pick a different character:

(defmacro $ (instance slot-symbol)
  `(slot-value ,instance ',slot-symbol))

(defun move-ball (ball distance)
  (setf ($ ball x)
        (+ ($ ball x)
            distance)))

In practice, though, you rarely use slot-value . Instead you usually use accessor functions:

(defun move-ball (ball distance)
  (setf (ball-x ball)
        (+ (ball-x ball)
            distance)))

Anyway, back to assignment. What we wrote above works , but it’s a bit annoying to have to type the “place” that we’re working with twice. To cut down on this repetition many languages offer operators like += and ++ :

x = x + 10
⬇
x += 10

y = y + 1
⬇
y++

Common Lisp’s version of these is called incf .

(incf x 10)
(incf y)

Notice how the s-expression syntax means we can use the same operator for both of the forms, instead of needing separate += and ++ operators.

Other languages often have similar “in-place” operators for other numeric operations like -= , -- , *= , and so on. Common Lisp has decf for subtraction, but doesn’t have versions for multiplication or division built in. But we can add them with a few macros:

(defmacro mulf (place factor)
  `(setf ,place (* ,place ,factor)))

(defmacro divf (place divisor)
  `(setf ,place (/ ,place ,divisor)))

Unfortunately these are not quite right — we’ve committed the cardinal macro-writing sin of splicing in the place expression multiple times. If place has side effects they’ll happen more times than expected.

Luckily, Common Lisp has a macro that wraps up the boring process of handling this for us. define-modify-macro lets us fix the problem:

(define-modify-macro mulf (factor) *)
(define-modify-macro divf (divisor) /)

Now we’ve got versions of *= and /= in Common Lisp. While we’re here we should do things right and allow divf to be used with just a place, which will mean “set place to 1/place “:

(define-modify-macro divf (&optional divisor)
  (lambda (value divisor)
    (if (null divisor)
      (/ 1 value)
      (/ value divisor))))

(let ((x 10))
  (divf x 2) ; => 5
  (divf x)   ; => 1/5
  (divf x))  ; => 5

Note that we don’t really need the 1 in (/ 1 value) , because Common Lisp’s / works the same way when you call it with a single argument. I’m not sure what behavior would make sense for a unary (mulf place) , so let’s ignore that one and move on.

So far we’ve been talking about the idea of setting a variable to the result of computing some function on its current value. We used define-modify-macro to define some support for extra functions, but defining a mutator macro for every function we might possibly want to use could get tedious. What if we generalize this a bit more?

(define-modify-macro callf (function)
  (lambda (value function)
    (funcall function value)))

(let ((x -1))
  (callf x #'abs)   ; => 1
  (callf x #'1+)    ; => 2
  (callf x #'sin)   ; => 0.9092974
  (callf x #'round) ; => 1
  (callf x #'-))    ; => -1

Now we’ve got something a bit higher-level. With callf we can mutate a variable with a unary function. But what about functions that need more arguments? Again, Common Lisp does the right thing and handles &rest parameters correctly in define-modify-macro :

(define-modify-macro callf (function &rest args)
  (lambda (value function &rest args)
    (apply function value args)))

(let ((x -1))
  (callf x #'abs)    ; => 1
  (callf x #'+ 10)   ; => 11
  (callf x #'* 2 4)) ; => 88

(let ((foo (list 0 1 2 3)))
  (callf foo #'reverse)
  ; => (3 2 1 0)

  (callf foo #'append (list -1 -2 -3)))
  ; => (3 2 1 0 -1 -2 -3)

That’s better. Now we can mutate a variable by applying a function to its current value and some other arguments. This might come in handy when we don’t want to bother defining a full-blown modification macro just to use it once or twice.

At this point we’ve got something similar to Michael Malis’ zap macro. A minor difference is for zap the order of the function and the place are reversed. I believe his reason was that it makes the resulting form look more like the function call that is conceptually being made:

(zap     #'+  x  5)
;          ↓  ↓  ↓
(setf x (  +  x  5  ))

Unfortunately, because the place is no longer the first argument we can’t use define-modify-macro to handle the single-evaluation boilerplate for us any more. Instead we need to use get-setf-expander and work with its results. Malis’ excellent Getting Places article explains how that works, and if you want to fully understand the rest of the code in this post you should read that one first.

The difference between callf and Malis’ zap are just cosmetic. But they’re still not completely flexible. What if we want to call a function, but we need the value to be an argument other than the first? Of course we could use a lambda :

(let ((x (list :coffee :soda :tea)))
  (callf x (lambda (l) (remove :soda l))))

But that is really ugly. It would be nice if we had a way to specify which argument to the function should be the current value. This is Lisp, so we can pick whatever syntax we want. What should we choose? Often a good first step is to look at how other languages do things. Clojure’s anonymous function syntax is one possibility:

(map #(+ 10 %) [1 2 3])
; => (11 12 13)

Clojure uses the magic symbol % as a placeholder for the argument. We can modify Malis’ zap macro to do the same. We’ll call it zapf , and we’ll swap the place back into the first argument like all the other modify macros, since the aesthetic reason no longer holds. The changes are in bold:

; Original version
(defmacro zap (fn place &rest args)
  (multiple-value-bind
        (temps exprs stores store-expr access-expr)
      (get-setf-expansion place)
    `(let* (,@(mapcar #'list temps exprs)
            (,(car stores)
              (funcall ,fn ,access-expr ,@args)))
       ,store-expr)))

; New version
(defmacro zapf (place fn &rest args)
  (multiple-value-bind
        (temps exprs stores store-expr access-expr)
      (get-setf-expansion place)
    `(let* (,@(mapcar #'list temps exprs)
            (,(car stores)
             (funcall ,fn ,@(substitute access-expr '% args))))
      ,store-expr)))

And now we can use % to say where the place’s value should go:

(let ((x (list :coffee :soda :tea)))
  (zapf x #'remove :soda %)
  ; => (:coffee :tea)

  (zapf x #'cons :water %)
  ; => (:water :coffee :tea)

  (zapf x #'append % (list :cocoa :chai)))
  ; => (:water :coffee :tea :cocoa :chai)

Note that substitute will replace all occurrences of % with the access expression, so we can do something like (zapf x #'append % %) if we want.

Some people will find this new version unpleasant, because it effectively captures the % variable. It’s not hygienic, by design. I personally don’t mind it, and I like the brevity it allows.

We’re almost done, but I want to push things just a bit further. Let’s revisit the original example of the ball moving across the screen:

def move_ball(ball, distance):
    ball.x = ball.x + distance

(defun move-ball (ball distance)
  (setf (ball-x ball)
        (+ (ball-x ball) distance)))

Of course we can simply use += or incf here:

def move_ball(ball, distance):
    ball.x += distance

(defun move-ball (ball distance)
  (incf (ball-x ball) distance))

Suppose we also wanted to make the ball wrap around the screen when it flies off one side. We can do this using modulus:

def move_ball(ball, distance, screen_width):
    ball.x += distance
    ball.x %= screen_width

(defun move-ball (ball distance screen-width)
  (incf (ball-x ball) distance)
  (callf (ball-x ball) #'mod % screen-width))

We could define modf if we wanted to make that second call a bit nicer. But let’s take a moment to reflect. Notice how we’re back to having to type out (ball-x ball) twice again. It’s better than typing it four times, but can we do better?

zapf currently takes a place, a function, and a list of arguments. What if we made it more like setf , and just have it take a place and an expression? We could still use % as a placeholder, but could let it appear anywhere in the (possibly nested) expression form.

One way to do this would be to simply replace substitute with subst in the zapf code. (The difference between these extremely-confusingly-named functions is that substitute replaces element of a sequence, and subst replaces elements of a tree.)

This, however, is the wrong strategy. Blindly replacing % everywhere in the expression will work for many cases, but will break in edge cases like (god help you) nested zapf forms. We could try to walk the code of the expression we get, but down this path lies madness and implementation-specificity.

The solution is much, much simpler. We can just use a normal let to capture % in the expression:

; Final version
(defmacro zapf (place expr)
  (multiple-value-bind
        (temps exprs stores store-expr access-expr)
      (get-setf-expansion place)
    `(let* (,@(mapcar #'list temps exprs)
            (,(car stores)
             (let ((% ,access-expr))
               ,expr)))
      ,store-expr)))

And now move-ball only requires the bare minimum of typing:

(defun move-ball (ball distance screen-width)
  (zapf (ball-x ball)
        (mod (+ distance %) screen-width)))

For completeness it would be nice to allow zapf to take any number of place/value pairs, just like setf . I’ll leave this as an exercise for you to try if you’re interested.

It took me a while to arrive at this form of the macro. I personally find it lovely and readable. If you disagree, that’s fine too! Lisp is a wonderful substrate for building a language that fits how you think and talk about problems. Play around with your own syntax and find out what feels most natural for you.

Before we leave, let’s just ponder one more thing. Performance is often ignored these days, but what if we care about making the computer go fast? The final zapf macro is handy and expressive, but what cost have we paid for this abstraction?

First let’s tell SBCL that we want to go fast and return to our original setf strategy:

(deftype nonnegative-fixnum ()
  `(integer 0 ,most-positive-fixnum))

(deftype positive-fixnum ()
  `(integer 1 ,most-positive-fixnum))

(defstruct ball
  (x 0 :type nonnegative-fixnum)
  (y 0 :type nonnegative-fixnum))

(declaim (ftype (function (ball fixnum positive-fixnum))
                move-ball))

(defun move-ball (ball distance screen-width)
  (declare (optimize (speed 3) (safety 0) (debug 0)))
  (setf (ball-x ball)
        (mod (+ distance (ball-x ball))
             screen-width)))

How does that turn out?

; disassembly for MOVE-BALL
; Size: 82 bytes. Origin: #x1005E5E12E
; 2E:       488B410D         MOV RAX, [RCX+13]       ; no-arg-parsing entry point
; 32:       48D1F8           SAR RAX, 1
; 35:       498D3C00         LEA RDI, [R8+RAX]
; 39:       488BDE           MOV RBX, RSI
; 3C:       48D1FB           SAR RBX, 1
; 3F:       4885DB           TEST RBX, RBX
; 42:       7431             JEQ L3
; 44:       488BC7           MOV RAX, RDI
; 47:       4899             CQO
; 49:       48F7FB           IDIV RAX, RBX
; 4C:       4C8BC0           MOV R8, RAX
; 4F:       488BC2           MOV RAX, RDX
; 52:       48D1E0           SHL RAX, 1
; 55:       4885C0           TEST RAX, RAX
; 58:       7510             JNE L2
; 5A: L0:   488BD8           MOV RBX, RAX
; 5D: L1:   4889590D         MOV [RCX+13], RBX
; 61:       488BD3           MOV RDX, RBX
; 64:       488BE5           MOV RSP, RBP
; 67:       F8               CLC
; 68:       5D               POP RBP
; 69:       C3               RET
; 6A: L2:   4885FF           TEST RDI, RDI
; 6D:       7DEB             JNL L0
; 6F:       488D1C30         LEA RBX, [RAX+RSI]
; 73:       EBE8             JMP L1
; 75: L3:   0F0B0A           BREAK 10                ; error trap
; 78:       07               BYTE #X07
; 79:       08               BYTE #X08               ; DIVISION-BY-ZERO-ERROR
; 7A:       FE9E03           BYTE #XFE, #X9E, #X03   ; RDI
; 7D:       FE9E01           BYTE #XFE, #X9E, #X01   ; RBX

Well that’s not too bad. I’m not sure why SBCL still performs a check for division by zero for the mod even though we said that screen-width can’t possibly be zero. But hey, we’re at a manageable amount of assembly, which is pretty nice for a high-level language!

Now for the zapf version:

(defun move-ball (ball distance screen-width)
  (declare (optimize (speed 3) (safety 0) (debug 0)))
  (zapf (ball-x ball)
        (mod (+ distance %) screen-width)))

Okay, let’s bite the bullet. How bad are we talking?

; disassembly for MOVE-BALL
; Size: 70 bytes. Origin: #x1005F47C7B
; 7B:       488B410D         MOV RAX, [RCX+13]       ; no-arg-parsing entry point
; 7F:       48D1F8           SAR RAX, 1
; 82:       488D3C03         LEA RDI, [RBX+RAX]
; 86:       4885F6           TEST RSI, RSI
; 89:       742B             JEQ L2
; 8B:       488BC7           MOV RAX, RDI
; 8E:       4899             CQO
; 90:       48F7FE           IDIV RAX, RSI
; 93:       488BD8           MOV RBX, RAX
; 96:       488D0412         LEA RAX, [RDX+RDX]
; 9A:       4885C0           TEST RAX, RAX
; 9D:       750D             JNE L1
; 9F: L0:   48D1E2           SHL RDX, 1
; A2:       4889510D         MOV [RCX+13], RDX
; A6:       488BE5           MOV RSP, RBP
; A9:       F8               CLC
; AA:       5D               POP RBP
; AB:       C3               RET
; AC: L1:   4885FF           TEST RDI, RDI
; AF:       7DEE             JNL L0
; B1:       4801F2           ADD RDX, RSI
; B4:       EBE9             JMP L0
; B6: L2:   0F0B0A           BREAK 10                ; error trap
; B9:       07               BYTE #X07
; BA:       08               BYTE #X08               ; DIVISION-BY-ZERO-ERROR
; BB:       FE9E03           BYTE #XFE, #X9E, #X03   ; RDI
; BE:       FE1E03           BYTE #XFE, #X1E, #X03   ; RSI

Wait, it’s actually shorter ? What about all those extra variables the zapf macro expands into? It turns out that SBCL is really quite good at optimizing let statements and local variables.

This is why I love Common Lisp. You can be rewriting the syntax of the language and working on a tower of abstraction one moment, and looking at X86 assembly to see what the hell the computer is actually doing the next. It’s wonderful.