12 Extending Languages

Goals

— syntax-parse

12.1 Functions on Syntax

Racket’s compiler is programmable. The most basic way to change the compiler is to define a compile-time function. This is done with define-syntax, which looks like an ordinary function definition.

regular function		compile-time function
(define (f1 x) 5)		(define-syntax (f x) #'5)

Why #'5 instead of 5? Compile-time functions must generate code, and the hash-quote is one way to generate code. It is just a short-hand for syntax.
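A quick way to convince yourself that hash-quote is only an abbreviation is to build the same syntax object both ways at run time. This is a minimal sketch; the names a and b are made up for the illustration.

```racket
#lang racket

;; #'5 and (syntax 5) are two spellings of the same form.
(define a #'5)
(define b (syntax 5))

(syntax? a)                                   ; a is a syntax object, not a number
(syntax->datum a)                             ; strip the wrapper to get the plain value
(equal? (syntax->datum a) (syntax->datum b))  ; both spellings carry the same datum
```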

So how do we use f? Like any other function, except that it is run at compile time, not at run time.

(f 10)

Try this with plain 5 as the function body.
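To see the difference between compile time and run time, it can help to print something in both places. The sketch below assumes a hypothetical macro noisy and a hypothetical function use-noisy; neither is part of the running example.

```racket
#lang racket

;; Hypothetical macro: the displayln in its body runs while the compiler
;; expands a use of noisy; only the generated code runs later.
(define-syntax (noisy stx)
  (displayln "expanding noisy")   ; happens at compile time
  #'(displayln "running"))        ; the code that gets generated

;; The use of noisy below is expanded once, when this definition is compiled.
(define (use-noisy)
  (noisy))

;; Each call only runs the generated code.
(use-noisy)
(use-noisy)
```

The expansion message should show up once, no matter how often use-noisy is called.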

The other new idea is that f takes any number of arguments:

(f)

(f 10 "hello world")

Weird? Let’s print the input to see what’s going on:

; Syntax -> Syntax
; generate the same code no matter what the argument
(define-syntax (f-display stx)
  (displayln stx)
  #'5)

(f-display)
(f-display 10)
(f-display 10 "hello world")

Insight: The argument of a compile-time function is the syntax tree labeled at the root with the name of the function.
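Another way to look at the input is to have the compile-time function hand it back as quoted data. This sketch uses a hypothetical macro show-input, which is not part of the notes above.

```racket
#lang racket

;; Hypothetical macro: expands into a quoted copy of its own input,
;; so the whole tree, including the macro's name, becomes a run-time list.
(define-syntax (show-input stx)
  #`(quote #,stx))

(show-input)
(show-input 10 "hello world")
```

In both uses the first element of the resulting list is the symbol show-input, which is the label at the root of the tree.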

Since Racket belongs to the Lisp family, there is a function that extracts the underlying list from the syntax tree. We can, for example, take its length and translate a syntax tree into code that represents that length. Note that the count includes the name of the function itself.

(define-syntax (g stx)
  (define n (length (syntax->list stx)))
  #`#,n)

(g)
(g a)
(g a b)

Why #`#,n? Well, hash-backquote also generates code. But unlike hash-quote, it allows unquoting. And you guessed it: hash-comma is unquote at the syntax level.
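The difference between the two quotes can be observed at run time, too. The following is a small sketch; the variable n and the names plain and filled are made up for this illustration.

```racket
#lang racket

(define n 3)

;; Under hash-quote, #,n is not an escape; it stays in the tree as (unsyntax n).
(define plain (syntax->datum #'(+ 1 #,n)))

;; Under hash-backquote, #,n is evaluated and its value is spliced into the code.
(define filled (syntax->datum #`(+ 1 #,n)))

plain
filled
```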

Syntax trees come with additional properties, and we can access those. For example, we can retrieve the line where the tree shows up or the column:

(define-syntax (i stx)
  (define l (syntax-line stx))
  (define c (syntax-column stx))
  #`(list "line and column info" #,l #,c))

(i 1)

And we can use Racket’s list functions to traverse the tree and generate a new one:

(require (for-syntax racket/list))

(define-syntax (world stx)
  (define expr (syntax-e stx))
  (define iden (second expr))
  (define code (list 'define iden "hello world, how are you?"))
  (datum->syntax stx code))

(world hello)
hello

We use syntax-e to extract the Racket list representation of the syntax tree; extract the identifier, which we happen to know to be in second position; create a list that looks like a definition; and translate that list into code with datum->syntax.
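The same steps can be replayed at run time on a quoted piece of syntax, which makes the intermediate values easy to inspect. This is only a sketch: the names stx, parts and code are chosen for the illustration, and the string "hi" is made up.

```racket
#lang racket

(define stx #'(world hello))

;; syntax-e unwraps one layer: a list whose elements are still syntax objects.
(define parts (syntax-e stx))
(andmap syntax? parts)

;; syntax->datum strips all layers and yields plain symbols.
(syntax->datum stx)

;; datum->syntax turns a list that mixes symbols, strings, and syntax objects
;; back into one syntax object, borrowing its context from stx.
(define code (datum->syntax stx (list 'define (second parts) "hi")))
(syntax->datum code)
```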

12.2 syntax-parse

How do real functions really take apart their arguments? Pattern matching, of course!

And syntax-parse is yet another embedded language for pattern matching in Racket. In contrast to match or Redex’s pattern matcher, it is tuned to help with syntax-processing functions.

(require (for-syntax syntax/parse))

(define-syntax (j stx)
  (syntax-parse stx
    ((_ x) #'(define x "j"))))

(j x)

x

(j y)

y

Morally, j is the same function as world. But look how much easier it is to write it down.

And even better, imagine someone doesn’t use it properly:

(j 1)

Now we get an error about define, even though it isn’t even visible. The developer needs to know that j was used the wrong way.

We can add annotations to tell the pattern matcher that the second part of the syntax tree must be an identifier or id for short:

(define-syntax (k stx)
  (syntax-parse stx
    ((_ x:id) #'(define x "k"))))

(k 1)

When a programmer uses k the wrong way, the error message explains the problem in terms of k, not in terms of the broken code that k would otherwise generate from a bad syntax tree.
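The id annotation is one of several syntax classes that come with syntax/parse; str (string literals) and nat (natural-number literals) work the same way. Here is a sketch with a hypothetical macro define-banner, which is not part of these notes.

```racket
#lang racket
(require (for-syntax syntax/parse))

;; Hypothetical macro: define name as the string text repeated n times.
;; The annotations reject uses where name is not an identifier,
;; text is not a string literal, or n is not a natural-number literal.
(define-syntax (define-banner stx)
  (syntax-parse stx
    [(_ name:id text:str n:nat)
     #'(define name (apply string-append (make-list n text)))]))

(define-banner rule "=-" 3)
rule
```

A use such as (define-banner rule 5 3) should then be rejected at compile time with a message phrased in terms of define-banner.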

We can even express this as a test:

(require rackunit syntax/macro-testing)

(check-exn #rx"k: expected identifier"
           (lambda () (convert-syntax-error (k 1))))

But let’s not get carried away.

Let’s build up an ML-style let-construct:

(local ((x1 e1)) e) and (local ((x1 e1) (x2 e2)) e) are just let
(local ((x1 e1) and (x2 e2)) e) makes x1 and x2 mutually recursive
(local ((x1 e1) in (x2 e2)) e) scopes x1 for e2

We will do so one step at a time:

(define-syntax (local stx)
  (syntax-parse stx
    ((_ ((x1:id e1:expr)) e)
     #'(let ([x1 e1]) e))
    ((_ ((x1:id e1:expr) (x2:id e2:expr)) e)
     #'(let ([x1 e1][x2 e2]) e))))

Next we need #:literals because we want the pattern to match the identifier and, and nothing else, in that position:

(define-syntax (local stx)
  (syntax-parse stx #:literals (and)
    ((_ ((x1:id e1:expr)) e)
     #'(let ([x1 e1]) e))
    ((_ ((x1:id e1:expr) (x2:id e2:expr)) e)
     #'(let ([x1 e1][x2 e2]) e))
    ((_ ((x1:id e1:expr) and (x2:id e2:expr)) e)
     #'(letrec ([x1 e1][x2 e2]) e))))

Here we reuse the binding of and from Racket because #:literals must use existing identifiers.
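To see what #:literals buys us, compare two patterns that differ only in that declaration. The macros no-literals and with-literals below are hypothetical and exist only for this comparison.

```racket
#lang racket
(require (for-syntax syntax/parse))

;; Without #:literals, and is an ordinary pattern variable and matches anything.
(define-syntax (no-literals stx)
  (syntax-parse stx
    [(_ a and b) #'(list 'a 'and 'b)]))

;; With #:literals, only the identifier and itself is accepted in the middle.
(define-syntax (with-literals stx)
  (syntax-parse stx #:literals (and)
    [(_ a and b) #'(list 'a 'and 'b)]))

(no-literals 1 2 3)       ; the 2 is bound to the pattern variable named and
(with-literals 1 and 3)   ; the middle position must be and
```

A use like (with-literals 1 2 3) should be rejected as a syntax error, because 2 is not the literal and.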

Unlike and, the identifier in has no binding in Racket. So we make a definition:

(define-syntax (in stx)
  (raise-syntax-error 'in "used out of context" stx))

(define-syntax (local stx)
  (syntax-parse stx #:literals (and in)
    ((_ ((x1:id e1:expr)) e)
     #'(let ([x1 e1]) e))
    ((_ ((x1:id e1:expr) (x2:id e2:expr)) e)
     #'(let ([x1 e1][x2 e2]) e))
    ((_ ((x1:id e1:expr) and (x2:id e2:expr)) e)
     #'(letrec ([x1 e1][x2 e2]) e))
    ((_ ((x1:id e1:expr) in (x2:id e2:expr)) e)
     #'(let* ([x1 e1][x2 e2]) e))))

(local ((x 2) (y 2)) (* x y))

(local ((x 2) in (y x)) (* x y))

(local ((f (λ (x) (g x))) and (g (λ (y) (if (= y 1) 2 (f (- y 1)))))) (f 3))

Let’s develop the equivalent of and without using and, i.e., conjunction. A simple version of conjunction deals with two expressions:

(conjunction #true #false)

(conjunction #true #true)

Only the second expression evaluates to true. At first glance, you may wish to define conjunction as a function that performs the usual Boolean operation, but take a look at this:

(conjunction #false (smt-solver problem-with-one-gzillion-variables))

We know that, even if this large-ish looking expression produces #true, the overall expression must return #false. Our implementation should short-cut the evaluation. In a call-by-value language, a plain function cannot do this because its arguments are evaluated before the call; a compile-time function can.

So here we go:

(define-syntax (conjunction stx)
  (syntax-parse stx
    [(_ lhs:expr rhs:expr) #'(if lhs rhs #false)]))

Alternatively,

(define-syntax (conjunction stx)
  (syntax-parse stx
    [(_ lhs:expr rhs:expr) #'(and-function lhs (lambda () rhs))]))

(define (and-function arg1 suspended-arg2)
  (if arg1 (suspended-arg2) #false))
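The short-cutting claim can be checked with a side effect. The sketch below assumes a hypothetical function expensive-check that counts how often it is called, plus a hypothetical function conjunction-fn that plays the role of the naive function version.

```racket
#lang racket
(require (for-syntax syntax/parse))

;; Counts how many times the "expensive" argument is actually evaluated.
(define calls 0)
(define (expensive-check)
  (set! calls (add1 calls))
  #true)

;; Naive function version: both arguments are evaluated before the call.
(define (conjunction-fn lhs rhs)
  (if lhs rhs #false))

;; Macro version: rhs is only evaluated if lhs is true.
(define-syntax (conjunction stx)
  (syntax-parse stx
    [(_ lhs:expr rhs:expr) #'(if lhs rhs #false)]))

(define result-fn (conjunction-fn #false (expensive-check)))
(define calls-after-fn calls)        ; the function call evaluated the argument

(define result-macro (conjunction #false (expensive-check)))
(define calls-after-macro calls)     ; the macro use did not evaluate it again
```

Both results are #false, but only the function version pays for the expensive argument.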

Let’s see how conjunction could cope with multiple expressions:

(define-syntax (conjunction stx)
  (syntax-parse stx
    [(_ lhs:expr rhs:expr) #'(if lhs rhs #false)]
    [(_ lhs:expr rhs:expr rhs2:expr ...) #'(if lhs (conjunction rhs rhs2 ...) #false)]))
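A sketch of a fully recursive variant follows. It expands into further uses of conjunction for the remaining expressions, so no use of and is left in the generated code. The clauses for zero and one expression are an assumption, modeled on how Racket’s own and treats those cases.

```racket
#lang racket
(require (for-syntax syntax/parse))

(define-syntax (conjunction stx)
  (syntax-parse stx
    ;; no expressions: vacuously true (assumption, mirrors (and))
    [(_) #'#true]
    ;; one expression: its value is the result (assumption, mirrors (and e))
    [(_ e:expr) #'e]
    ;; two or more: test the first, then recur on the rest
    [(_ lhs:expr rhs:expr ...) #'(if lhs (conjunction rhs ...) #false)]))

(conjunction)
(conjunction 1 2 3)
(conjunction #true #false (error "never evaluated"))
```

The last use does not raise an error, because the expansion stops evaluating at the first #false.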
