On the one hand, we’ve looked at modeling languages in Redex. On the other hand, we’ve started looking at implementing compile-time functions as a way of defining new pieces of a language. As we’ll see, you can use compile-time functions to define a whole new language within Racket. So, what’s the relationship between Redex models and compile-time functions?
Redex and compile-time functions reflect the two main, different ways to
implement a language in the realm of Racket. A Redex model gives you an
interpreter, while compile-time functions give you a compiler.
Whether an interpreter or a compiler is better depends on your goal. You may well want both: you can take a model as an interpreter and compile programs to a call to your interpreter, which gives you some of the benefits of both. We’ll see how to do that tomorrow morning.
Up to this point, we’ve written “compile-time function,” but we now refine the terminology to “macro” to reflect that we mean a particular kind of compile-time function.
Racket macros implement syntactic extensions, by which we mean that you have to learn specific rules for each macro that you might use, in a way that’s qualitatively different from having to learn the specific behavior of each library function that you might call. When you use a run-time function, you know that the rest of the program will run independently of the function as long as you don’t reach the call. More importantly, you know how argument expressions in the function call will behave. With a macro, you don’t know whether your program will even compile if you don’t know anything about the macro (i.e., you may not have the option of running the rest of the program), and there are no subexpressions within the macro use that have a meaning independent of the macro.
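To make the contrast concrete, here is a minimal sketch. The names run-time-second and compile-time-second are made up for illustration. Both "return their second argument," but only the function guarantees that the first argument expression is evaluated; the macro decides what happens to each subexpression.

```racket
#lang racket
(require (for-syntax syntax/parse))

;; A run-time function: both argument expressions are evaluated
;; before the body runs, whatever the body does with the values.
(define (run-time-second a b) b)

;; A macro: it receives the argument *syntax* and decides what to keep.
;; Here the first subexpression is dropped, so it is never evaluated.
(define-syntax (compile-time-second stx)
  (syntax-parse stx
    [(_ a b) #'b]))

;; The error expression disappears at compile time.
(define kept (compile-time-second (error "never evaluated") 'ok))
```

Calling (run-time-second (error "boom") 'ok) instead raises the error, because a function call always evaluates its arguments first.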
We’ve seen examples all week of how you have to learn special rules for the syntactic forms provided by Redex. Hopefully, it has also been clear why learning and using those special rules is worthwhile to more succinctly express program models. If you’re defining a language, then the concern of having to specify a form’s interaction with the rest of the language is the point, anyway.
While both macros and the implementation of a conventional compiler use compile-time functions (i.e., the compiler obviously runs at compile time), macros have the additional feature of being able to plug into different contexts and to cooperate with other, unknown language extensions. Along those lines, Racket macros offer a smooth path from simple syntactic abstractions to language extensions to whole language implementations.
To get a sense of why it’s possible to implement whole new languages with Racket macros, try running this program:

    #lang racket
    (require (for-syntax syntax/parse))

    (define-syntax (lambda stx)
      (syntax-parse stx
        [(_ (x:id) e:expr)
         #'(cons 'x e)]))

    (lambda (x) 10)
This example illustrates that there are no identifiers in Racket that are special as keywords that cannot be redefined. Instead, seemingly core parts of the language, including lambda, can be defined as macros.
Stop! What happens if you add (define (f x) 11) and (f 10) to the program?
When we define lambda as above, then the original
lambda becomes inaccessible. Sometimes that’s fine, but if
the intent of a new lambda is to extend the existing
lambda, we need a way to refer to the original, such as importing it under a different name:
    #lang racket
    (require (for-syntax syntax/parse)
             (only-in racket [lambda original-lambda]))

    (define-syntax (lambda stx)
      (syntax-parse stx
        [(_ (x:id) e:expr)
         #'(original-lambda (x)
             (printf "arg: ~s\n" x)
             e)]))

    (define f (lambda (x) 10))
    (f 2)
Importing the original lambda as original-lambda allows the new implementation of lambda to use it, but it also allows the rest of the module to use original-lambda. If we want to write programs that only have access to the new lambda, the best organization is to put the implementation of lambda in a module separate from the program that uses it.
Since we may want to use the original lambda in many ways to implement a language, and since that language implementation typically doesn’t want to use the new form directly, we usually rename on provide instead of on require:
    #lang racket
    (require (for-syntax syntax/parse))

    (provide (rename-out [new-lambda lambda]))

    (define-syntax (new-lambda stx)
      (syntax-parse stx
        [(_ (x:id) e:expr)
         #'(lambda (x)
             (printf "arg: ~s\n" x)
             e)]))
Exercise 31. Add a match clause (or several) to the new-lambda macro so that lambda shapes (trees) other than (lambda (x:id) e:expr) behave as before. Note: If you know more than basic Racket but not the whole language, just get some shapes to work, not all of lambda.
Exercise 32. Adjust "noisy-lambda.rkt" to make define create noisy functions, too, when it’s used in function-shorthand mode, like (define (f x) 11), as opposed to (define x 8) or (define f (lambda (x) 11)).
Although "noisy-lambda.rkt" provides a lambda to shadow the one initially provided by the racket language, we rely on a client program to require it within a #lang racket without renaming the new lambda to something else and without requiring any other modules that provide a variant of lambda. To take control more reliably, we’d like a plain #lang line that gives the program the new lambda directly.
A language name after #lang is responsible not only for providing a set of identifier bindings, but also for declaring how to parse the rest of the characters after #lang, and "noisy-lambda.rkt" does not yet do that.
A language name after #lang has to be just alphanumeric characters plus _ and -. It cannot have quote marks, like "noisy-lambda.rkt".
We can defer both of these constraints to an existing language
s-exp, which declares that the module content is
parsed using parentheses, and that looks for a module name to provide
initial bindings (using normal Racket string syntax) right after
The error is
module: no #%module-begin binding in the module’s language
We’ll need to tell you a little more to say why the error complains about #%module-begin, but the overall problem is that the module after s-exp is responsible for providing all bindings to be used in the module body, and not just things that differ from racket. Our example program needs, in addition to lambda, the define form, number constants, function application, and module-body sequencing. Let’s define "noisy-racket.rkt" to provide our new lambda plus all the non-lambda bindings of racket.
    #lang racket
    (require (for-syntax syntax/parse))

    (provide (rename-out [new-lambda lambda])
             (except-out (all-from-out racket) lambda))

    (define-syntax (new-lambda stx)
      (syntax-parse stx
        [(_ (x:id) e:expr)
         #'(lambda (x)
             (printf "arg: ~s\n" x)
             e)]))
Then we can use it as
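A client file would normally start with #lang s-exp "noisy-racket.rkt". As a self-contained sketch that fits in one file, the following uses submodules instead. That packaging is an assumption made for illustration, as are the submodule names noisy-racket and program. The module-language position of the program submodule plays the role of the name after s-exp.

```racket
#lang racket

;; Stand-in for "noisy-racket.rkt": all of racket, but with a noisy lambda.
(module noisy-racket racket
  (require (for-syntax syntax/parse))
  (provide (rename-out [new-lambda lambda])
           (except-out (all-from-out racket) lambda))
  (define-syntax (new-lambda stx)
    (syntax-parse stx
      [(_ (x:id) e:expr)
       #'(lambda (x)
           (printf "arg: ~s\n" x)
           e)])))

;; Stand-in for the client program: its language is the module above,
;; so lambda here means new-lambda, while define, numbers, and
;; function application come from racket.
(module program (submod ".." noisy-racket)
  (provide f)
  (define f (lambda (x) 10)))

(require 'program)
(f 2) ; the noisy lambda prints its argument before returning the body's value
```

Because the program submodule gets every binding from its language, removing the except-out clause's counterpart (the re-export of racket) would leave it without define or #%module-begin.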
Triggering syntactic extensions by name allows different extensions to be composed in a natural way, since each has its own trigger. Still, Racket has several forms where you don’t use a name. For example, 5 by itself is normally treated as a literal number, instead of requiring the programmer to write (quote 5). Similarly, assuming that f has a variable binding, (f 1 2) is a function call without something before the f to say “this is a function call.” In many of these places, you might want to extend or customize a language, even though there’s no apparent identifier to bind.
To support extension and replacement, Racket’s macro expander treats several kinds of forms as having an implicit use of a particular identifier:
    (f 1 2)  =  (#%app f 1 2)
Why does #lang correspond to two implicit names? Because the first one, module, can’t be configured. The second one, #%module-begin, applies after the first one has imported the #%module-begin binding, so its meaning can be configured.
We couldn’t use "noisy-lambda.rkt" as a module-language module, because it doesn’t export #%module-begin. By exporting everything from racket except lambda, "noisy-racket.rkt" provides #%module-begin, #%app, and #%datum, all of which are used implicitly in "program.rkt".
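As a small check of the implicit identifiers, the following sketch writes the same expression twice in plain #lang racket: once in ordinary notation and once with #%app and #%datum spelled out. The variable names implicit and explicit are just for illustration.

```racket
#lang racket

;; Ordinary notation: the expander supplies #%app and #%datum implicitly.
(define implicit (+ 1 2))

;; The same expression with the implicit identifiers written out.
;; (#%datum . 1) is the explicit form of the literal 1.
(define explicit (#%app + (#%datum . 1) (#%datum . 2)))

(equal? implicit explicit)
```

Both definitions expand to the same core form, which is why a module language that exports its own #%app or #%datum changes the meaning of every call or literal in a client module.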
Exercise 35. Racket’s #%app implements left-to-right evaluation of function-call arguments. Change "noisy-racket.rkt" so that it implements right-to-left evaluation of arguments to a function call. You’ll need to use Racket’s #%app to implement your new #%app.
    (define-syntax (macro-id stx)
      (syntax-parse stx
        [(_ pattern ....) #'template]))

is common enough that it would be nice to have a shorter way of writing it. Fortunately, we’re in a language that’s easy to extend with a shorthand like define-syntax-rule, which lets you write the above form equivalently as
(define-syntax-rule (macro-id pattern ....) template)
For historical reasons, the allowed pattern forms are restricted in that they cannot include identifiers that have : followed by a syntax-class name, as in x:id. Also, the error messages are worse, so define-syntax-rule is normally used only for temporary or internal extensions.
There’s also an intermediate point, which avoids writing an explicit lambda but allows multiple patterns:
    (define-syntax macro-id
      (syntax-rules ()
        [(_ pattern ....) template]))
Finally, you may see syntax-case, which is almost the same as syntax-parse, but it has the pattern-language restrictions of define-syntax-rule and syntax-rules. There’s little reason to use syntax-case over syntax-parse, other than the minor convenience of having it included in racket (again, for historical reasons).
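To compare the notations side by side, here is a sketch that defines the same swapping macro three ways. The names swap-a!, swap-b!, swap-c!, and demo are invented for this example. Only the syntax-parse version can use the x:id annotation, so only it rejects non-identifier arguments with a targeted error message.

```racket
#lang racket
(require (for-syntax syntax/parse))

;; 1. Explicit compile-time function with syntax-parse.
(define-syntax (swap-a! stx)
  (syntax-parse stx
    [(_ x:id y:id)
     #'(let ([tmp x]) (set! x y) (set! y tmp))]))

;; 2. The define-syntax-rule shorthand: one pattern, no syntax classes.
(define-syntax-rule (swap-b! x y)
  (let ([tmp x]) (set! x y) (set! y tmp)))

;; 3. syntax-rules: no explicit function, but room for several clauses.
(define-syntax swap-c!
  (syntax-rules ()
    [(_ x y) (let ([tmp x]) (set! x y) (set! y tmp))]))

;; Three swaps in a row leave the two variables exchanged once overall.
(define (demo)
  (let ([p 1] [q 2])
    (swap-a! p q)
    (swap-b! p q)
    (swap-c! p q)
    (list p q)))
```

All three behave the same on well-formed uses; they differ in how much checking and error reporting you get for ill-formed ones.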
    #lang racket

    (define-syntax-rule (noisy-begin e ... last-e)
      (begin
        (printf "~s\n" e) ...
        (let ([result last-e])
          (printf "~s\n" result)
          result)))

    (let ([result 1])
      (noisy-begin
       result
       2))
Racket’s macro system can infer binding structure for macros based on the way that macros ultimately expand. Specifically, the example macro above expands to let, and the expander knows the binding structure of let, so it can effectively infer a binding rule for the example macro. But you know that the define-syntax-rule form is just a shorthand for a compile-time function, which can do arbitrary things... mumble mumble halting problem mumble... so this inference is not as straightforward as, say, type inference. In fact, the inference works dynamically (at compile time). The details are beyond the scope (pun intended) of this summer school, but see these notes if you’re interested.
When you run a program in DrRacket, you get to interact with the
program after it runs. The interactive prompt is sometimes called the
top level, because you have access to all the bindings that
are at the outer scope of your module, while nested bindings are
inaccessible. Interactive evaluation is similar to adding additional
definitions and expressions to the end of your program.
Since making interactive evaluation sensible with respect to a module’s content depends on the module’s language, a #%top-interaction form is implicitly used for each interaction. A replacement #%top-interaction might disallow definitions, or it might combine an expression’s processing with information (such as types) that is recorded from the module body.
    #lang racket

    (define-syntax-rule (#%top-interaction . e)
      '("So, you want to evaluate..." e "?"))
14.8 #lang and Installed Languages
We mentioned in Controlling the Whole Language that the language named after #lang must have two properties: it must take responsibility for parsing the rest of the characters in the module, and it must be accessible by a name that doesn’t involve quote marks.
To make the module accessible without quote marks, it needs to reside in a directory that is registered with Racket as a collection. More specifically, we normally register the directory as a package, and the default treatment of a package (unless the package says otherwise) is to use its directory as a collection.
Create a directory named "noisy" somewhere on your filesystem. (Make the name "noisy" so that it matches our examples.) Then choose Package Manager... from DrRacket’s File menu, click Browse... near the top left of the resulting window, answer Directory, and pick your "noisy" directory. Finally, click Install.
You can also use the command line by cd-ing to the parent of the "noisy" directory and running

    raco pkg install noisy/

Don’t omit the final /, which makes it a directory reference instead of a request to consult the package server.
Now, create a "main.rkt" file in your "noisy" directory. (The name "main.rkt" is special.) Put the content of "noisy-racket.rkt" in "main.rkt".
It still won’t work if you now try
because we’ve only addressed one of the problems—
(module reader syntax/module-reader noisy)
This declaration creates a reader submodule in the "main.rkt" module, and #lang noisy looks for a submodule by that name in the "main.rkt" module of the "noisy" collection.
This reader submodule is implemented using the language syntax/module-reader, which is a language specifically for making module parsers. The #%module-begin form of the syntax/module-reader module looks for a single identifier to be injected as the language of the parsed module; in this case, we use noisy to refer back to the "main.rkt" module of the "noisy" collection, which is back to the enclosing module.
will run and print 5. It happens that
would run and print the same way, just using the parser via s-exp instead of the reader submodule.
14.9 #lang and Parsing
If the point of creating and installing "noisy/main.rkt" is that we can use the short reference #lang noisy, then we’re done. If the point is to change parsing, then we need to override the default parser provided by syntax/module-reader.
A parser comes in two flavors: read-syntax and read. The read flavor is essentially legacy, but a parser submodule must provide it, anyway, even if just by using read-syntax and stripping away “syntax” information to get a “datum.” The read flavor takes an input stream, while the read-syntax flavor takes a source-file description (usually a path) plus an input stream.
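The difference between the two flavors can be seen with Racket’s built-in readers on a string port. This is only a sketch; the source name demo-source and the variable names are placeholders chosen for the example.

```racket
#lang racket

(define text "(+ 1 2)")

;; read: takes an input port and produces a plain datum.
(define as-datum (read (open-input-string text)))

;; read-syntax: takes a source description plus an input port and
;; produces a syntax object that remembers where it came from.
(define as-syntax (read-syntax 'demo-source (open-input-string text)))

(syntax? as-syntax)          ; the result is a syntax object
(syntax-source as-syntax)    ; the source description passed in
(syntax->datum as-syntax)    ; stripping syntax information gives the datum
```

The last line shows the usual way to get the read flavor from the read-syntax flavor: call it, then strip the result with syntax->datum.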
Instead of writing a parser from scratch, which can be tedious, let’s use the built-in read-syntax and just configure it to read decimal numbers as exact rationals instead of inexact floating-point numbers:
    (module reader syntax/module-reader
      noisy
      #:read-syntax my-read-syntax
      #:read (lambda (in)
               (syntax->datum (my-read-syntax #f in)))

      (define (my-read-syntax src in)
        (parameterize ([read-decimal-as-inexact #f])
          (read-syntax src in))))
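The effect of the read-decimal-as-inexact parameter can be tried outside of any reader submodule. The sketch below reuses the my-read-syntax definition on a string port; the helper read-one and the source name demo are assumptions made for the example.

```racket
#lang racket

;; Same wrapper as in the reader submodule.
(define (my-read-syntax src in)
  (parameterize ([read-decimal-as-inexact #f])
    (read-syntax src in)))

;; Hypothetical helper: read one datum from a string with the wrapper.
(define (read-one str)
  (syntax->datum (my-read-syntax 'demo (open-input-string str))))

;; With the parameter set to #f, a decimal is read as an exact rational.
(define exact-tenth (read-one "0.1"))

;; With the default setting, the same text is read as a flonum.
(define default-tenth (read (open-input-string "0.1")))

(list (exact? exact-tenth) (inexact? default-tenth))
```

Arithmetic on exact-tenth stays exact, so adding it to itself three times gives exactly 3/10 rather than a floating-point approximation.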
With that change, then
will show an exact result instead of a floating-point approximation.
Exercise 37. Some users of #lang noisy may miss DOS-style comments using REM. Adjust the reader so that it detects and discards an REM result, discarding the rest of the line as well, and then tries reading again. Use syntax? to detect a non-EOF result from read-syntax, and use read-line to consume (the rest of) a line from an input stream.
If you want to construct languages, take a look at Matthew Butterick’s book on building Beautiful Racket.
Matthew Butterick and Alex Knauth constructed a "meta-language"—
Dan Feltey et al. describe how to re-create a mini version of Java, including an IDE in the Racket world
Vincent St-Amour et al. invent and implement a language for describing Lindenmayer fractals, a paper with lots of amazing pictures, some code, and even less text
Leif Andersen et al. illustrate the language-oriented programming idea with a small, yet reasonably complex example involving eight embedded DSLs