5 Building a Language
5.1 From Macros to Languages
Definition = (define-function (Variable Variable1 ...) Expression)
Expression = (function-application Variable Expression ...)
           | (if Expression Expression Expression)
           | (+ Expression Expression)
           | Variable
           | Number
           | String
Program = Definition-or-Expression ...
Definition-or-Expression = Definition
                         | Expression
5.2 Modules and #lang
The #lang that starts a Racket-program file determines what the rest of the file means. Specifically, the identifier immediately after #lang selects an interpretation for the rest of the file. The only constraint on that interpretation is that it defines a Racket module that can be referenced using the file’s path.
Somehow, the characters of a module file get converted to a pile of machine code with certain addresses designated as entry points to implement functions defined by the module. Clearly, we don’t want to think about the long road from characters to machine code every time we write a Racket program or even a Racket language.
From our perspective as the implementer of algebra, the next layer down is a syntax-object representation of a module. The program
"example.rkt"
#lang algebra
(define-function (f x) (+ x 1))
(function-application f 2)
will translate to roughly
(module example ....
  (#%module-begin
    (define-syntax (f stx) ....)
    ((lambda (x) (+ x 1)) 2)))
Before explaining more about that module form, note a difference in intent between the two chunks of text above that show programs. In the first case, the parentheses are meant as actual parenthesis characters that reside in a file. In the second case, the parentheses are just a way to write a text representation of the actual value, which is a syntax object that contains a list of syntax objects that contain symbols, and so on. Someone has to parse the parentheses in the first block of code.
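As a minimal sketch of that parsing step, the following uses Racket's read-syntax to turn parenthesis characters into a syntax object. The string stands in for file content, and the source name "example.rkt" is only a label for source locations; no such file needs to exist.

```racket
#lang racket/base
;; Sketch: convert the characters of a program into a syntax object.

;; The characters, as they would reside in a file.
(define in (open-input-string "(function-application f 2)"))

;; The reader parses the characters into a syntax object.
(define stx (read-syntax "example.rkt" in))

;; Stripping the syntax information leaves plain lists, symbols, and numbers.
(define datum (syntax->datum stx))

(syntax? stx)                    ; the result is a syntax object, not text
datum                            ; the plain S-expression it represents
(map syntax? (syntax->list stx)) ; one layer down: a list of syntax objects
```

The last expression illustrates the nesting described above: the outer syntax object contains a list whose elements are themselves syntax objects.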
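The #lang s-exp language does that parsing: it reads the rest of the file as parenthesized forms and uses the module named after s-exp as the initial import. A file that starts with #lang s-exp "algebra.rkt" and contains the same two forms is parsed as
(module example "algebra.rkt"
  (#%module-begin
    (define-function (f x) (+ x 1))
    (function-application f 2)))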
Without creating an "algebra.rkt" file, copy the #lang s-exp "algebra.rkt" example into DrRacket and click the Macro Stepper button. The stepper will immediately report an error, since there’s no "algebra.rkt" module, but it will show you the parsed form.
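The parsed form can also be inspected programmatically. The sketch below reads a whole #lang program from a string and looks at the module form that the reader produces. It uses racket/base in place of the algebra language only because that reader is always installed; the program text and variable names are illustrative assumptions.

```racket
#lang racket/base
;; Sketch: read a "#lang" program and inspect the resulting module form.

;; The characters of a small program; racket/base is a stand-in language.
(define program "#lang racket/base\n(+ 1 2)\n")

(define module-stx
  ;; read-syntax accepts #lang only when read-accept-reader is enabled.
  (parameterize ([read-accept-reader #t])
    (read-syntax "example.rkt" (open-input-string program))))

(define module-datum (syntax->datum module-stx))

(car module-datum)    ; the symbol module
(caddr module-datum)  ; the initial import selected by #lang
module-datum          ; the full parsed form, before any macro expansion
```

No expansion happens here: the reader only produces the module form, which is the same stage that the Macro Stepper shows before it reports the missing module.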
5.3 The Core module Form
Module = (module name initial-import-module
           (#%module-begin
             form ...))
For a module that comes from a file, the name turns out to be ignored, because the file path acts as the actual module name. The key part is initial-import-module. The module named by initial-import-module gives meaning to some set of identifiers that can be used in the module body. There are absolutely no pre-defined identifiers for the body of a module. Even things like lambda or #%module-begin must be exported by initial-import-module if they are going to be used in the module body’s forms.
If require is provided by initial-import-module, then it can be used to pull in additional names for use by forms. If there’s no way to get at require, define, or other binding forms from the exports of initial-import-module, then nothing but the exports of initial-import-module will ever be available to the forms.
Since every module form has an explicit or implicit #%module-begin, initial-import-module had better provide #%module-begin. If a language should allow the same sort of definition-or-expression sequence as racket, then it can just re-export #%module-begin from racket. As we will see, there are some other implicit forms, all of which start with #%, and initial-import-module must provide those forms if they’re going to be triggered.
"simple.rkt"
#lang racket
(provide #%module-begin)
(module whatever "simple.rkt" (#%module-begin))
5.4 Implicit Forms
Besides #%module-begin, there are four other implicit forms that will be relevant to #lang algebra: #%datum, #%app, #%top, and #%top-interaction.
5.4.1 #%datum
The #%datum form is implicitly wrapped around a literal value like 0, #true, or "apple" when it appears in a place where an expression is expected.
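To make the implicit wrapper visible, this small sketch writes #%datum out by hand in plain racket/base, whose #%datum simply quotes the literal. The variable names are illustrative.

```racket
#lang racket/base
;; Sketch: a literal and its explicitly wrapped form mean the same thing
;; in racket/base.

(define implicit 5)              ; the expander adds #%datum around 5
(define explicit (#%datum . 5))  ; the same form, written out by hand

(equal? implicit explicit)
(equal? "apple" (#%datum . "apple"))
```

Because the wrapper is an ordinary binding, a language can export its own macro under the name #%datum to restrict or reinterpret literals.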
"arith.rkt"
#lang racket
(require (for-syntax syntax/parse))

(provide #%module-begin
         (rename-out [number-datum #%datum])
         +)

(define-syntax (number-datum stx)
  (syntax-parse stx
    [(_ . v:number) #'(#%datum . v)]
    [(_ . other) (raise-syntax-error #f "not allowed" #'other)]))
5.4.2 #%app
The #%app form is implicitly added to a parenthesized expression that appears in a place where an expression is expected and where the first item in the parentheses is not an identifier that is defined as a macro.
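As with #%datum, the implicit form can be written explicitly. A minimal sketch in plain racket/base, with illustrative variable names:

```racket
#lang racket/base
;; Sketch: function application with and without an explicit #%app.

;; + is bound as a function, not a macro, so the expander inserts #%app.
(define implicit (+ 1 2))

;; The same application with the wrapper written out by hand.
(define explicit (#%app + 1 2))

;; if is a syntactic form, so no #%app is added around this expression.
(define no-app (if #t 'yes 'no))

(equal? implicit explicit)
```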
"arith.rkt"
#lang racket
(require (for-syntax syntax/parse))

(provide #%module-begin
         (rename-out [number-datum #%datum]
                     [plus +]))

(define-syntax (number-datum stx)
  (syntax-parse stx
    [(_ . v:number) #'(#%datum . v)]
    [(_ . other) (raise-syntax-error #f "not allowed" #'other)]))

(define-syntax (plus stx)
  (syntax-parse stx
    [(_ n1 n2) #'(+ n1 n2)]))
"arith.rkt"
#lang racket
(require (for-syntax syntax/parse))

(provide #%module-begin
         (rename-out [number-datum #%datum]
                     [plus +]
                     [complain-app #%app]))

(define-syntax (number-datum stx)
  (syntax-parse stx
    [(_ . v:number) #'(#%datum . v)]
    [(_ . other) (raise-syntax-error #f "not allowed" #'other)]))

(define-syntax (plus stx)
  (syntax-parse stx
    [(_ n1 n2) #'(+ n1 n2)]))

(define-syntax (complain-app stx)
  (define (complain msg src-stx)
    (raise-syntax-error 'parentheses msg src-stx))
  (define without-app-stx
    (syntax-parse stx
      [(_ e ...) (syntax/loc stx (e ...))]))
  (syntax-parse stx
    [(_)
     (complain "empty parentheses are not allowed" without-app-stx)]
    [(_ n:number)
     (complain "extra parentheses are not allowed around numbers" #'n)]
    [(_ x:id _ ...)
     (complain "unknown operator" #'x)]
    [_
     (complain "something is wrong here" without-app-stx)]))
5.4.3 #%top
(define-syntax (complain-top stx)
  (syntax-parse stx
    [(_ . x:id) (raise-syntax-error 'variable "unknown" #'x)]))
5.4.4 #%top-interaction
Finally, you may have noticed that when you run any of the working programs with "arith.rkt", DrRacket reports “Interactions disabled: language does not support a REPL (no #%top-interaction).”
#lang racket
(require (for-syntax syntax/parse))

(provide #%module-begin
         (rename-out [number-datum #%datum]
                     [plus +]
                     [unwrap #%top-interaction]))

....

(define-syntax (unwrap stx)
  (syntax-parse stx
    [(_ . e) #'e]))