Compiling

$Revision: 5.0.2.9 $

The document introduction.htm provides an overview of the Allegro CL documentation with links to all major documents. The document index.htm is an index with pointers to every documented object (operators, variables, etc.) The revision number of this document is below the title. These documents may be revised from time to time between releases.

1.0 Lisp images without a compiler
2.0 Compile-file
3.0 No-in-package warning
4.0 Undefined function warning
5.0 EOF encountered error
6.0 Variables which control information printed by compile-file
7.0 File types
8.0 Declarations and optimizations

8.1 Examining the current settings and their effect
8.2 Declared fixnums example
8.3 Argument count checking example
8.4 Argument type for a specialized function example
8.5 Bound symbol example
8.6 Tail merging discussion
8.7 Changing compiler optimization switch settings
8.8 Switch can be t or nil meaning always on or always off

9.0 Pointers for choosing speed and safety values

9.1 Optimizing for fast floating-point operations
9.2 Supported operations
9.3 Help with declarations
9.4 Explain types and calls
9.5 Explain boxing
9.6 Explain variables

10.0 Other declarations and optimizations

10.1 Inline declarations ignored
10.2 Defstruct accessors
10.3 Stack consing, avoiding consing using apply, and stack allocation
10.4 Adding typep-transformers
10.5 Compiler handling of top-level forms

The Common Lisp standard requires that a Common Lisp system have a compiler and a loader. Since running interpreted code in any Common Lisp system is slow compared to running compiled code, most users will want to compile functions after trying them out in interpreted mode. A compiled function will run many times faster than its interpreted counterpart.

There are several implementation-dependent facets to a Common Lisp compiler. These include the naming of source and object files, how declarations are handled, and how floating-point and other optimizations are performed. These issues are discussed in this document.

1.0 Lisp images without a compiler

Building an image with build-lisp-image (see building_images.htm) with include-compilernil (or include-compiler true and discard-compiler also true) results in a compilerless Lisp image.

It is possible to load the compiler into a compilerless image. You do so by evaluating (require :compiler). However, it may be difficult or impossible to compile everything that would have been compiled had the compiler been present earlier (difficult or impossible because you do not know what the things are). Therefore, parts of Lisp will run slower than necessary after the compiler is loaded with no space saving (since you loaded the compiler). In short, if you intend to use the compiler, start with an image that includes it!

But note that you can build an image with the compiler included during the build and discarded after the build completes. This might be very useful when creating an application for distribution when you are not licensed to distribute an image with the compiler. If you specify :include-compiler t and :discard-compiler t in a call to build-lisp-image or generate-application (which accepts build-lisp-image arguments), the compiler will be present while the image is built and discarded at the end. Thus, things like applying advice (see advice.htm) can be done and compiled during the build.

If you use a compilerless Lisp, the system will (in the default) print a warning every time you or the system tries to call compile. For example, consider the following script from a compilerless Lisp image:

USER(1): (defun foo nil nil)
FOO
USER(2): (compile 'foo)
Warning: Compiler not loaded -- function left uncompiled
NIL
USER(3):

The text of the warning changes according to what the system was unable to compile. We emphasized that compile can be called by the system since some operations, including calling excl:advise under certain circumstances, calling the function ff:defforeign (but not calling the macro ff:def-foreign-call) in a fasl file, and doing non-standard method combinations in CLOS or MOP usage, do cause the compiler to be called and the warning to be printed. It is possible to suppress these warnings, either globally or locally. The variable excl:*compiler-not-available-warning* controls whether the warning is printed or not.

We do not recommend setting excl:*compiler-not-available-warning* to nil because interpreted code will run significantly more slowly than compiled and users should know when the compiler was called so they can expect degraded performance.

It is possible to suppress the warning locally. The warnings actually signal the condition excl:compiler-not-available-warning. Thus, it is possible to suppress these warning by wrapping the following code around sections likely to produce such unwanted warnings:

(handler-bind ((excl:compiler-not-available-warning
    #'(lambda (c) (declare (ignore c)) (muffle-warning))))
         <code-which--may-directly-or-indirectly-calls-compile>)

Note again that it is possible to build an image with the compiler included during the build and discarded at the end. Things processed during the build will be processed with the compiler but the final image will not have it (and thus suitable for delivery when the license does not permit distribution of images containing the compiler, such as standard runtime images, see runtime.htm). See the description of the include-compiler and discard-compiler arguments to build-lisp-image in building_images.htm.

2.0 Compile-file

The Common Lisp function compile-file is used to compile a file containing Common Lisp source code. We describe that function here to point out some Allegro-CL-specific arguments.

cl:compile-file

Arguments: input-filename &key output-file xref verbose print fasl-circle

This function returns three values. The first value is the truename of the output file, or nil if the file could not be created. The second value is nil if no compiler diagnostics were issued, and true otherwise. The third value is nil if no compiler diagnostics other than style warnings were issued. If the third returned value is non-nil, there were serious compiler diagnostics issued or the other conditions of type error or type warning were signalled during compilation.

This function will compile the Common Lisp forms in input-filename. input-filename must be a pathname, a string, or a stream (earlier versions of Allegro CL accepted a symbol as the filename argument but this is now explicitly forbidden by the ANSI standard).

input-filename is merged with cl:*default-pathname-defaults*. If input-filename name has no type, then compile-file uses the types in the list sys:*source-file-types*. The name of the output file may be specified by the output-file keyword argument. The default type of the output file is specified by the variable excl:*fasl-default-type*.

The output file will have a text header (readable, e.g., by the Unix command head) which provides information such as the time of creation and the user who created the file.

The xref keyword argument is a boolean that controls whether cross-referencing information is stored in the output file. The default value of the xref keyword argument is the value of excl:*record-xref-info*.

The verbose keyword argument is a boolean that controls printing of the filename that is currently being compiled. The default value of the verbose keyword argument is the value of cl:*compile-verbose*.

The print keyword argument is a boolean that controls whether compile-file prints the top-level form currently being compiled. The default value of the print keyword argument is the value of cl:*compile-print*.

The fasl-circle keyword argument is an Allegro CL extension (i.e. may not be supported in versions of Common Lisp other than Allegro CL). If the value of this argument is nil, there are two effects:

On the other hand, compilation speed is typically faster when this argument is nil. This argument defaults to the value of excl:*fasl-circle-default*.

The :cf top-level command invokes compile-file. :cf does not accept keyword arguments but does accept multiple input-filename arguments. :cf reads its arguments as strings, so you should not surround input-filenames with double quotes. Thus

:cf foo.cl

is equivalent to

(compile-file "foo.cl")

Certain versions of Allegro CL distributed as samples have compile-file disabled. In such versions, the function excl:load-compiled, whose arguments are similar to compile-file, will compile a file and load it but not write out a fasl (compiled) file.

3.0 No-in-package warning

If a file does not contain an in-package form, compile-file will compile the file (using the value of *package* as the package) and will also print a warning. For example, suppose the file test.cl contains the following:

;; file test.cl
(defun foo (x) (* 2 x))
(defun bar (y) (foo y))

Here is what happens when we compile this file:

USER(31): (compile-file "test.cl")
   ; --- Compiling file /net/rubix/usr/tech/dm/test.cl ---
   ; Compiling FOO
   ; Compiling BAR
   ; Writing fasl file "/net/rubix/usr/tech/dm/test.fasl"
  Warning: No IN-PACKAGE form seen in /net/rubix/usr/tech/dm/test.cl (Allegro
  Presto is ineffective when loading files that have no IN-PACKAGE form).
  ; Fasl write complete
USER(32):

The warning signaled is of type excl:compiler-no-in-package-warning. You can suppress the warning with handler-bind. For example:

(defun my-cf-no-warn (&rest args)
   (handler-bind ((excl:compiler-no-in-package-warning
      #'(lambda (c) (declare (ignore c)) (muffle-warning))))
           (apply #'compile-file args)))

4.0 Undefined function warning

If during the compilation of a file, the compiler notices a call to a function not defined (previously or in the file), it prints a warning that the function is called but not yet defined. (It is not an error to compile a function that calls a not-yet-defined function, of course. The warning is often useful, however, when the function name is typed incorrectly or the user believes a function is defined when it is not.) Here is an example:

;; The file foo.cl contains the following. Note the call to BAR.
(in-package :user)
(defun foo (x) (bar x))
 
;; Here is what happens when we compile the file:
USER(1): (compile-file "foo.cl")
  ; --- Compiling file /net/rubix/usr/tech/dm/foo.cl ---
  ; Compiling FOO
  ; Writing fasl file "/net/rubix/usr/tech/dm/foo.fasl"
  ; Fasl write complete
  Warning: The following undefined functions were referenced in the compilation:
   (BAR)
#p"/net/rubix/usr/tech/dm/foo.fasl"
NIL
NIL
USER(2):

The warning signaled is of type excl:compiler-undefined-functions-called-warning.

If multiple calls to compile-file are part of the body of a cl:with-compilation-unit form, no undefined function warning will be signaled when a function called in one file is defined in a later file.

5.0 EOF encountered error

If you try to compile a file that contains an incomplete form (because of, e.g., a missing closing parenthesis), compile-file signals an error with condition end-of-file. Consider the following file missing-paren.cl:

(defun foo nil nil)
(defun bar (a b) (+ a b)

The closing parenthesis is missing from the definition of bar. When Lisp tries to compile this file, it signals an error:

USER(3): :cf missing-paren.cl
  ; --- Compiling file /net/rubix/usr/tech/dm/missing-paren.cl ---
  ; Compiling FOO
  Error: eof encountered on stream"
  #<EXCL::CHARACTER-INPUT-FILE-STREAM
  #p"/net/rubix/usr/tech/dm/missing-paren.cl" pos 45 @ #xa99c12>
  starting at position 20.
  [condition type: END-OF-FILE]
 
Restart actions (select using :continue):
  0: retry the compilation of /net/rubix/usr/tech/dm/missing-paren.cl
  1: continue compiling /net/rubix/usr/tech/dm/missing-paren.cl but generate no
     output file
[1] USER(4):

Note the line:

starting at position 20.

That indicates that the incomplete form starts at position 20 in the file. Opening the file in Emacs and entering the command C-u 20 C-f (assuming standard keybindings, that means move forward 20 characters) should bring you to the beginning of the incomplete form.

6.0 Variables which control information printed by compile-file

The variables cl:*compile-print* and cl:*compile-verbose* provide the defaults for the print and verbose arguments to compile-file. Those arguments control how much information compile-file will print out. These variable are a standard part of common lisp.

7.0 File types

The default source-file type in Allegro CL is cl. The default compiled-file type in Allegro CL is fasl, which is a mnemonic for fast-loadable file. The default file types may be changed to suit your preferences.

The variable sys:*source-file-types* is a list of pathname types of Common Lisp source files. The initial value of this variable is the list

("cl" "lisp" "lsp" nil)

This means that if no file type is specified for the argument of compile-file (or the top-level command :cf), files with types cl, lisp, lsp, and no type will be looked for. For example

(compile-file "foo")

will cause the compiler to first look for the file foo.cl, then for foo.lisp, the foo.lsp, and finally foo. Users may want to change the value of sys:*source-file-types*. Some prefer not to allow files with no type to be looked at, since this prevents Lisp from trying to compile inappropriate files, or even a directory. Evaluating the form

(setq sys:*source-file-types* '("cl" "lisp" "lsp"))

will cause the compiler to look for files with file type cl, lisp and lsp. Then

(compile-file "foo")

will look for foo.cl, foo.lisp, and foo.lsp but not foo. The first file found will be the one compiled.

If you change sys:*source-file-types*, you may also wish to change sys:*load-search-list* and sys:*require-search-list* so that the functions load and require will look for files with the desired file types as well. See loading,htm above for a description of these variables.

When a file is compiled, a new file containing the compiled Lisp code is created. This file will have the type specified by the variable excl:*fasl-default-type*. The initial value of this variable is "fasl". You may change its value to any string you wish. If you change the value of this variable, you should also modify the load and require search lists so those functions will find the correct files. Throughout the documentation, files containing compiled Lisp code are called fasl files and are assumed to have a file type of fasl. You should understand that fasl really denotes the value of excl:*fasl-default-type*.

8.0 Declarations and optimizations

Compiling Lisp code involves many compromises. Achieving very high speed generally involves sacrificing graceful recovery from errors and even the detection of errors. The latter may cause serious problems such as wrong answers or errors that result from the propagation of the original undiscovered error. On the other hand, interpreted code is very easy to debug but may be too slow for practical applications. Fortunately, most program needs can be satisfied on some middle ground. The software development cycle generally begins with interpreted code and ends with well-optimized code. Progressing through this cycle, higher speed is achieved without significant loss of confidence because of the increase in the correctness and robustness of the Lisp code as development proceeds. (It is important to note that optimizations do not affect the behavior of correct code with expected inputs. However, it is not always easy to prove that the inputs will always be what they are expected to be or that a complex program is indeed `correct.') This section provides enough information so that a programmer may make intelligent decisions about performance compromises. 9.0 Pointers for choosing speed and safety values further discusses the issue.

Among specific trade-offs in compiling code are the verification of the number and types of arguments passed to functions, providing adequate information for debugging, and including code to detect interrupts. The Allegro CL compiler has been designed to be highly parameterized. Optimizations performed by the compiler may be controlled to a significant degree by the user. The compiler of course supports the primitive Common Lisp safety, space, speed, and debug optimization declarations. (Allegro CL accepts integer values between 0 and 3 inclusive, higher values representing greater importance to the corresponding aspect of code generation.) More significantly, Allegro CL provides a number of optimization switches that control specific aspects of code generation. These switches are in the form of variables that are bound to functions of the four optimization parameters safety, space, speed, and debug.

The initial values of the optimization qualities are set when Allegro CL is built, controled by the build-lisp-image arguments :opt-safety, :opt-space, :opt-speed, and :opt-debug (see building_images.htm#6.0 Arguments to build-lisp-image 2: defaults not inherited from the running image).

The default values for safety, space, and speed in an image created by build-lisp-image and in prebuilt images supplied in the Allegro CL distribution is 1. The default value for debug is 2.

You can set the values of the four qualities in various ways. One way is globally, with cl:proclaim as follows:

(proclaim '(optimize (safety n1)
                     (space n2)
                     (speed n3)
                     (debug n4)))

where n1, n2, n3, and n4 are integers from 0 to 3. The values may also be set with the top-level command :optimize. This command also displays the current values of the qualities (there is no portable way to access those values in Common Lisp).

8.1 Examining the current settings and their effect

The function excl:explain-compiler-settings (which takes no arguments and prints to *standard-output*) displays information about the current (global) values of the optimization qualities, including the value that will be returned by each compiler switch if called with the current settings of the optimization qualities.

The variables discussed in the remainder of this section specify what the various settings of safety, space, speed, and debug do. Their values are functions that return t or nil for the given settings of safety, space, speed, and debug. You may change the definitions by binding (or setting) the variables to new functions.

You can also set (or bind) a variable to t or nil and this will be interpreted as a function that always returns t or nil (respectively) meaning the associated switch is always on (t) or off (nil). Any new function used should accept four arguments. The system calls the function with the (lexically) current values of safety, space, speed and debug.

Following the table describing the switches, we give examples of erroneous code run interpreted (which produces the same error behavior as running with the switches at their safest setting) and run compiled with the switches at their least safe setting. What you will notice in the latter case is that erroneous code either results in a less informative error or perhaps in a wrong answer but no error. These examples are not intended to deter you from using compiler optimizations. Rather, we want to make you aware of the dangers of less safe code and show what error messages to expect when the switches are at their high speed settings. All switches are named by symbols in the compiler package.

Switch Description Initial value true when:
compile-format-strings-switch When true, a format string (the second required argument to format) is converted to a tree structure at compile time. If nil, the conversion is done at run time, resulting in slower printing but smaller code. True when speed is greater than space.
declared-fixnums-remain-fixnums-switch If true, compiler will generate code that assumes the sum and difference of declared fixnums are fixnums.

Warning: if this switch returns true during compilation but the sum or difference of two declared fixnums is not a fixnum, the compiled code will silently produce erroneous results.

See 8.2 Declared fixnums example below.

True only if speed is 3 and safety is 0.
generate-inline-call-tests-switch This switch is ignored in release 5.0.1 and later. It was used in earlier erleases to optimize certain flavors:send macro calls but that is now done differently. (Not applicable.)
generate-interrupt-checks-switch If true, code is added at the beginning of the code vector for and at the end of a loop in a compiled function that checks for an interrupt (Control-C on Unix, Break/Pause on Windows).

Note that an infinite loop that does not call functions will not be interruptable except by multiple Control-C's (Unix) and using the Franz icon on the system tray (Windows). See startup.htm for more information on interrupting when all else fails.

True only if speed is 3 and safety is 0.
internal-optimize-switch When true, the register-linking internal optimization is performed, resulting in faster but harder to debug code. True when speed is greater than 2 or debug less than 3.
optimize-fslot-value-switch When true, calls to ff:fslot-value-typed will be opencoded if possible. See fftype.htm True when speed is greater than safety.
peephole-optimize-switch When true, the compiler performs peephole optimization. Peephole optimizations include removing redundant instructions, such as a jump to the immediately following location. True when speed is greater than 0.
save-arglist-switch If true, argument lists for compiled functions and macros will be stored (and available to tools like excl:arglist). True when debug is greater than 0.
save-local-names-switch If true, the names of local variables will be saved in compiled code. This makes debugging easier. True when debug is greater than 1.
save-local-scopes-switch If true, information about when local variables are alive is saved in compiled code, making debugging more easy. True only when debug is 3.
tail-call-self-merge-switch If true, compiler will perform self-tail-merging (for functions which call themselves). See 8.6 Tail merging discussion below for more information on tail-merging. True if speed is greater than 0.
tail-call-non-self-merge-switch If true, compiler will perform non-self-tail-merging (for functions in the tail position different than the function being executed). See 8.6 Tail merging discussion below for more information on tail-merging. True if speed is greater than 1 and debug less than 2. (This is more restrictive than tail-call-self-merge-switch because references to the caller function are off the stack in this case but at least one call in on the stack in the self-merge case.)
trust-declarations-switch If true, the compiler will trust declarations in code (other than dynamic-extent declarations -- see next entry) and produce code (when it can) that is optimized given the declarations. These declarations typically specify the type of values of variables. If nil, declarations will be ignored – except (declare notinline) and (declare special) which are always complied with. True is speed is greater than safety.
trust-dynamic-extent-declarations-switch If true, the compiler will trust dynamic-extent declarations in code and produce code (when it can) that is optimized given the declarations. This switch is separated from the general trust-declarations-switch because when using multiprocessing, trusting dynamic-extent declarations may be unsafe when trusting other declarations is safe. True is speed is greater than safety.
verify-argument-count-switch If true, the compiler will add code that checks that the correct number of arguments are passed to a function. True if speed is less than 3 or safety is greater than 0.
verify-car-cdr-switch If true, code calling car and cdr will check that the argument is appropriate (i.e. a cons). The switch is only effective on platforms which have :verify-car-cdr on the cl:*features* list. Platforms lacking that feature ignore this switch since the verification is done differently but always. True if speed is less than 3 or safety is greater than 1.
verfiy-non-generic-switch If true, code is generated that an object of undeclared type is of the correct type when it appears as the argument to specialized functions like svref and rplaca. Thus the argument to svref must be a simple vector and if this switch is true, code will check that (and give a meaningful error message). See 8.4 Argument type for a specialized function for an example. Note that generic has nothing to do with (CLOS) generic functions. The name way predates CLOS. True for speed less than 3 or safety greater than 1.
verify-symbol-is-bound-switch If true, code will be added to ensure that a symbol is bound before the value of that symbol is used. The result (as shown in 8.5 Bound symbol example below) is an error with an appropriate message will be signaled in compiled with this switch true. In code compiled with this switch nil, behavior is undefined but it may be that no error is signaled. True if speed is less than 3 or safety is greater than 1.

8.2 Declared fixnums example

The following code illustrates the effect of this switch. We define a function that simply adds its two arguments and contains declarations that both the arguments are fixnums. When compiled with this switch returning nil, the function correctly returns the bignum resulting from the arguments in the example. When compiled with this switch returning t, the function returns a wrong answer.

USER(28): (defun frf (n m) (declare (fixnum n) (fixnum m))
              (+ n m))
FRF
USER(29): (setq bn (- most-positive-fixnum 20))
536870891 ;; 20 less than most-positive-fixnum
USER(30): (frf bn 50000)
536920891 ;; This value is a bignum
USER(31): (proclaim '(optimize (safety 0) (speed 3)))
T
USER(32): (compile 'frf)
FRF
NIL
NIL
USER(33): (frf bn 50000)
-268385477 ;; wrong answer.
USER(34):

8.3 Argument count checking example

Very serious errors can occur when this switch returns nil and the wrong number of arguments are passed. These errors are not necessarily repeatable. We give a simple example where you get a less than useful error message, but you should be aware that much more serious (possibly fatal) errors can result from code where the number of arguments are not checked. About 3 instructions (the exact number is platform dependent and ranges from 1 to 4) are added to a function call when this switch returns t.

USER(1): (defun foo (a b c) (+ a c))
FOO
USER(2): (foo 1 2)
  Error: FOO got 2 args, wanted at least 3.
  [condition type: PROGRAM-ERROR]
[1] USER(3): :pop
USER(4): (proclaim '(optimize (speed 3) (safety 0)))

T
USER(5): (compile 'foo)
  ; While compiling FOO:
  Warning: variable B is never used
FOO
T
T
USER(6): (foo 1)
#<unknown object of type number 4 @ #x8666c>
USER(7):

Note that no error is signaled. Note further that it is possible for a fatal garbage collection error to result from passing the wrong number of arguments.

8.4 Argument type for a specialized function example

The following examples show what happens when the switch returns t and when it returns nil.

USER(39): (defun foo (vec n) (svref vec n))
FOO
USER(40): (setq v (list 'a 'b 'c 'd 'e))
(A B C D E)
USER(41): (foo v 3)
  Error: Illegal vector object passed to svref: (A B C D E)
[1] USER(42): :pop
USER(43): (proclaim '(optimize (speed 3) (safety 0)))

T
USER(44): (compile 'foo)
FOO
NIL
NIL
USER(45): (foo v 3)
  Error: Received signal number 10 (Bus error)
  [condition type: SIMPLE-ERROR]
[1] USER(46):

8.5 Bound symbol example

The following example shows what happens if this switch returns nil. In that case, the symbol-value location is simply read. If the symbol is in fact unbound, an apparently valid but in fact bogus value may be obtained. In the example, that bogus value is passed to the function + and an error is signaled because it is not a valid argument to +. However, it may be that no error will be signaled and computation will continue, resulting in later confusing or uninterpretable errors or in invalid results.

USER(60): (defun foo (n) (+ n bar))
FOO
USER(61): (foo 1)
  Error: Attempt to take the value of the unbound variable `BAR'.
  [condition type: UNBOUND-VARIABLE]
[1] USER(62): :pop
USER(63): (proclaim '(optimize (speed 3) (safety 0)))

T
USER(64): (compile 'foo)
  ; While compiling FOO:
  Warning: Symbol BAR declared special
FOO
T
T
USER(65): (foo 1)
  Error: (NIL) is an illegal argument to +
  [condition type: TYPE-ERROR]
[1] USER(66):

8.6 Tail merging discussion

Consider the following function definition:

(defun foo (lis)
  (pprint lis)
  (list-length lis))

When you call foo with a list as an argument, the list is pretty printed and its length is returned. But note that by the time list-length is called, no more information about foo is needed but, in the interpreter at least, the call to foo remains on the stack. The compiler can tail merge foo in such a way that the call to list-length is changed to a jump. In that case, when list-length is reached the stack looks as if list-length was called directly and foo was never called at all. The side effects of calling foo, in this case, pretty printing the list passed as an argument, have all occurred by the time list-length is called.

Unwrapping the stack in this fashion is a benefit because it saves stack space and can (under the correct circumstances) avoid creating some stack frames all together. However, it can make debugging harder (because the stack backtrace printed by :zoom will not reflect the actual sequence of function calls, but see the discussion of ghost frames in debugging.htm) and it can skew profiling data (because, looking at our example, a sample taken after list-length is called will not charge time or space to foo because foo is off the stack).

The two switches comp:tail-call-self-merge-switch and comp:tail-call-non-self-merge-switch control tail merging.

8.7 Changing compiler optimization switch settings

The user may change the code which determines how any of these variables behaves by redefining the function which is the value of the variable. That function must take as arguments safety, space, speed, and debug. It is important that the function be compiled since it may be called many times during a compilation and an interpreted function will slow down the compiler. Here is the general form which modifies a switch variable. var identifies the switch variable you wish to change.

(setq var
     (compile nil '(lambda (safety space speed debug)
      
form-1
       ...
       form-n)))

where form-n returns nil or t as a function of safety, space, speed, and debug.

Note that we wrapped the function definition with compile. Note too that such a form will not affect the compilation of the file in which it is itself compiled (since the variable will not be redefined in time to affect the compile-file).

For example, if the following code is evaluated at the top-level or in an initialization file, the compiler will not save local scopes if speed is greater than 1 or if debug is less than 2 or if space is greater than 1. The value of safety has no effect on the switch.

(setq compiler:save-local-scopes-switch
  (compile nil '(lambda (safety space speed debug)
                  (declare (ignore debug))
                  (cond ((> speed 1) nil)
                        ((< debug 2) nil)
                        ((> space 1) nil)
                        (t t)))))

The initial values of safety and speed are both set during image build (with excl:build-lisp-image) using the opt-speed, opt-safety, opt-space, and opt-debug arguments. See building_images.htm.

8.8 Switch can be t or nil meaning always on or always off

The value of a compiler switch can be t or nil as well as a function. t is interpreted as a function that always return true and so causes the switch to always be on. nil is interpreted as a function that always returns false and so causes the switch to always be off. These settings are particularly useful when binding the variables during a specific compilation.

9.0 Pointers for choosing speed and safety values

What values should you choose for speed and safety? It is tempting to set speed to 3 and safety to 0 so that compiled code will run as fast as possible, and devil take the hindmost. Our experience is that people who do this say that Allegro CL is fast but lacks robustness, while people who use the more conservative default settings of 1 and 1 feel that Allegro CL is very robust but occasionally seems a bit sluggish. Which you prefer is for you to decide. The following points should be considered.

  1. If you are going to set speed globally to 3, we strongly discourage you from also globally setting safety to 0 rather than 1. There are only 3 differences between the two safety settings: at safety 0, (1) argument count checking is disabled; (2) interrupt checking is disabled; and (3) sums and differences of fixnums are assumed to be fixnums. Thus you may not be able cleanly to break out of an infinite loop; passing the wrong number of arguments may cause unrepeatable, possibly fatal errors; and adding fixnums whose sum is a bignum will silently produce the wrong answer. We recommend that those switches should only be set in an unsafe manner when compiling functions where it is very unlikely that the code is incorrect. You can set safety to 0 when compiling such functions by use of a declaration within the defun form.
  2. If strange errors occur whose cause cannot be discovered, recompile the code at a more safe setting of speed and safety and run it again. An error will usually occur (not necessarily in the same place), the error message should be more informative, and the debugger should have more information. If no error occurs, it may be that you have an incorrect declaration (e.g. declaring a value to be a single-float when it is in fact something else, like nil). Look particularly at initial values, making sure the initial value is of the declared type.
  3. You might consider resetting the switch variables to t or nil (as appropriate) as that can ensure the switch has a known, unambiguous value unaffected by declarations within a function definition or anything else.

9.1 Optimizing for fast floating-point operations

The compiler in Allegro CL has the ability to compile floating-point operations in-line. In order to take full advantage of this feature, the compiler must know what can be done in-line. For it to know this, the user must, through declarations, inform the compiler of the type of arguments to operations. The compiler will attempt to propagate this type information to other operations, but this is not as easy as it might seem on first glance. Because mathematical operations in Common Lisp are generic, that is they accept arguments of any type - fixed, real, complex - and produce results of the appropriate type, the compiler cannot assume results in cases when the user may think the situation is clear. For example, the user may know that only positive numbers are being passed to sqrt, but unless the compiler knows it too, it will not assume the result is real and not complex. The compiler can tell the user what it does know and what it is doing. See Help with declarations below. With this information, the user can add declarations as needed to speed up the compiled code. The process should be viewed as interactive, with the user tuning code in response to what the compiler says it knows.

9.2 Supported operations

The compiler will expand in-line (open-code) a floating-point numeric function only if the function is one which it knows how to open-code, and the types of the function's arguments and result are known at compile time to be those for which the open-coder is appropriate. Finally, at the point of compilation the compiler switch function which is the value of comp:trust-declarations-switch must return true. The default version of this function returns true if speed is greater than safety but users can change the values for which this switch returns true. explain-compiler-settings can be called to see what the value of that switch (and all others) will be given the current values of safety, space, speed, and debug.

The floating-point functions below are subject to opencoding. In each case, the function result must be declared or derivable to be either single-float or double-float. (Note that in this implementation, the type short-float is equivalent to single-float and long-float is equivalent to double-float.) The arguments to the arithmetic and trigonometric functions must be specifically one or the other of the two floating types, or signed integer constants representable in 32 bits, or else computed integer values of fixnum range. If these conditions are not met, the function will not be open-coded and instead a normal call to the generic version of the function will be compiled. Note that it is not sufficient to declare a value to be a float, it must be declared as either a single-float or a double-float.

We are often asked exactly which functions will opencode in Allegro CL. Unfortunately, it is not easy to provide an answer. Firstly, because the compiler is different on each different architecture, the answer is different in different implementations. Secondly, the same function may opencode in one case but not in another on the same machine because of the way it is affected by surrounding code. However, the following functions are candidates to be open-coded in most platforms or the in the platforms noted.

Note that the list is not exhaustive in ether direction. Not all the listed functions and operations opencode on all machines and functions and operations not listed do opencode on some machines. The :explain declaration described in the next section assists with determining what did and did not opencode.

9.3 Help with declarations

The compiler must be supplied with type declarations if it is to open-code certain common operations without type-checking. The compiler includes a type propagator that tries to derive the results of a function call (or special operator) from what it knows about its arguments (or subforms). The type propagator greatly reduces the amount of type declaration necessary to achieve opencoding. However, the programmer may need to examine the inferences made by the propagator to determine what additional declarations would increase code speed. Although supplying type declarations to the compiler is very simple, it is a task surprisingly prone to error. The usual error is that well-intentioned declarations are insufficient to tell the compiler everything it needs to know.

For example, the programmer might neglect to declare the seemingly obvious fact that the result of a certain sqrt of a double-float is also a double-float. However, sqrt in Common Lisp returns a complex if its argument is negative, so the compiler cannot assume a real result. The only impact of insufficient declarations is that some open-coders will not be invoked. However, it can be awkward for the user to determine whether or not any particular call was open-coded, and if not, why. trace and disassemble will provide the information, but these are clumsy tools for the purpose.

In order to provide better information about what the compiler is doing, a new declaration, :explain, has been added to Allegro CL.

[Declaration]

:explain

Arguments: [:calls | :nocalls] [:types | :notypes]
[:boxing | :noboxing] [:variables | :novariables]

This declaration instructs the compiler to report information about argument types and non-in-line calls, boxed floats and variables stored in registers. This declaration may be placed anywhere that a normal declaration may be, and can be proclaimed. The compiler will report information within the scope of the declaration. Note that results may differ from one platform to another. Again, the general form of the declaration is:

(declare (:explain [:calls | :nocalls]
                   [:types | :notypes]
                   [:variables | :novariables]
                   [:boxing | :noboxing]))

The arguments control four kinds of reporting the compiler will make during compilation. The arguments may appear in either order, and either may be omitted. By default, no information of either type is printed. The declaration causes the compiler to print information during compilation, and obeys normal declaration scoping rules. It has no effect on code generation.

When the :types declaration is in effect, the compiler will report for each function call it compiles (in-line or not) the types of each argument along with the result type. When the :calls declaration is in effect, the compiler will report when it generates code for any non-in-line function call. These two facilities operate independently.

When the :boxing declaration is in effect, the compiler will tell you when code is generated to box a number if it is possible for the code not to be boxed. A number is `boxed' when it is converted from its machine representation to the Lisp representation. For floats, the machine representation is one (for singles) or two (for doubles) words. Lisp adds an extra word, which contain a pointer and a type code. For fixnums, boxing simply involves a left shift of two bits. For bignums which are in the range of machine integers, boxing again adds an additional word.

When the :variables declaration is in effect, the compiler will report whether local variables (in function definitions) are being stored in registers or in memory. Since storing variables in registers results in faster code, the information printed when this variable is in effect may help in recasting function definitions to allow for more locals to be stored in registers.

The reports are printed during the code generation phase of compilation. Code generation occurs fairly late during compilation, after macroexpansion and other code transformations have taken place. The function calls explained by the compiler will therefore deviate in certain ways from the original user code. For example, a two-operand + operation is transformed into a call to the more efficient excl::+_2op function. In general, the user should easily be able to relate the code generator''s output to the original code.

The code generator works by doing a depth-first walk of the transformed code tree it receives from early compiler phases. In this tree walk, the :types printout happens during the descent into a branch of the tree. The :calls printout happens as the instructions are actually generated, during the ascending return from the branch.

The reason :explain is implemented as a declaration is so the user can gain fairly fine-grained control of its scope. This can be important when tuning large functions. The tool does require editing the source code to be compiled, but presumably the user is in the process of editing the code to add declarations anyway. These options may also be enabled and disabled globally by use of cl:proclaim, although they may produce a lot of output.

9.4 Explain types and calls

The example below shows simple use of the :explain declaration. The user is working with two-dimensional arrays of 8-bit data returned by an image scanner, and writes a function that computes the average data value in an array.

USER(4): (defun 2d-avg (ar)
	   (declare (optimize (speed 3) (safety 0))
		    (type (array (unsigned-byte 8) (* *)) ar))
	   (let ((sum 0))
	     (declare (fixnum sum))
	     (dotimes (x 64)
	       (dotimes (y 64)
		 (declare (:explain :calls :types))
		 (incf sum (aref ar x y))))
	     (/ sum 4096)))
2D-AVG
USER(5): (compile '2d-avg)
;Examining a call to >=_2OP with arguments:
;  symeval Y type (INTEGER 0 64)
;  constant 64 type (INTEGER 64 64)
; which returns a value of type T
;Examining a call to +_2OP with arguments:
;  symeval SUM type (INTEGER -536870912 536870911)
;  call to AREF type (INTEGER 0 255)
; which returns a value of type (INTEGER -536870912 536870911)
;Examining a call to AREF with arguments:
;  symeval AR type (ARRAY (INTEGER 0 255) (* *))
;  symeval X type (INTEGER 0 64)
;  symeval Y type (INTEGER 0 64)
; which returns a value of type (INTEGER 0 255)
;Generating a non-inline  call to AREF
;Examining a call to +_2OP with arguments:
;  symeval Y type (INTEGER 0 64)
;  constant 1 type (INTEGER 1 1)
; which returns a value of type (INTEGER 0 64)
2D-AVG
NIL
NIL
USER(6): 

Note a couple of points: (1) whenever the system prints that it is examining a call to a function, that means the function is a candidate for opencoding; (2) if opencoding does not succeed, a message is printed saying that the system is generating a non-in-line call to the function. Therefore, if no generating... message appears after an examining message, the function was inlined. In the example, there is only one function which was examined but not inlined: a call to aref.

So why didn't aref opencode? Recall from the list of candidates in the last section that aref of a simple array can opencode. Here, the argument to aref is declared to be a (general) array. We change that declaration and try again.

USER(6): (defun 2d-avg (ar)
	   (declare (optimize (speed 3) (safety 0))
		    (type (simple-array 
                           (unsigned-byte 8) (* *)) ar))
	   (let ((sum 0))
	     (declare (fixnum sum))
	     (dotimes (x 64)
	       (dotimes (y 64)
		 (declare (:explain :calls :types))
		 (incf sum (aref ar x y))))
	     (/ sum 4096)))
2D-AVG
USER(7): (compile '2d-avg)
;Examining a call to >=_2OP with arguments:
;  symeval Y type (INTEGER 0 64)
;  constant 64 type (INTEGER 64 64)
; which returns a value of type T
;Examining a call to +_2OP with arguments:
;  symeval SUM type (INTEGER -536870912 536870911)
;  call to AREF type (INTEGER 0 255)
; which returns a value of type (INTEGER -536870912 536870911)
;Examining a call to AREF with arguments:
;  symeval AR type (SIMPLE-ARRAY (INTEGER 0 255) (* *))
;  symeval X type (INTEGER 0 64)
;  symeval Y type (INTEGER 0 64)
; which returns a value of type (INTEGER 0 255)
;Examining a call to +_2OP with arguments:
;  symeval Y type (INTEGER 0 64)
;  constant 1 type (INTEGER 1 1)
; which returns a value of type (INTEGER 0 64)
2D-AVG
NIL
NIL
USER(8): 

This time, success.

Note, by the way, the declaration (optimize (speed 3) (safety 1)). The comp:trust-declarations-switch must be true (as it is when speed is greater than safety) for the :explain :types and :calls declaration to have effect.

9.5 Explain boxing

A number is boxed when it is converted from its machine representation to the Lisp representation. For floats, the machine representation is one (for singles) or two (for doubles) words. Lisp adds an extra word, which contain a pointer and a type code. For fixnums, boxing simply involves a left shift of two bits. For bignums which are in the range of machine integers, boxing again adds an additional word.

Boxing obviously involves a computational overhead, but more important it involves a space overhead. If a calculation involves the calculation of thousands of floats, for example, thousands of bytes of space will be used. Often that space need not be used. Let us consider a simple example. darray is a vector of double floats. The function foo-insert takes a vector, an index, and a double float as arguments, does something to the double float (adds 2.0d0 but it could be anything) and stores the result into the vector at the indicated index. Suppose things were defined as follows:

(setq darray (make-array 10 :element-type 'double-float
                         :initial-element 0.0d0))
 
(defun foo-insert (arr index value)
   (declare (type (simple-array double-float 1) arr)
            (fixnum index)
            (double-float value)
            (optimize (speed 3))
            (:explain :boxing))
   (setf (aref arr index)(+ value 2.0d0)))

When we compile foo-insert, we are warned that one double-float is boxed:

USER(16): (compile 'foo-insert)
;Generating a DOUBLE-FLOAT box
FOO-INSERT
NIL
NIL
USER(17):

Examining the code, we notice that: foo-insert returns a double-float (returned by the setf form). That value must be boxed before it can be returned. This can be fixed by adding nil at the end of the definition of foo-insert, so it returns nil instead:

(defun foo-insert (arr index value)
   (declare (type (simple-array double-float 1) arr)
            (fixnum index)
            (double-float value)
            (optimize (speed 3))
            (:explain :boxing))
    (setf (aref arr index)(+ value 2.0d0))
    nil)

We try compiling again and no boxing is reported:

USER(28): (compile 'foo-insert)
FOO-INSERT
NIL
NIL
USER(29):

9.6 Explain variables

Local variables in functions can be stored, during a function call, in memory or in registers. Storing in registers is faster, but there are only so many registers to go around. The compiler works out the live ranges of locals (the live range goes from when a local is first assigned a value until that value is last accessed) and then schedules register assignments. Unassigned locals are assigned to a memory location, so an access requires a fetch from memory and a store to memory.

To illustrate explaining variables, consider the following function, which has many arguments (which are treated as locals) and locals. The call to bar between the let bindings and the call to list ensures that the live range of all the variables lasts until the call to list.

USER(1): (defun foo (a b c d e f g)
           (declare (:explain :variables))
           (let ((h (+ a b))
                 (i (- a c))
                 (j (* c e))
                 (k (- d c))
                 (l f)
                 (m e)
                 (n g))
             (bar)
             (list a b c d e f g h i j k l m n)))

FOO
USER(2): (compile *)
Variables stored in permanent general registers:
(A D F G H B C E I)
Variables stored in memory:
(J K)
FOO
nil
nil
USER(3):

Of the 11 arguments and locals, 7 are stored in registers and 4 in memory. Note the platform affects the assignment because different processors have different number of registers (this example comes from compilation on an IBM RS/6000). Note that it is not recommended that you proclaim explanation of variables since reports will then be made on things compiled automatically (such as methods and foreign functions), causing unexpected messages to be printed to the terminal from time to time.

10.0 Other declarations and optimizations

There are other declarations which affect (or in some cases do not affect) the operation of the compiler, as we describe under the next several headings.

10.1 Inline declarations ignored

The inline declaration is ignored by the compiler. At appropriate settings of speed and safety, the compiler will inline whatever it can. Only predefined system functions can be inlined. User defined functions are never compiled inline. (The compiler will observe the notinline declaration, however, so you can suppress inlining of specific functions if you want.)

We have been asked why the inline declaration is ignored. It is not that we believe that inlining user functions does not provide any advantages, it is that we believe that there are other improvements that will provide more advantages. We had been anticipating that the ANSI CL spec would provide more guidance on how to implement inlining (by certifying the augment-environment function and some other stuff specified in CLtL-2, but ANSI chose in the end not to include that in the final spec), so now we intend, in some later release, to provide some user-function inlining capability.

Note that inlining is not a unmixed blessing. It increases the amount of space used by a function (since both the function definition and the block to stick in when the function is inlined have to be created and stored) and it makes debugging harder (by making the compiled code diverge from the source code). It is also prone to subtle errors (on the programmers side) and bugs (on our side).

10.2 Defstruct accessors

Defstruct accessors will be compiled inline under two conditions:

  1. The structure is not of type list. (Accessors for structures of type list are never inlined).
  2. The values of speed and safety are such that the compiler switch comp:verify-non-generic-switch returns nil. Using the default value, that switch will return nil when speed is 3 and safety is 0 or 1.

10.3 Stack consing,  avoiding consing using apply, and stack allocation

Stack consing

On some machines, stack consing of &rest arguments, closures, and some lists and vectors is supported if declared as dynamic-extent. Here is an example of such a declaration for an &rest argument. The function foo checks whether the first argument is identical to any later argument.

(defun foo (a &rest b)
   (declare (dynamic-extent b))
   (dolist (val b)
     (if* (eq a val) then (return t))))

Please note that care should exercised when using stack consing and multiprocessing, since a switch between one thread and another causes (part of) the stack to be saved, so the stack-consed objects and values are not available to the thread being switched into. For a binding or object that has dynamic extent, that extent is only valid when Lisp is executing on the thread that creates the binding or stack-conses the object. When (on Windows) a thread switch is executed, the extents of all dynamic-extent data and bindings are temporarily exited, to be re-established when another thread switch returns to the original thread. Since on Windows separate Lisp lightweight processes run on separate threads, it is important that dynamic-extent data (which may be stack-consed) not be referenced while executing on a different thread. On Unix, stack-groups were used rather than threads, and so a process-wait wait-function argument should never be declared dynamic-extent, since it was funcall'ed from other stack-groups. However, on Windows, wait functions are run only in their own threads, so stack consing in wait functions should work.

Dynamic-extent declarations are only observed at values of safety, space, speed, and debug for which comp:trust-dynamic-extent-declarations returns true.

Avoiding consing with apply using a &rest

In certain cases apply is now compiled more efficiently, to remove consing. Consider the following code:

(defun foo (x &rest y)
   (apply some-fun 1 x 2 y))

The &rest argument is used for nothing more than to supply a variable length argument list to apply.  This case is now compiled in such a way that

  1. The last argument to apply is not actually used, but an index to the n'th argument to foo is compiled instead.
  2. The &rest argument is considered dead and as if declared ignored.
  3. All aspects of the function are preserved (e.g. possible argument checking for a minimum but not a maximum number of arguments, etc.)

In this optimized case, the code works exactly as it did when the &rest argument was consed, but without the consing. Circumstances that will cause this optimization to not be used are if the &rest argument:

Optimization hint: If you have a function like

(defun wrap-it (flag &rest x &key &allow-other-keys)
   (when flag
     (setq x (list* :new-key 10 x)))
   (apply 'wrapped flag x))

then the optimization will not take effect.  If non-consed operation is desired, then the following modification will allow the optimization:

(defun wrap-it (flag &rest x &key &allow-other-keys)
   (if flag
       (apply 'wrapped flag :new-key 10 x)
     (apply 'wrapped flag x)))

Stack allocation

The following types of vectors can now be stack-allocated:

element type initializable? (see below)
t yes
(unsigned-byte 32) yes
(signed-byte 32) yes
(unsigned-byte 16) no
(signed-byte 16) no
(unsigned-byte 8) no
(signed-byte 8) no
character no
(unsigned-byte 4) no
bit no
excl::foreign yes
single-float no

To get a stack-allocated vector, the following coding practice should be used:

(declare (optimize <values that make comp:trust-declarations-switch true>)) 
  ;; with initial variable value, (speed 3) (safety 1) will work
... 
(let ((x (make-array n <options>))) 
  (declare (dynamic-extent x)) ...) 

where n is a constant integer, and options are limited to :element-type (and :initial-element, if the array is initializable according to the table above). All other forms might cause heap allocation of the array.

10.4 Adding typep-transformers

A typep-transformer allows the compiler to transform a form like

(typep x 'foo)

into a form like

(funcall some-function x)

For example, the Allegro CL compiler will already transform typep forms where the type is defined as:

(deftype foo () `(satisfies foop))

into

(funcall 'foop x)

The ability to add typep-transformers described here allows types defined with a more complicated syntax than (satisfies 'some-function) to be similarly transformed. The user must supply the appropriate predicate function and call the following function to associate the type with the predicate.

The function excl:add-typep-transformer takes a type and a predicate which will allow the compiler to transform forms like

(typep x 'type)

into the form:

(funcall <predicate> x)

The function excl:remove-typep-transformer removes the association between the type and the predicate.

For example, suppose we have defined a type called angle, which is a normalized angular measurement:

(deftype angle () '(real #.(* -2 pi) #.(* 2 pi))

Suppose further that we have a time critical function that takes two arguments, checks to be sure they are angles, adds them together, checks to make sure the result is an angle, and returns it:

(defun add-two-angles (a b)
   (declare (optimize (speed 3)))
   (unless (typep a 'angle)
           (error "~s is not an angle" a))
   (unless (typep b 'angle)
           (error "~s is not an angle" b))
   (let ((sum (+ (the angle a) (the angle b))))
        (unless (typep sum 'angle)
                (error "Sum (~s) of angles is not an angle" sum))
     sum))

As is, 10,000 calls of this function (given legal arguments) takes about 1.5 CPU seconds (on my test machine). Suppose we're unhappy with that and want to speed it up by adding a typep-transformer, without having to change the coding of add-two-angles. Further suppose that we're usually dealing with single precision floating point numbers. Here's a way we can do it. We first define our predicate function. Note that we put the most likely case (single-float) as the first choice in the typecase form.

(defun anglep (x)
   (declare (optimize (speed 3) (safety 0)))
   (typecase x
      (single-float (and (>= (the single-float x)
                          #.(float (* -2 pi) 0.0s0))
                         (<= (the single-float x)
                          #.(float (* 2 pi) 0.0s0))))
      (fixnum (and (>= (the fixnum x) #.(truncate (* -2 pi)))
                   (<= (the fixnum x) #.(truncate (* 2 pi)))))
      (double-float (and (>= (the double-float x)
                          #.(float (* -2 pi) 0.0d0))
                         (<= (the double-float x)
                          #.(float (* 2 pi) 0.0d0))))))

We then call excl:add-typep-transformer to make the compiler aware of our predicate:

(excl:add-typep-transformer 'angle 'anglep)

Now if we recompile add-two-angles and call it another 10,000 times with the same arguments, it only takes about .25 CPU seconds, a 6 fold improvement (you may see a different speedup ratio).

10.5 Compiler handling of top-level forms

See the discussion in implementation.htm and the description of excl:*cltl1-compile-file-toplevel-compatibility-p* for information on special handling of certain top-level forms in a file. The issue is whether the forms are treated (during the compile) as if wrapped in an

(eval-when (compile) ...)

A top-level form in a file is one which is not a subform of anything except perhaps progn. In CLtL-1 CL, top-level forms involving calls to the following functions were treated as if warpped in such an eval-when when compiled. In ANSI CL, they are not. You can arrange to have the CLtL-1 behavior as described in implementation.htm. The affected functions are:

make-package
shadow
shadowing-import
export
unexport
require
use-package
unuse-package
import

Copyright (C) 1998-1999, Franz Inc., Berkeley, CA. All Rights Reserved.