3.1: Questions about the Base Lisp

#Q 3.1-1) I'm left with running Lisp processes after I exit my Emacs/xterm. What do I do to avoid this?
#Q 3.1-2) Why doesn't make-pathname merge the given :directory component with the directory component in :defaults argument?
#Q 3.1-3) I am getting stack overflows and occasional Lisp failure when I sort on large arrays. Why and what can I do?
#Q 3.1-4) I have set the stack cushion to a reasonable value, but the soft stack limit is not being detected, and I get a lisp death instead. Why is that?


Q 3.1-1) I'm left with running Lisp processes after I exit my Emacs/xterm. What do I do to avoid this?

A 3.1-1) The question of whether and how Lisp should terminate when its input/output streams are broken is a complicated one. The current implementation should give the behavior most people want: a Lisp image quietly and immediately ceases execution when its remote initial terminal I/O stream is closed.

If it doesn't, here is some code you can load into an image or otherwise cause to execute (e.g. in ~/.clinit.cl) that may help make Lisp images go away when you want them to.

#-(version>= 4 3)
(progn
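  ;; Define a Lisp interface to the C signal() function if needed.
  (unless (fboundp 'unix-signal)
    (ff:defforeign 'unix-signal :entry-point (ff:convert-to-lang "signal")))
  ;; A handler argument of 0 is SIG_DFL, the default action (terminate).
  (unix-signal 1 0)                              ;SIGHUP
  (unix-signal 15 0)                             ;SIGTERM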
)

Q 3.1-2) Why doesn't make-pathname merge the given :directory component with the directory component in :defaults argument?

A 3.1-2) Section 19.4.4 of the ANSI spec says:

After the components supplied explicitly by host, device, directory, name, type, and version are filled in, the merging rules used by merge-pathnames are used to fill in any unsupplied components from the defaults supplied by defaults.

The crucial word here is "unsupplied". By specifying a :directory argument you have supplied the directory component, so the directory component of the :defaults argument is not used. Even specifying :directory nil explicitly supplies a directory component of nil, and this is treated differently from an unsupplied component.
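
The following sketch illustrates the difference between a supplied and an unsupplied directory. The path components and variable names are hypothetical, chosen only for illustration; the results in the comments follow from the rule quoted above and from the merge-pathnames rule for relative directories.

```lisp
;; All path components and variable names here are hypothetical.
(defparameter *defaults*
  (make-pathname :directory '(:absolute "usr" "local" "src")
                 :name "foo" :type "lisp"))

;; :directory is supplied, so the directory of :defaults is not used.
;; Name and type are unsupplied and are filled in from :defaults.
(defparameter *supplied*
  (make-pathname :directory '(:relative "sub") :defaults *defaults*))

;; :directory nil also counts as supplied.
(defparameter *supplied-nil*
  (make-pathname :directory nil :defaults *defaults*))

;; To combine a relative directory with the directory of the
;; defaults, call merge-pathnames explicitly.
(defparameter *merged*
  (merge-pathnames (make-pathname :directory '(:relative "sub"))
                   *defaults*))

(pathname-directory *supplied*)      ; (:RELATIVE "sub")
(pathname-name *supplied*)           ; "foo"
(pathname-directory *supplied-nil*)  ; NIL
(pathname-directory *merged*)        ; (:ABSOLUTE "usr" "local" "src" "sub")
```

If the intent is to extend the default directory rather than replace it, the explicit merge-pathnames call in the last form is the one to use.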


Q 3.1-3) I am getting stack overflows and occasional Lisp failure when I sort on large arrays. Why and what can I do?

Here is a transcript showing a stack overflow. Note that the array has one million (10^6) elements.

USER(1): (setq pippo (make-array 1000000 :initial-element 0)) 
#(0 0 0 0 0 0 0 0 0 0 ...)
USER(2): (sort pippo #'<)
Error: Stack overflow (signal 1000)
[condition type: SYNCHRONOUS-OPERATING-SYSTEM-SIGNAL]

Restart actions (select using :continue):
 0: continue computation
 1: Return to Top Level (an "abort" restart)
[1c] USER(3): :pop
=================^^^^
USER(4): (sort pippo #'<)
#(0 0 0 0 0 0 0 0 0 0 ...)
USER(5): 

Here the sort is retried at the error prompt, without first clearing the error, and Lisp exits with a segmentation violation:

USER(1): (setq pippo (make-array 1000000 :initial-element 0))
#(0 0 0 0 0 0 0 0 0 0 ...)
USER(2): (sort pippo #'<)
Error: Stack overflow (signal 1000)
 [condition type: SYNCHRONOUS-OPERATING-SYSTEM-SIGNAL]

 Restart actions (select using :continue):
 0: continue computation
 1: Return to Top Level (an "abort" restart)
[1c] USER(3): (sort pippo #'<)
Segmentation fault (core dumped)
%

A 3.1-3) The stack overflow occurs because a large temporary array is being stack-allocated to perform the sort. The maximum size that is stack-allocated is architecture dependent: Windows platforms stack-allocate only arrays of up to 4 Kbytes and normally heap-allocate any larger arrays needed, while Unix platforms attempt to stack-allocate arrays of up to 4 Mbytes. On any architecture, the strategy is programmable, as described below.

When the above error occurs, there are several things that can be done.

  1. Instead of popping out of the break loop as in the example above, just continue. The stack overflow automatically reduces the stack cushion (see documentation for sys:stack-cushion and sys:set-stack-cushion), so continuing should allow further execution.
  2. On Unix platforms only, a csh can be run and the limit command used to set the stack limit to something larger than it currently is. We recommend at least 8192 Kbytes (8 megabytes), but if that is not enough, more can be allocated.
  3. Change the sort strategy (documented below). The Allegro CL sort function tries to allocate a temporary array on the stack if possible, so that it does not need to do so on the heap. If this strategy is not acceptable or convenient, change the strategy to either allocate from the heap or use a pre-existing user-supplied array.

Just continuing usually works. Clearing the stack with a :reset and retrying usually works as well. Note, as the second example above shows, that retrying the sort command at the error prompt (that is, without clearing the error) can result in an abnormal exit from Lisp ("Segmentation fault (core dumped)").

This is an unfortunate hole in our stack-overflow detection strategy. Stack overflow is normally checked for on every function call, and enough "slop" is allowed so that functions that allocate an average amount of stack will not cause a hard stack overflow. But if a function allocates large stack objects (such as large temporary vectors), the jump in stack usage is too large to be caught by either the stack cushion or the hardware overflow detection, and stack-overflow death occurs. We hope to guard against such overflow death in some future version of Allegro CL.

Sort Strategy: 

You can tell the system whether to try to stack-allocate things to be sorted. From the documentation in the source code:

;; excl::*simple-vector-sort-strategy*:
;;
;; The sort strategy can be one of three types:
;;  :stack   - try to allocate stack space for the temp sort; this
;;             works easily for 1k elements (4 kbytes), and (on
;;             Unix platforms only) for up to 1m elements (4 mbytes)
;;             if there is enough stack allocated by the os; more
;;             than 1 m elements cause a new svector to be allocated.
;;  :alloc   - Allocate an svector of size equal to the vector to sort.
;;             a new one is allocated each time.
;;  <vector> - must be a simple-vector of type t of at least as many
;;             elements as are being sorted. During the sort, the global
;;             is reset to :alloc so that sort is re-entrant.

(defvar excl::*simple-vector-sort-strategy* :stack)
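
As a minimal sketch, assuming the variable behaves as the source comments above describe, the strategy could be changed before sorting as follows. The array pippo is the one from the transcripts; the name *sort-scratch* is hypothetical. This is Allegro CL specific, since excl::*simple-vector-sort-strategy* is an internal symbol of the excl package.

```lisp
;; The array from the transcripts above.
(defparameter pippo (make-array 1000000 :initial-element 0))

;; Option 1: heap-allocate the temporary vector on every sort,
;; avoiding the large stack allocation entirely.
(setq excl::*simple-vector-sort-strategy* :alloc)
(sort pippo #'<)

;; Option 2: supply a scratch vector at least as long as the vector
;; being sorted. make-array with no :element-type returns a
;; simple-vector of element type t, as the comments require.
;; *sort-scratch* is a hypothetical name used only for illustration.
(defvar *sort-scratch* (make-array 1000000))
(setq excl::*simple-vector-sort-strategy* *sort-scratch*)
(sort pippo #'<)
```

Option 1 trades some heap allocation on each sort for safety from stack overflow; option 2 avoids the repeated allocation when many vectors of bounded size are sorted.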

Q 3.1-4) I have set the stack cushion (see sys:set-stack-cushion and sys:stack-cushion) to a reasonable value, but the soft stack limit is not being detected, and I get a lisp death instead. Why is that?

A 3.1-4) The stack cushion is checked in the "symbol trampoline", a short piece of code that is used when one Lisp function calls another. It is meant to flag normal situations where the stack is growing too quickly, and to signal a condition before a hard stack-size limit is reached.

There are several possible situations where a stack overflow is not detected by this mechanism, and careful thought must be given as to how to handle each of them:

  1. A lisp function may allocate a very large stack size, due to either a large number of variables or due to large stack-allocated arrays or lists. If the amount that the function allocates is larger than the difference between the hard stack limit and the soft stack limit set up by the stack cushion, then there will be no chance for the Lisp to signal the condition before the hard limit is reached. The only way to work around this problem is to be sure that there is sufficient stack-cushion for the worst-case function to allocate its needed stack.
  2. A Lisp function might call itself recursively, which on some architectures generates a fast call to location 0 of the same function. The fast call causes the symbol trampoline to be bypassed, thus causing the stack overflow detection to also be bypassed. The workaround is to declare the function calling itself as notinline within its own body. This will result in slightly slower code generation, but overflows would then be detected. Example:
(defun call-me ( ... ) 
  (declare (notinline call-me)) 
  ... 
  (call-me ...) ... ) 
  3. A non-Lisp thread may be called, at which time there is no way to limit the stack on some machines. There is no workaround for this problem, other than to reduce one's dependence on non-Lisp code.
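
As a concrete instance of the notinline workaround for self-recursive functions described above, here is a minimal sketch. The function count-down is hypothetical and exists only to show where the declaration goes; the recursion is deliberately not a tail call, so each level consumes stack.

```lisp
;; count-down is a hypothetical function used only for illustration.
(defun count-down (n)
  ;; Declaring the function notinline within its own body asks the
  ;; compiler to make the recursive call through the normal calling
  ;; sequence, so the stack-cushion check is not bypassed.
  (declare (notinline count-down))
  (if (zerop n)
      0
      (+ 1 (count-down (1- n)))))

(count-down 10) ; returns 10
```

With the declaration in place, a call such as (count-down 10000000) would be expected to signal the soft stack-overflow condition rather than run into the hard limit, provided the stack cushion is large enough for a single frame of the function.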

© Copyright 1998, Franz Inc., Berkeley, CA.  All rights reserved.
$Revision: 1.1.2.10 $