3.9: Heap placement issues

When Allegro CL starts, it maps the image, data, and shared objects to memory. With larger images and applications, this mapping has become a concern for programmers. There is a new patch called the heap placement patch, which improves control over the heap. In this collection of FAQ entries, we describe that patch and the issues surrounding heap placement.

Q 3.9-1) Sometimes Allegro CL, particularly with large images, runs out of memory or fails totally with a bus error or a segv. Why might this be happening?
Q 3.9-2) How is heap placement determined and what can go wrong?
Q 3.9-3) What do I need to know about heap placement and how do I figure out the sizes and placements of my Lisp and C heaps, especially when I will be shipping my large application to customers who have many different machine configurations?
Q 3.9-4) How does Lisp start up, in terms of shared-library linking and loading?
Q 3.9-5) What is the "Old Sun4" problem?
Q 3.9-6) I downloaded the "heap placement patch" on my Sparc, and now I get a big warning message whenever I start my Lisp. Why is this?
Q 3.9-7) What would happen if I ignore the warning message and do not rebuild the .dxl?
Q 3.9-8) How do I rebuild my images so they contain the heap-placement-patch?


Q 3.9-1) Sometimes Allegro CL, particularly with large images, runs out of memory or fails totally with a bus error or a segv. Why might this be happening?

A 3.9-1) There are problems with the memory mapping and locations of the Lisp and foreign (C) heaps in a running Lisp image. We refer to these problems collectively as the heap placement problem. While these problems are not in fact new, they are only triggered when the Lisp image is large (typically greater than 500 Mbytes). Only recently have such large images become common.


Q 3.9-2) How is heap placement determined and what can go wrong?

A 3.9-2) When Allegro CL starts up, space must be found for the following:

With 32-bit addressing (which Allegro CL uses), there are potentially 4 Gigabytes of address space. However, most operating systems only allow 31-bit addresses, so only 2 Gigabytes are really available. Locations near address 0 (the bottom) are usually used by the OS or kernel. Allegro CL therefore usually tries to start at some OS-dependent location, typically 0x20000000, so there are about 1.6 Gigabytes potentially above that. That is plenty for an application that uses less than, say, 0.5 Gigabytes, but as the application grows to, say, 1.5 Gigabytes, finding enough space becomes problematic.
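The arithmetic behind these figures can be checked with a short sketch. It is illustrative only: the constants mirror the numbers quoted above, the helper name headroom is hypothetical, and sizes are taken as decimal Gigabytes (an assumption, since the text does not say which unit it means).

```python
# Illustrative arithmetic only; constants mirror the figures quoted in the text.
USABLE_ADDRESS_BITS = 31        # most operating systems allow 31-bit addresses
TYPICAL_START = 0x20000000      # typical, OS-dependent starting location

top = 1 << USABLE_ADDRESS_BITS  # 0x80000000, the 2 Gigabyte boundary
available = top - TYPICAL_START # bytes between the start address and the top


def headroom(app_bytes):
    """Bytes left above the start address once the application is placed.

    Hypothetical helper: it ignores shared libraries and fragmentation, so
    the real margin is smaller than the value returned here.
    """
    return available - app_bytes


print(hex(available))            # 0x60000000
print(available / 10**9)         # roughly 1.61 decimal Gigabytes
print(headroom(500_000_000))     # comfortable margin for a 0.5 Gigabyte application
print(headroom(1_500_000_000))   # only about 110 Megabytes left at 1.5 Gigabytes
```

Under these assumptions a 1.5 Gigabyte application leaves roughly 110 Megabytes of slack, before any shared library has been mapped.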

A further complication arises from mapping of shared libraries, including the Allegro CL shared library which has extension .pll. Most shared libraries have a location where they prefer to be mapped. In some cases, problems arise if they cannot be mapped in that location.

Finally, Lisp has an idea (provided by the lisp-heap-size argument to build-lisp-image, see building_images.htm) of how much space it will need. When Lisp starts up, it tries to "get" that amount of space. We put "get" in quotation marks because it does not have a platform-independent meaning. On some platforms (Sparcs, for example) you can reserve space but, like a reserve credit line, it is only taken when actually needed. More precisely, on such machines, Allegro CL reserves that amount of space. The operating system then precludes other programs from using the reserved address space, but does not map the reserved space to actual swap space. When reserved space is mapped to swap space, the space is then committed. This is fine on the machines that support the distinction between reserved and committed space, but some machines (HPs running HP-UX, machines running Linux, and IBM machines running AIX) do not. On these machines the space you ask for is committed as soon as it is requested, so specifying a large lisp-heap-size on those machines can cause excessive swap space usage if the committed space is not actually needed.
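The difference between the two kinds of platform can be expressed as a toy accounting model. This is a sketch under simplifying assumptions, not a description of how any operating system actually accounts for swap; the function name and its parameters are hypothetical.

```python
def swap_charged(lisp_heap_size, bytes_in_use, separates_reserve_and_commit):
    """Swap space charged for the Lisp heap under a simplified model.

    lisp_heap_size: size requested at startup (the lisp-heap-size value).
    bytes_in_use: portion of the heap the running Lisp has actually needed.
    separates_reserve_and_commit: True for platforms that only reserve
        address space up front, False for platforms that commit the whole
        request as soon as it is made.
    """
    if bytes_in_use > lisp_heap_size:
        raise ValueError("usage cannot exceed the requested heap size")
    if separates_reserve_and_commit:
        return bytes_in_use      # swap is taken only as space is needed
    return lisp_heap_size        # the whole request is committed at once


MB = 10**6
# A 1000 Megabyte request of which only 200 Megabytes are ever used:
print(swap_charged(1000 * MB, 200 * MB, True))   # 200000000
print(swap_charged(1000 * MB, 200 * MB, False))  # 1000000000
```

In this model the same oversized request costs five times as much swap on a platform that commits immediately.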

So, what might go wrong when Allegro CL starts up? The following might be problems:

Even though this problem may affect any user of Allegro CL or an Allegro CL application, it is mostly a problem for developers of programs which will be distributed to a user base. Users who have an Allegro CL distribution and use a particular machine can, perhaps with trial and error (and perhaps with assistance from Franz Inc.), figure out how to build an image that will work on that machine. But VARs, say, who are preparing a distribution requiring large images which will be sent to many customers, each of whom may have a different machine configuration and different programs running, may find it difficult to produce a single image suitable for all potential users on a particular platform.

Programmers can affect heap placements using these arguments to build-lisp-image (see building_images.htm):

Improvements in heap location management available with the heap placement patch, described below in Q 3.9-3 and more generally in the other FAQ items in this document, make the successful mapping of large images more likely, and make it easier to provide one image for all users on a particular platform. However, any application that uses most of the available address space is in danger of running into machine constraints which make the application fail.


Q 3.9-3) What do I need to know about heap placement and how do I figure out the sizes and placements of my Lisp and C heaps, especially when I will be shipping my large application to customers who have many different machine configurations?

A 3.9-3) We have developed a patch called the heap placement patch. This patch is incorporated in a new acl503.dll, libacl503.so, or libacl503.sl, depending on platform (see introduction.htm for information on getting patches). This patch should fix several problems in Allegro CL 5.0:

  1. It maps heaps in a more appropriate order to increase the likelihood of successful Lisp startup.
  2. It fixes a bug manifested on Sparc and SGI where running out of swap space caused Lisp failure instead of a break prompt with a storage condition.
  3. It allows an artificially high value for the lisp-heap-size argument to build-lisp-image (see building_images.htm) to be specified, effectively telling Lisp to "take what it can get" each time.
  4. It fixes a bug which allowed specification of Lisp heaps at higher addresses than C heaps, which would cause GC errors if :lispstatic-reclaimable arrays were used (see the description of make-array in implementation.htm).
  5. It breaks out the "Old Sun4" problem, which is an inadequacy that some older Sun4s (not Sparcs) have in mapping memory at high addresses (see Q 3.9-5 below). This allows large heaps on Sparcs with no GC errors, although such configurations cannot run on these older Sun4s.

Q 3.9-4) How does Lisp start up, in terms of shared-library linking and loading?

A 3.9-4) All discussions that follow in this entry assume that you have the heap placement patch (see Q 3.9-3 above), or have Allegro CL version 5.0.1.beta2 (the second beta release, not the first) level or later.

This is a complicated answer. We start with some terminology:

  1. The ACL shared library: This shared-library holds the base ACL system, and is sometimes known by the term "acldll". On Windows it is known as aclxxx.dll, and on UNIX it is called libaclxxx.ext where xxx is a version number and .ext is either .sl or .so.
  2. System libraries: shared-libraries that are pre-linked into either the ACL shared-library or the executable that loads the ACL library, or any shared-library on which a system library is dependent. This is a broad definition of system library, and can include any user-supplied library that has been linked into the executable.
  3. User loaded libraries: These are libraries that are loaded with the Common Lisp LOAD function, but not pre-linked as are system libraries.

The Startup Process:

  1. The operating system starts up the executable. Before the executable is able to start running, all system libraries must be loaded into memory and available. Various operating systems do this in various ways, using implementation dependent algorithms to place the libraries into memory. Also, whether all symbols are bound at the time of the library placement or whether the symbols are bound lazily (i.e., when needed only) is OS dependent; some systems may provide both.
  2. The executable begins to run. It may perform any operations it wants to do, including loading shared-libraries.
  3. The ACL shared-library is loaded and lisp_init is invoked. The operating system ensures that all shared-libraries that were linked into the ACL shared-library are loaded before lisp_init is allowed to start.
  4. lisp_init determines a heap (image) file, with extension typically .dxl. The heap file contains state information about the image, including whether a Pure Lisp Library will be used. The .dxl and .pll are loaded into memory as described in steps 5 through 8.
  5. The C heap is mapped in, and C variables are set. If a .pll file is to be used, this fact becomes known at this time. The C heap cannot be relocated from where it was first built.
  6. The .pll file, if present, is mapped in read-only. If it can't be mapped into the location it had been in a previous incarnation of Lisp, it is moved to another location. (Except when an image is built with build-lisp-image, there is always a previous incarnation of Lisp, perhaps the one that ran when the image was built.)
  7. The Lisp heap is mapped in. The best case is if the heap can be placed at the address and size that were specified by the build parameters. If those can't be satisfied, the current commit level is attempted (at the same address), followed by an operating-system-selected space of the built size, finally followed by the committed area size at an OS-selected location. If the lisp-heap-size happens to be larger than the available swap, then only as much as can be actually allocated is used, as long as it is at least as large as the commit requirement.
  8. Pointers are adjusted if the Lisp heap or the .pll file had to be moved. (The Lisp heap might have to be moved because, for example, the Lisp heap start address is not available. The .pll file might have to be moved if the address used in the previous Lisp invocation is unavailable.) All pointers to the expected locations of the Lisp heap and the .pll file are updated to refer to the new locations. This step is performed all at once, to ensure proper pointer movement.
  9. The Lisp starts. The process by which the Lisp starts is documented by the source file, <allegro directory>/src/aclstart.cl, which is provided in your distribution, and described in startup.htm. As one of the items in aclstart.cl, excl::reload-fix-entry-points (this function is not further documented) is called, which ensures that all system and user libraries are loaded if they are not already. This may involve performing a load on any libraries that were not loaded.
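Steps 7 and 8 of the startup process can be sketched as follows. This is a minimal model, not the actual loader: try_map is a hypothetical callback standing in for the operating system's mapping call, both function names are invented for illustration, and the additional rule about limited swap (using as much as can be allocated, provided it meets the commit requirement) is left out.

```python
def place_lisp_heap(built_addr, built_size, commit_size, try_map):
    """Return (address, size) for the Lisp heap, trying the candidates in
    the order given in step 7.

    try_map(addr, size) is a hypothetical callback. addr=None means "let the
    operating system choose"; it returns the mapped address, or None when
    the mapping fails.
    """
    candidates = [
        (built_addr, built_size),   # built address, built size (best case)
        (built_addr, commit_size),  # built address, current commit level
        (None, built_size),         # OS-selected address, built size
        (None, commit_size),        # OS-selected address, committed size
    ]
    for addr, size in candidates:
        mapped = try_map(addr, size)
        if mapped is not None:
            return mapped, size
    raise MemoryError("no placement found for the Lisp heap")


def relocate(pointers, old_base, new_base, size):
    """Shift every pointer inside [old_base, old_base + size) by the distance
    the region moved, leaving all other pointers untouched (step 8)."""
    delta = new_base - old_base
    return [p + delta if old_base <= p < old_base + size else p
            for p in pointers]


# Example: the built address is unavailable, so the OS picks 0x30000000.
def only_os_selected(addr, size):
    return 0x30000000 if addr is None else None

addr, size = place_lisp_heap(0x20000000, 0x40000000, 0x10000000,
                             only_os_selected)
moved = relocate([0x20000010, 0x70000000], 0x20000000, addr, size)
print(hex(addr), hex(size), [hex(p) for p in moved])
```

Applying the adjustment in a single pass over all pointers, as relocate does, matches the requirement in step 8 that the move be performed all at once.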

Now, there is a potential problem with the last step. If

then it is conceivable that there will be no swap left for the loading of the user libraries.

In this situation, it would be best for the executable to be programmed to pre-load the user library or libraries so that the space is pre-allocated. (You do this by writing a customized main() that loads the needed libraries, see main.htm.) But this presents the possibility that the user library might break up the contiguous address space for the Lisp heap, especially on Windows. The problem is intractable in general, but solvable in individual cases.


Q 3.9-5) What is the "Old Sun4" problem?

A 3.9-5) There are some Sun4 machines which, although they can run late versions of Solaris 2.x, cannot mmap addresses in higher memory than approximately 0x10000000. (Franz Inc. owns such a machine.) It is problematic to build a Lisp image on one of these machines, because the Lisp and C heaps must be relatively small.

You can build a Lisp image that will run on all Sparcs running the appropriate levels of Solaris 2, but which will be limited in address space, if you build your Lisp image (using build-lisp-image, see building_images.htm, or generate-application, see delivery.htm) with the following parameters:

If you want to build your Lisp so that it can grow, the C heap must be higher; with the "heap placement patch" it defaults to #x54000000. But this will also preclude any use of the resulting images on old Sun4s. We have left you with the choice to make, instead of making it for you.
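The trade-off can be made concrete with a small check. It is a sketch only: the limit is the approximate figure quoted above rather than an exact hardware constant, and the function name and the example region sizes are hypothetical.

```python
OLD_SUN4_MMAP_LIMIT = 0x10000000  # approximate limit quoted above (assumption)


def fits_old_sun4(regions, limit=OLD_SUN4_MMAP_LIMIT):
    """True when every (start, size) region ends at or below the limit.

    regions is a list of (start_address, size_in_bytes) pairs, for example
    the planned Lisp heap and C heap.
    """
    return all(start + size <= limit for start, size in regions)


# A C heap at the patched default of #x54000000 rules out old Sun4s.
# The 1 Mbyte size used here is an arbitrary illustrative value.
print(fits_old_sun4([(0x54000000, 0x100000)]))   # False

# Small heaps kept entirely below the limit remain usable on those machines.
# Both the start address and the size are illustrative values.
print(fits_old_sun4([(0x08000000, 0x01000000)])) # True
```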


Q 3.9-6) I downloaded the "heap placement patch" on my Sparc, and now I get a big warning message whenever I start my Lisp. Why is this?

A 3.9-6) For GC reasons, the C heap must always be at a higher address than the Lisp heap. However (mostly due to the "Old Sun4 Problem" described in Q 3.9-5 above), Allegro CL 5.0 was shipped out on the Sparcs with a Lisp heap at a higher address than the C heap. We recently discovered this bug, and have corrected it for Allegro CL 5.0.1. However, all 5.0 .dxl files on Sparcs are likely to have the wrong ordering, and should be corrected.
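The ordering requirement can be written as a one-line check. This is a sketch: the function name is hypothetical, and treating "higher" as "the whole Lisp heap ends at or below the start of the C heap" is an assumption that is slightly stricter than the wording above.

```python
def heap_order_ok(lisp_start, lisp_size, c_start):
    """True when the whole Lisp heap lies below the start of the C heap.

    Assumption: the two heaps must not overlap, so the comparison uses the
    end of the Lisp heap rather than its start.
    """
    return lisp_start + lisp_size <= c_start


# Lisp heap below a C heap at #x54000000: the required ordering.
print(heap_order_ok(0x20000000, 0x30000000, 0x54000000))  # True

# Lisp heap above the C heap, the ordering this answer says shipped in
# 5.0 .dxl files on Sparcs. The addresses are illustrative values only.
print(heap_order_ok(0x20000000, 0x30000000, 0x10000000))  # False
```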


Q 3.9-7) What would happen if I ignore the warning message and do not rebuild the .dxl?

A 3.9-7) If you have absolutely no "static-reclaimables" (made with make-array with :allocation :lispstatic-reclaimable, see implementation.htm, or with allocate-fobject with :foreign-static-gc allocation, or, on Windows, using the old callocate or ccallocate), then no adverse effect will ever be seen. But if you have defined any static-reclaimables, then any Lisp items in the statics will not be properly gc'd, and will turn into bad pointers.


Q 3.9-8) How do I rebuild my images so they contain the heap-placement-patch?

A 3.9-8) Simply use generate-application (see delivery.htm) or build-lisp-image (see building_images.htm) to rebuild a new .dxl file. The .dxl file can be built by executing update.sh (on Unix machines) or update.bat (on Windows).

Please contact Franz Inc. if you have Allegro Composer on a Unix machine, but do not have either Allegro Enterprise or the profiler. (composer.dxl will not rebuild with what you have.)

If you have a .dxl created by a third-party vendor, ask them for a new .dxl or instructions on how to rebuild it.


© Copyright 1998, Franz Inc., Berkeley, CA.  All rights reserved.
$Revision: 1.1.2.2 $