Pathnames

$Revision: 5.0.2.6 $

The document introduction.htm provides an overview of the Allegro CL documentation with links to all major documents. The document index.htm is an index with pointers to every documented object (operators, variables, etc.). The revision number of this document is below the title. These documents may be revised from time to time between releases.

1.0 Unix symbolic links and truenames
2.0 Windows devices
3.0 Parsing Unix pathnames

3.1 Preprocessing
3.2 Determining the directory component
3.3 Determining the name component
3.4 Determining the type component
3.5 Anomalies
3.6 Table of examples

4.0 The directory component of merged pathnames
5.0 Parsing Windows pathnames
6.0 Logical pathnames

6.1 Logical pathnames: introduction
6.2 Logical pathnames: general implementation details
6.3 Logical pathnames: some points to note
6.4 Details of cl:load-logical-pathname-translations

Common Lisp pathnames do not always map easily into operating-system filenames. In this document we describe the mapping chosen for Allegro CL on the Unix and Windows operating systems and discuss the implementation of logical pathnames.

1.0 Unix symbolic links and truenames

Symbolic links are a feature of Unix filesystems. A symbolic link is a Unix file that is interpreted as a filename in a different location. Since a symbolic link can contain arbitrary filenames, a symbolic link can traverse an arbitrary number of hierarchical levels, can create back reference loops, and can make directories on other filesystems appear local. As implemented in Allegro CL, the Common Lisp function cl:truename does not resolve symbolic links (as it does in some implementations). Instead, the function excl:pathname-resolve-symbolic-links can be used for that purpose. It takes a pathname argument and returns a pathname with all symbolic links resolved. Note that it does not handle circular links. The function excl:symbolic-link-p also returns a pathname with all symbolic links resolved but, in contrast to excl:pathname-resolve-symbolic-links, returns nil if its argument is not a symbolic link.

2.0 Windows devices

The drive (usually a letter followed by a colon e.g. c:) in a Windows pathname is the device component of the pathname. It is not part of the directory component. Therefore, this will fail:

(defvar *root* #+mswindows "d:/foo/bar/"
               #-mswindows "/usr/foo/")
(make-pathname :directory *root* ...)

You can either specify "d:" as the value of the :device argument to cl:make-pathname or use the cl:pathname function to convert the string and add additional components using cl:merge-pathnames:

(defvar *root* (pathname #+mswindows "d:/foo/bar/"
                         #-mswindows "/usr/foo/"))
(merge-pathnames ... *root*)
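The second approach can be sketched in full as follows. This is only an illustration: the name "data" and type "txt" are hypothetical values chosen for the example, and the drive and directories are the ones used above.

```lisp
;; Build the root with PATHNAME so that, on Windows, the drive ends up
;; in the device component rather than in the directory component.
(defvar *root* (pathname #+mswindows "d:/foo/bar/"
                         #-mswindows "/usr/foo/"))

;; "data" and "txt" are hypothetical; the directory (and, on Windows,
;; the device) is filled in from *root* by the merge.
(defvar *data-file*
  (merge-pathnames (make-pathname :name "data" :type "txt") *root*))
```

After this, (pathname-directory *data-file*) is the same list as (pathname-directory *root*).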

3.0 Parsing Unix pathnames

Common Lisp pathnames have six components:

  1. :host
  2. :device
  3. :version
  4. :directory
  5. :name
  6. :type

On Unix systems, the :host, :device, and :version components are ignored. Only the other three have meaning. In this section, we will describe how to transform a Unix pathname into an Allegro CL pathname object. There are four steps:

3.1 Preprocessing
3.2 Determining the directory component
3.3 Determining the name component
3.4 Determining the type component

We then have several paragraphs describing unusual cases and labeled:

3.5 Anomalies

Finally, there are some examples, labeled

3.6 Table of examples

3.1 Preprocessing

The tilde (~) character is used to denote the user's home directory. If Allegro CL encounters a tilde as the first character of a pathname string, Allegro CL converts it to the absolute pathname of the home directory of the user whose name follows, or, if a slash (`/') follows, the home directory of the user running Lisp. Further, double slashes (`//') are converted to single slashes (`/') and `/./' is also converted to a single slash (`/'). At this point, the pathname string will have the form of the following schematic:

[/][<dir1>/]...[<dirn>/][<name>][.<type>]
(1)( ..... 2 ..........)(. 3 ..)(.. 4 ..)

The brackets (`[ ]') indicate that the elements may or may not appear. The contents of the angle brackets (`< >') describe what type of object goes in a particular location. The suspension points (`...') indicate that any number of objects of the specified type may appear.
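The slash conversions described above can be sketched in portable Common Lisp. This is an illustration of the text rather than the code Allegro CL uses; the function names replace-all and sketch-preprocess are hypothetical, and tilde expansion is left out because it depends on the operating system's user database.

```lisp
(defun replace-all (string old new)
  "Return STRING with every occurrence of OLD replaced by NEW,
repeating until no occurrence of OLD remains."
  (let ((pos (search old string)))
    (if pos
        (replace-all (concatenate 'string
                                  (subseq string 0 pos)
                                  new
                                  (subseq string (+ pos (length old))))
                     old new)
        string)))

(defun sketch-preprocess (namestring)
  "Convert `/./' and `//' to a single slash. Tilde expansion is omitted."
  (replace-all (replace-all namestring "/./" "/") "//" "/"))
```

For instance, (sketch-preprocess "/usr//local/./bin") yields "/usr/local/bin", which has the form of the schematic.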

3.2 Determining the :directory component

The directory component (a list in Allegro CL) is determined by parts (1), (2), and (3) of the schematic. Here are the rules:

  1. If (1) is present, the first element of the list that is the :directory component is :absolute. If (1) is not present and (2) is not empty, the first element is :relative. If (1) and (2) are both empty, the :directory component is nil unless (3) is `..' and (4) is empty. In the latter case (where the whole pathname is `..'), the directory component is :up. We now have the first element of the list that is the :directory component.
  2. If (1) is present and (2) is empty, the entire list is

(:absolute :root)

  3. If (2) is not empty, each `<diri>' is made into a string and added to the list unless it is two dots (`..'), in which case the keyword :up is added to the list. We have now resolved both (1) and (2).
  4. (3) affects the :directory component only if it is `..' and the type, (4), is empty. In that case, the keyword :up is added to the end of the list. If (3) is anything other than `..' or if (4) is not empty, its value does not affect the :directory component.
  5. Finally, if :up appears anywhere in the list following a string, the :up and the string are removed. For example

(:absolute "foo" :up "bar")

is resolved to

(:absolute "bar")

We have now determined the directory component. See below for examples.

3.3 Determining the :name component

The :name component is determined from (3) in the schematic. Whatever appears as `<name>' is converted to a string to become the name component unless the type, (4), is empty and `<name>' is all dots (`.', `..', `...', etc.). In that case, a single dot (`.') means the :name component will be nil. Two dots (`..'), as mentioned above, cause the keyword :up to be added to the list which is the value of the :directory component. Three or more dots are put in a string which becomes the value of the :name component.

3.4 Determining the :type component

The type, (4), must start with a dot, cannot contain another dot, and must contain at least one character other than the dot. In that case, everything after the dot (but not the dot itself) is made into a string and it becomes the value of the :type component.

3.5 Anomalies

There are anomalies because dots play so many roles in Unix pathnames. We have already discussed most of these. The remaining two are illustrated by the following cases:

.bar
bar.

The .bar looks like a type, as if the file has no name and type bar. Instead, .bar is taken to be the filename (including the dot) since the use of the dot in this case is to hide the file from the standard Unix ls command listing, not to specify a type.

In the case of bar., the :name is "bar" and the :type is the empty string (not nil).
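The rules of sections 3.2 through 3.5 can be sketched as a small string parser in portable Common Lisp. This is an illustration only, not Allegro CL's parser: the function names are hypothetical, the input is assumed to be already preprocessed as in section 3.1, and two details (a lone `.' giving (:relative), and a directory list that collapses to just :absolute becoming (:absolute :root)) are taken from the rows of Table 1 rather than from the rule text.

```lisp
(defun split-on-slash (string)
  "Return the substrings of STRING separated by slash characters."
  (loop with start = 0
        for pos = (position #\/ string :start start)
        collect (subseq string start pos)
        while pos
        do (setf start (1+ pos))))

(defun all-dots-p (string)
  "True if STRING is non-empty and consists only of dots."
  (and (plusp (length string))
       (every (lambda (c) (char= c #\.)) string)))

(defun collapse-ups (elements)
  "Directory rule 5: remove each string directly followed by :up, and the :up."
  (let ((stack '()))
    (dolist (e elements (nreverse stack))
      (if (and (eq e :up) (stringp (first stack)))
          (pop stack)
          (push e stack)))))

(defun sketch-parse (namestring)
  "Return the directory, name and type of NAMESTRING as three values."
  (let* ((parts (split-on-slash namestring))
         (absolute (and (rest parts) (string= (first parts) "")))
         (dirs (remove "" (butlast parts) :test #'string=))
         (last-part (first (last parts)))
         (dot-or-dotdot (member last-part '("." "..") :test #'string=))
         (dot (position #\. last-part :from-end t))
         (dir-list
           (collapse-ups
            (append (cond (absolute (list :absolute))            ; rule 1
                          ((or dirs dot-or-dotdot) (list :relative)))
                    (mapcar (lambda (d) (if (string= d "..") :up d))
                            dirs)                                 ; rule 3
                    (when (string= last-part "..") (list :up)))))) ; rule 4
    ;; Rule 2 (and, by assumption from Table 1, a list collapsed by rule 5).
    (when (equal dir-list '(:absolute))
      (setf dir-list (list :absolute :root)))
    (cond ((or (string= last-part "") dot-or-dotdot)
           (values dir-list nil nil))
          ;; Three or more dots, no dot at all, or a leading dot only:
          ;; the whole string is the name and there is no type.
          ((or (all-dots-p last-part) (null dot) (zerop dot))
           (values dir-list last-part nil))
          ;; Otherwise split at the last dot; "bar." gives the type "".
          (t (values dir-list
                     (subseq last-part 0 dot)
                     (subseq last-part (1+ dot)))))))
```

Calling (sketch-parse "foo/bar/..") returns (:relative "foo"), nil and nil, in agreement with the corresponding row of Table 1.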

3.6 Table of examples

That completes the rules for converting pathnames in Allegro CL. Table 1 just below has many examples of pathnames including ones with dots in all locations. Following each example, we indicate which rules were used in producing the result. There are 5 Directory (D) rules. Rules for name (N) and type (T) are not subdivided. Anomalies are shown as `A'.

Table 1: Examples of converting Namestrings to Pathnames

Namestring        Directory                  Name        Type      Rules

"/"               (:absolute :root)          nil         nil       D 1,2
"/foo"            (:absolute :root)          "foo"       nil       D 1, N
"/foo."           (:absolute :root)          "foo"       ""        D 1, N, T
"/foo.b"          (:absolute :root)          "foo"       "b"       D 1, N, T
"/foo.bar."       (:absolute :root)          "foo.bar"   ""        D 1, N, T
"/foo.bar.baz"    (:absolute :root)          "foo.bar"   "baz"     D 1, N, T
"/foo/bar"        (:absolute "foo")          "bar"       nil       D 1,3, N
"/foo..bar"       (:absolute :root)          "foo."      "bar"     D 1, N, T
"foo.bar"         nil                        "foo"       "bar"     D 1, N, T
"foo/"            (:relative "foo")          nil         nil       D 1,3
"foo/bar"         (:relative "foo")          "bar"       nil       D 1,3, N
"foo/bar/baz"     (:relative "foo" "bar")    "baz"       nil       D 1,3, N
"foo/bar/"        (:relative "foo" "bar")    nil         nil       D 1,3
"foo/bar/.."      (:relative "foo")          nil         nil       D 1,3,4,5
"/foo/../"        (:absolute :root)          nil         nil       D 1,3
".lisprc"         nil                        ".lisprc"   nil       N, A
"x.lisprc"        nil                        "x"         "lisprc"  N, T
"."               (:relative)                nil         nil       N
".."              (:relative :up)            nil         nil       N
"..."             nil                        "..."       nil       N

4.0 The directory component of merged pathnames

Merging of pathnames is handled by Allegro CL to take advantage of directory hierarchies. Allegro CL follows the Common Lisp standard for merging pathnames. This section provides examples showing the directory component of the resulting pathname.

Given two pathnames a and b, merging them to produce the result c may involve merging their directory components.

(setf c (merge-pathnames a b))

This merging follows these rules:

  1. If pathname a does not have a directory component, then the directory component of pathname b becomes the directory component of the result c.
  2. If pathname a's directory component is absolute (i.e. it begins with :absolute) then pathname c will have pathname a's directory component.
  3. If pathname a has a directory component that is relative (that is, it begins with :relative), then the directory component of pathname c depends on the directory component of pathname b. If pathname b has a relative directory component, then c's directory component will be the same as a's. If b's directory component is absolute, then the directory component list of a with the element :relative removed is appended to the directory component list of b. Then, the combined list is compressed by the removal of :up and :root entries if possible. For example, if pathname b's directory component is

(:absolute "foo")

and pathname a's directory component list is

(:relative "bar")

then pathname c's directory component list is

(:absolute "foo" "bar")

  4. Finally, if pathname b does not have a directory component, the directory component of pathname a becomes c's directory component.
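The rules above can be sketched as a function on directory lists. This is an illustration in portable Common Lisp, not the code behind cl:merge-pathnames; the function names are hypothetical, and the exact way :root is dropped during compression is an assumption, since the text says only that :up and :root entries are removed "if possible".

```lisp
(defun compress-directory (list)
  "Remove each string directly followed by :up (with the :up), then drop
:root when other elements remain. The :root handling is an assumption."
  (let ((stack '()))
    (dolist (e list)
      (if (and (eq e :up) (stringp (first stack)))
          (pop stack)
          (push e stack)))
    (let ((result (nreverse stack)))
      (if (cddr result)
          (remove :root result)
          result))))

(defun sketch-merge-directories (a b)
  "A and B are directory lists or nil. Return the directory list of the
result of merging a pathname with directory A into defaults with directory B."
  (cond ((null a) b)                        ; a has no directory component
        ((null b) a)                        ; b has no directory component
        ((eq (first a) :absolute) a)        ; a is absolute
        ((eq (first b) :relative) a)        ; a relative, b relative
        (t (compress-directory              ; a relative, b absolute
            (append b (rest a))))))
```

With a as (:relative "bar") and b as (:absolute "foo"), the result is (:absolute "foo" "bar"), as in the example given with rule 3.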

5.0 Parsing Windows pathnames

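We have tried to make the handling of Windows pathnames (really DOS pathnames) as consistent as possible with the handling of UNIX pathnames. Note the following differences:

  1. Either a backslash or a forward slash can separate directories (a backslash must be doubled inside a Lisp string), so both of these namestrings are accepted:

"foo\\bar.cl"
"foo/bar.cl"

  2. The drive is the device component rather than part of the directory component (see section 2.0 above), so

(make-pathname :directory "c:\\foo\\" ...)

will fail. This will work:

(make-pathname :device "c:" :directory "\\foo\\" ...)

  3. A network name that begins with a double backslash can also be converted to a pathname:

(pathname "\\\\hobart\\cl\\src\\acl.mak")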

6.0 Logical pathnames

The logical pathname facility was added to the Common Lisp standard by the X3J13 committee. The logical pathname specification leaves various details of behavior up to the implementation. Here we describe these details of the Allegro CL implementation. We do not describe the whole facility.

6.1 Logical pathnames: introduction

Logical pathnames were added to the Common Lisp language by X3J13 as a facility for the portable specification of file names comprising an application. This section describes various implementation specifics about logical pathnames on Allegro CL, and then gives suggestions how logical pathnames can be used effectively in both developing and delivering a complex lisp application with many files. A final section discusses issues raised by Unix symbolic links, first in general, and then with regard to logical pathnames.

6.2 Logical pathnames: general implementation details

A pathname or logical pathname is a Lisp object, but the language defines conversions of a pathname to and from a character string representation, called a namestring or logical-pathname namestring. The mapping between physical pathnames and namestrings is implementation dependent, but (with certain exceptions discussed in table 1 above) a physical pathname representing a file in a Unix filesystem will have a namestring representation equivalent to the filename used by Unix utilities to name that file. Since Unix is fairly permissive about which characters may be used as pathname components, almost any character other than / (slash) is allowed almost anywhere in a non-logical-pathname namestring. In Allegro, physical pathnames and physical pathname namestrings (again with certain minor exceptions) can represent all possible Unix filenames.

There is also a mapping between logical pathnames and logical-pathname namestrings. Logical pathnames exist so a portable application written in portable code can reference the files it needs to operate, so this mapping is not implementation dependent. It is not the purpose of logical-pathname namestrings to represent all file names possible on an arbitrary host filesystem, e.g. Unix. Rather, logical-pathname namestrings are limited to a reasonable subset of possible filename syntax that can be accommodated by all plausible filesystems. For this reason, characters other than alphabetics, decimal digits, and the minus sign are not supported in "words" of a logical-pathname namestring. The intention is that a large multi-file system should limit its filenames in this way, and then the logical pathname mechanism will guarantee that the software can be ported to other platforms with minimal difficulty.

To encourage portability the Allegro CL implementation will not convert any namestring containing incorrect logical-pathname syntax into a logical pathname. Thus, assuming that expert has been defined as a logical host, this call to pathname

(pathname "expert:;engine;steam-power.lisp")

will return a logical-pathname object with name component steam-power, type lisp, and directory (:relative "engine"). However, this call

(pathname "expert:;engine;steam_power.lisp")

cannot parse the string as a valid logical-pathname namestring and will instead return a physical pathname with name component expert:;engine;steam_power. This is obviously not what was intended; nonetheless, the implementation is not justified in signaling an error because this physical pathname is a perfectly legal Unix filename, however unlikely.
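The character restriction behind these two results can be checked ahead of time with a small predicate. This is a sketch in portable Common Lisp; the function name logical-word-p is hypothetical and is not part of Allegro CL or of the standard.

```lisp
(defun logical-word-p (string)
  "True if STRING is non-empty and contains only letters, decimal digits
and the minus sign, the characters allowed in a logical-pathname word."
  (and (plusp (length string))
       (every (lambda (c)
                (find (char-upcase c)
                      "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-"))
              string)))
```

Here (logical-word-p "steam-power") is true, while (logical-word-p "steam_power") is false because of the underscore.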

Nonetheless it is sometimes necessary for a system to refer to platform-dependent files (perhaps preexisting library files) with non-conforming names. There is no reason not to use the translation services provided by logical pathnames for such files as well, given that such files are platform dependent in the first place. While the pathname implementation will not parse such a namestring as a logical pathname, it is nonetheless possible to construct a logical pathname with arbitrary strings as words, and portably so, using make-pathname. For example, the expert:;engine;steam_power.lisp pathname above could be constructed with a form such as this, assuming logical host expert has been defined:

(make-pathname :host "expert"
   :directory '(:relative "engine")
   :name "steam_power"
   :type "lisp")
[returns] #p"expert:;engine;steam_power.lisp" ; a logical-pathname

This logical-pathname will violate print/read consistency if and when the printed representation is re-read because a physical pathname will be created. Otherwise, it will work just like any other logical pathname. One obvious place where this technique could be useful is in naming pre-existing foreign code system object files and libraries, perhaps in conjunction with defsystem.

6.3 Logical pathnames: some points to note

  1. :device and :version default to :unspecific. (The standard requires :device always to be :unspecific since devices are not supported in logical pathnames.)
  2. In cl:make-pathname, device and version default to :unspecific. An error is signaled if these arguments are not nil or :unspecific (or :newest for version). Hosts are represented as strings in Allegro CL. They are always compared case-insensitively.
  3. cl:make-pathname: returns a logical pathname if host is given and it is `logical', that is if logical pathname translations have been defined for it.
  4. The specification says of cl:parse-namestring:

thing is recognized as a logical pathname namestring when host is logical or defaults is a logical pathname. In the latter case the host portion of the logical pathname namestring and its following colon are optional. If the host portion of the namestring and host are both present and do not match, an error is signaled.

Allegro CL implements the following extension:

Allegro CL recognizes a namestring as a logical pathname in one additional circumstance: the namestring has logical namestring syntax and host is given. In other words, the host need not already have translations defined for it.

  5. cl:merge-pathnames: when pathname is a logical pathname and defaults is not, then Allegro CL does not translate pathname before the merge, so the result can be logical. cl:merge-pathnames returns a logical pathname when: (1) its pathname argument is logical; or (2) defaults is a logical pathname and pathname has no host component.
  6. cl:parse-namestring: a directory sub-component of "**" is parsed as :wild-inferiors and "*" as :wild.
  7. There is a translation for the logical host "sys" which is defined as follows:

(setf (logical-pathname-translations "sys")
      (list
        (list "**;*.*" (namestring <location of exe file>))))

The <location of exe file> is the location of the executable (typically lisp.exe on Windows and lisp on Unix) called to run Lisp. If the -H command-line argument (see startup.htm) is specified, sys: translates to that location.

  8. Implementation notes for cl:load-logical-pathname-translations: see the description of this function below.
  9. Implementation notes for cl:translate-logical-pathname: if this function is called on a logical pathname for which no translation exists, it will in Allegro CL try calling cl:load-logical-pathname-translations for the host rather than signaling an error immediately.
  10. Implementation notes for cl:translate-pathname and cl:pathname-match-p: In addition to the meanings of * and ** as entire pathname words, standing for :wild and :wild-inferiors, Allegro CL also allows * within a single pathname word, with the normal UNIX globbing interpretation. These capabilities also apply to cl:translate-logical-pathname, which calls these functions. On the right hand side of a translation rule, a * indicates where to substitute the wildcard-matched text. Only one * is allowed within each word on the right hand side of a rule.
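The whole-word wildcards mentioned in the note on cl:translate-pathname above can be seen in a minimal translation rule. This sketch uses only standard Common Lisp; the logical host "demo" and the target directory /tmp/demo/ are hypothetical, and the globbing use of * inside a word is not shown because it is an Allegro CL extension.

```lisp
;; Define one rule for the hypothetical logical host "demo": ** matches
;; any number of directory levels and each * matches a whole name or type.
(setf (logical-pathname-translations "demo")
      '(("**;*.*" "/tmp/demo/**/*.*")))

;; The text matched by ** and by each * on the left is substituted at the
;; corresponding place on the right.
(defparameter *physical*
  (translate-logical-pathname "demo:src;main.lisp"))
```

The value of *physical* is a physical pathname whose directory ends in the matched "src" level and whose name and type come from main.lisp.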

6.4 Details of cl:load-logical-pathname-translations

cl:load-logical-pathname-translations

Arguments: host

Certain details of this standard Common Lisp function are explicitly implementation dependent. In all implementations, host must be a string naming a logical-pathname host and this function returns nil if the logical-pathname host named host is already defined. Otherwise, it searches for a logical pathname host definition as follows (this is the implementation-dependent part):

  1. excl:logical-pathname-translations-database-pathnames is called.
  2. It returns a list of strings or pathnames which can be coerced to pathnames naming files.
  3. These files are examined in order (if a file does not exist, it is skipped) until a translation is found. The format of these files is:

host {translation}+

where host is a string naming a logical host (e.g., "src") and translation is a list which specifies the translation (i.e., a list of the source and target, such as

(list "**;*.*" "sys:**;*.*"))

translation is evaluated.

For example, excl:logical-pathname-translations-database-pathnames initially returns the list ("sys:hosts.cl"), which tells the system to look at the file hosts.cl in the Allegro directory (usually the directory containing the executable).

When the hosts.cl file is examined, the values read from it are cached so that further calls to cl:load-logical-pathname-translations do not cause the file to be read. Modifying sys:hosts.cl causes the cache to be invalidated. In addition, all logical-pathname translations are flushed when an image dumped (by excl:dumplisp) is restarted.

Other files besides hosts.cl can be searched for logical pathname information. The function excl:logical-pathname-translations-database-pathnames returns a list of strings naming files that will be searched for logical-pathname information. It initially returns ("sys:hosts.cl"). Additional strings can be added with pushnew and friends, like this:

(pushnew "~/myhosts" (excl:logical-pathname-translations-database-pathnames))

excl:logical-pathname-translations-database-pathnames will now return

("~/myhosts" "sys:hosts.cl")
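As an illustration of the file format described above (a host string followed by one or more translation forms, each of which is evaluated), the file ~/myhosts might contain something like the following. The host name "mylib" and the directories are hypothetical, and the layout is a reading of the format description rather than a copy of a shipped file.

```lisp
;; Hypothetical contents of ~/myhosts: one host, two translations.
"mylib"
(list "src;**;*.*" "/usr/local/mylib/src/**/*.*")
(list "**;*.*" "/usr/local/mylib/**/*.*")
```

The more specific rule is listed first, since translation rules are tried in order.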

Copyright (C) 1998-1999, Franz Inc., Berkeley, CA. All Rights Reserved.