$Revision: 5.0.2.5 $
The document introduction.htm provides an overview of the Allegro CL documentation with links to all major documents. The document index.htm is an index with pointers to every documented object (operators, variables, etc.). The revision number of this document is below the title. These documents may be revised from time to time between releases.
1.0 The three profilers: general information
  1.1 The time profiler
  1.2 The space profiler
  1.3 The call-counting profiler
2.0 Invoking the profiler
3.0 Further points, including information on temporary files
4.0 Controlling the operation of the profiler
  4.1 Control of data collection
  4.2 Profiling with multiprocessing enabled
  4.3 Closures in profiler output
5.0 Interpreting the results
  5.1 Call-count results
  5.2 Space and time profiler results
  5.3 Examples
Profiling is used to make Lisp programs more efficient. Profiler information tells you which parts of the program run slowly or use a lot of space (thus increasing garbage collection). Once you know this, you can improve efficiency by optimizing (see the compiler document) or by recoding.
Allegro CL provides three profiling tools for assisting users in determining where time is spent or space is used. Most powerful are the space and time profilers which statistically sample a running program to determine which functions use the most time or which use the most space. The call-counting profiler keeps a count of the number of times a compiled function is actually called while a program is running.
Allegro Composer extends the space and time profilers by providing graphical display of the results as well as the ability to start and stop the profilers graphically. Without Allegro Composer, all the same information is gathered and reported but the information is printed to the terminal.
The Allegro CL Integrated Development Environment in Windows also provides a graphical display of profiler data and various menu commands for starting and stopping profiling. See the description on the Profiler Dialog in the IDE help file.
We describe the three types of profilers below; later sections describe how to use the profilers and how to interpret the results.
The time profiler collects data by interrupting execution of the Lisp periodically and recording the functions on the execution stack. The variable prof:*granularity* controls how often execution is interrupted. Its value is the integer number of machine clock ticks between samples.
A machine clock tick occurs approximately every 0.01 seconds (the exact value is machine dependent). In that time, the machine will typically execute tens of thousands of machine instructions. The statistical nature of the time profiler should therefore be evident. However, under reasonable assumptions concerning lack of interaction between execution of your program and machine clock ticks and with sufficient samples, the information gathered by the sampling process will closely mirror the actual behavior of the code which is executed.
The `reasonable assumptions' are that (1) there is no cycle in the program harmonic with the sampling interval, so that the interrupt times do provide a random sample of program execution, and (2) if your program does several different things, enough time is spent on each to provide statistically significant results. Several thousand samples are statistically sufficient. Of course, functions where your code spends very little time may be missed altogether by the profiler. However, the purpose of the profiler is to assist you in making your code run faster, and functions which use little time are not important in achieving that result.
The space profiler tries to identify the functions which use the most space. Like the time profiler, the space profiler is statistical but the method for sampling is quite different.
The time profiler stops execution regularly and examines the stack. The space profiler stops execution whenever a page of memory is allocated. The statistical nature of the space profiler arises from the fact that the function which calls for a page of memory will not (typically) use it all. After the function which called for the page has completed, there may be sufficient room left on the page for the space requirements of the next function. Therefore, only the first function is charged with needing space even though both functions used the space.
The assumption of the space profiler is that the requirement for new pages of memory occurs randomly and in proportion to the amount of space actually required by a function. So given a reasonably large sample, the functions that use large amounts of space will have proportionally large amounts of space attributed to them by the profiler.
In Allegro CL, memory is organized into three kinds of objects: symbols, conses, and other. Both symbols and conses have a specific size and so a known number can fit on a page. Other Lisp objects require variable amounts of memory (depending on what information they contain). All objects other than symbols and conses are stored on other pages.
The space profiler thus records information whenever one of the functions which allocate a new page of memory is called. The call can come from the internal function _page_from_new (for symbols and conses) or from the function _other_from_new (for other objects). These functions are deep in the system and are not typically user visible. During profiling, hooks associated with those two functions stop execution when they are called and record information about what is on the stack.
Note that the time macro also provides information on space used by the form evaluated. This information is quite accurate (in some cases, the total space used is more accurately reported by the time macro than by the space profiler). The time macro does not, however, provide a breakdown of what within a form caused the space usage; that is what the space profiler is designed to do.
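For instance, here is a minimal sketch; the report produced by time varies by platform and release, but it includes the total space consed for the form as well as the elapsed time:

;; time reports totals for the whole form, with no per-function breakdown:
(time (make-array 100000))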
The call-counting profiler is integrated with the space and time profilers. No special compilation is necessary for calls to a function to be counted (as was necessary in earlier releases). Any function can be call-counted. The count is exact, not statistical. The only `calls' that will be missed are where a function is compiled inline and the inline code is executed, and, on certain architectures, calls triggered by funcall. Note that the overhead for counting calls can be significant, and that overhead is recorded by the time profiler. There is no space overhead, so the space profiler results are unaffected. Therefore, the reported time usage is skewed by the call-counting overhead when a time profile also does call-counting. The time results are still useful, but must be interpreted with care and a time-only profile should not be compared with a time and call-counting profile.
Unless otherwise indicated, the symbols described in this section are in the profiler (nicknamed prof) package.
In order to use the profiler, you need to start it and stop it. You can start it with prof:start-profiler and stop it with prof:stop-profiler or you can use the macro prof:with-profiling to cause the profiler to be run while a specific body of code is executed. The variable prof:*maxsamples* controls the maximum number of samples that will be taken. Profiling will stop when that number of samples has been recorded even if prof:stop-profiler is not called or the code specified to prof:with-profiling has not completed.
Note that it is important to distinguish between starting and stopping the profiler and suspending its operation. prof:start-profiler and prof:stop-profiler are the beginning and end of a profiling run. No data is collected before prof:start-profiler is called and none can be collected after prof:stop-profiler is called. Temporary suspension of data collection is done with the functions prof:start-sampling and prof:stop-sampling. On Windows using the Integrated Development Environment, commands on the Run menu can be used to start and stop profiling and sampling.
The macro prof:with-profiling can be used to profile the evaluation of specific forms. The macro prof:with-sampling similarly can be used to sample only as specific forms are evaluated.
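For example, here is a minimal sketch of both approaches, assuming foo is a function you have defined and that start-profiler accepts the same :type keyword as with-profiling:

;; Profile only the evaluation of (foo 2000), sampling time:
(prof:with-profiling (:type :time)
  (foo 2000))

;; Or run a whole profile explicitly, sampling only around selected forms:
(prof:start-profiler :type :time :start-sampling-p nil)
(prof:with-sampling (foo 2000))   ; data is collected only here
(prof:stop-profiler)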
Please note the following about time and space profiling.
The space and time profilers write their raw data to temporary files named by the variables prof::*sizefile-template* and prof::*samplefile-template*, whose values are either "/usr/tmp/sizes" and "/usr/tmp/sample" or "/tmp/sizes" and "/tmp/sample". (The actual data files have filenames longer than sizes and sample.) You may change the values of these variables if the wrong or an inappropriate temporary directory is specified. The values must be strings like "<my-tmp-dir>/sizes" and "<my-tmp-dir>/sample" where <my-tmp-dir> is the absolute path of the directory you want to use.

Profiling error: could not read text section
In 5.0, the reading of the C symbol table into lisp form has been enhanced to not only include symbols from the executable itself (which has itself shrunk to a very small size) but also any shared-libraries that the lisp uses, including the heart of the lisp itself. The function ff:list-all-foreign-libraries lists those libraries whose symbols Allegro attempts to read in.
Sometimes, the shared-libraries do not have the necessary headers and sections that are needed for proper building of a C symbol table in lisp. In these cases, a profiler-error is given, for example:
USER(2): (prof:show-flat-profile)
profiler error: couldn't find .text section
profiler error: couldn't find .text section
In this case (on a Sparc Solaris 2.6) two libraries did not have .text sections, so no symbols are recorded for those two libraries.
The failure to find symbols is not fatal; the debugging tools that use the libraries are simply not able to give as complete information as if the symbols had been available. The same soft failure occurs when a symbol in a .so has been declared static (or, in the case of Windows 95/NT, when a C symbol is not explicitly DllExport-ed). The exact failure mode depends on the arrangement of the functions in memory; a function at one address whose symbol was not properly recorded will appear to be part of the next-lower-addressed function whose name was recorded.
The status of the profiler can be determined at any time using the function profiler-status. Called with no arguments, it returns the status of the current profile. The status can be one of the following:
:inactive -- the profile has not yet started collecting samples
:suspended -- the profile has started sampling, but sampling is currently not being done
:sampling -- the profile has been started and sampling is occurring
:saved -- the profile has been stopped, but as yet no analysis of saved profile data has occurred
:analyzed -- all data of a finished profile has been analyzed
A finalized profile (profiler-status returns :analyzed) can be saved using the macro save-current-profile. It takes a variable name (i.e. a symbol) as an argument and the profile data is made the value of that variable.

Saving a profile with save-current-profile allows you to analyze profile data at your leisure, running other profilers in the meantime. The keyword argument :current-profile to show-call-graph and show-flat-profile can take a saved profile instead of the default (which is the currently running or last run profile) and display results using that profile.
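For example, a minimal sketch, where my-first-profile is a variable name chosen for this illustration:

;; Save the analyzed profile under a variable name ...
(prof:save-current-profile my-first-profile)
;; ... run and analyze other profiles ...
;; ... and report from the saved one later:
(prof:show-flat-profile :current-profile my-first-profile)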
Fairly fine control is provided over the collection of profile data while the profiler is running. Sampling can be started and stopped with the functions start-sampling and stop-sampling. The macro with-sampling can be used to sample just while specified forms are evaluated. Note that a profile started with the function start-profiler may have sampling started immediately or not (depending on the value of the keyword :start-sampling-p).
The variables prof:*maxsamples* and prof:*granularity* also control the behavior of the profiler. The first puts a limit on the number of samples collected. Its initial value is 100,000. The second specifies the time interval between time profile samples. The units are clock ticks, an internal value usually about 1/100 of a second (during which time thousands of machine instructions are executed). The space profiler, which does not sample on a time basis, ignores the value of *granularity*.
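For instance, here is a minimal sketch of adjusting both variables before a run; the particular values are arbitrary and foo stands in for your own code:

(setq prof:*maxsamples* 20000)  ; stop after at most 20,000 samples
(setq prof:*granularity* 5)     ; take a time sample every 5 clock ticks
(prof:with-profiling (:type :time)
  (foo 2000))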
When the profiler is running, everything done in Lisp is profiled, including the profiler itself. The profiler overhead (for either space or time) is small. The profiler can be turned on and off (that is, suspended and resumed) with the functions prof:start-sampling and prof:stop-sampling.
After the profiler completes its run and stops, the data it has collected is stored within Lisp. If you wish to analyze the data, you must do so before another profile is run (you can, however, begin an analysis and save the profile for later detailed analysis). If after data is collected, a new profile is started, the data collected by the previous profile will be lost.
As an example of how to use these functions, we show how to profile a single function foo, embedded in a larger piece of code, that never calls itself. Suppose we want to profile foo alone; the steps are as follows:
1. Advise the function foo with :around advice with the following form. (Advice code is described in the advice document.)
(let ((excl:*compile-advice* t))
  (excl:advise foo :around timer 0
    (prog2 (prof:start-sampling)
        :do-it
      (prof:stop-sampling))))
2. Start the profiler with start-profiler, giving nil as the value of the keyword argument :start-sampling-p. This prepares the profiler without actually beginning sampling.
3. Run the code containing calls to foo.
4. Stop profiling when done with stop-profiler. (A sketch of steps 2 through 4 in code follows.)
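Here is a minimal sketch of steps 2 through 4, assuming the advice from step 1 is already in place and that run-my-code stands in for the larger piece of code that calls foo (as in the earlier sketch, the :type keyword to start-profiler is an assumption):

(prof:start-profiler :type :time :start-sampling-p nil) ; step 2: ready, not yet sampling
(run-my-code)                                           ; step 3: foo's advice turns sampling on and off
(prof:stop-profiler)                                    ; step 4: end the profiling run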
It should be clear why foo cannot call itself: the advice code placed around foo will cause sampling to stop when the embedded call to foo returns even though the original call is still being evaluated. If a function does call itself (or calls another function which calls it back), use the following code:
(defparameter *prof-depth* 0)

(let ((excl:*compile-advice* t))
  (excl:advise foo :around timer 0
    (prog2 (progn (incf *prof-depth*)
                  (if (= 1 *prof-depth*) (prof:start-sampling)))
        :do-it
      (decf *prof-depth*)
      (if (zerop *prof-depth*) (prof:stop-sampling)))))
Profiling more than one function is more complex, because of embedded calls. Unless you are sure that the functions you are interested in never call each other, you must use global variables as flags or semaphores as above to be sure that profiling is done exactly when you want.
This section describes the interaction between the profiler and the multiprocessing capabilities of Allegro Common Lisp. The multiprocessing facility is described in the multiprocessing document. The profiler has the ability to profile all processes or just a selected set.
The profiling of processes can be controlled on a process by process basis with the function mp:profile-process-p, which returns the value of the profile-process-p flag of its process-object argument. If that flag is nil, profile samples are not collected when the process is running. setf can be used to change the value of that flag on a process.
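For example, a minimal sketch, assuming *worker* is bound to a process object (say, one returned by mp:process-run-function):

(setf (mp:profile-process-p *worker*) nil)   ; exclude this process from samples
(setf (mp:profile-process-p *worker*) t)     ; include it again later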
The operators start-profiler and with-profiling accept the keyword argument :except-processes (on platforms that use OS threads, Windows only in 5.0) or :except-stackgroups (non-OS-threads platforms), which allows another form of control over which processes are profiled. By giving a list of processes as the value of that argument, you can prohibit the profiler from collecting samples for the given processes. The Process Browser in Allegro Composer (on UNIX) provides a way to set this flag with the mouse.
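For example, a minimal sketch on an OS-threads platform, where *housekeeping-process* and run-the-test stand in for your own process object and test code:

(prof:with-profiling (:type :time
                      :except-processes (list *housekeeping-process*))
  (run-the-test))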
The profiler has been enhanced to separate closure calls. prof:start-profiler and prof:with-profiling have a new keyword argument called :interpret-closures. When :interpret-closures is nil (the default), closures are not saved and analyzed separately, so the behavior is the same as in earlier releases. When it is non-nil, closures are saved and analyzed as separate profile entries, even though they may use the same template function.
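For example, a minimal sketch, where run-the-test stands in for code that funcalls several closures sharing one template function:

(prof:with-profiling (:type :time :interpret-closures t)
  (run-the-test))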
A function spec called prof:closure has been added (see implementation.htm for information on function specs). This is valid whenever an analyzed profile exists and identifies the closures which the profiler has seen. The function spec is (prof:closure n), where n is a small integer enumerating the closures in the profile. This allows the user to grab the closure object and to do a prof:disassemble-profile on it without having to know what symbol holds its fdefinition (sometimes this is very hard).

When flat profiles and call-graphs are reported, closures are identified by (prof:closure n), followed by the printed representation of the closure object itself.
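For example, a minimal sketch, assuming closure number 3 appears in the current analyzed profile and that disassemble-profile accepts the function spec directly:

(fdefinition '(prof:closure 3))              ; the closure object itself
(prof:disassemble-profile '(prof:closure 3)) ; disassemble it with profile annotations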
Space and time profiler results can be displayed graphically in windows if you are running Allegro Composer but can only be printed to the terminal if you are not. Graphical display on UNIX is described in chapter 6 of the Allegro Composer User Guide. Graphical display on Windows is described on the WinHelp page for the Profiler dialog. The system will display space or time data automatically according to what kind of data was collected.
Note that you should not run additional profiles once you have started to display profile results on the terminal. Only do another profiler run when your analysis is complete. Note too that you can save the results of a profile with save-current-profile.
You can look at the data in two ways: a flat profile and a call-graph. A flat profile shows the time or space information sorted by function from largest consumer of time or space to least consumer.
The call-graph format shows information about the callers and callees of each function grouped with the information about the function.
The two functions prof:show-call-counts and prof:list-call-counts report call-count results. The first prints the results and the second returns them in a list. (See the definitions of the functions for information on specifying a profile other than the current profile and controlling the number of functions about which information is printed.)
The function prof:show-flat-profile prints a flat profile to *standard-output* (or another stream if you desire). There is an example of a flat-profile output below.
The function prof:show-call-graph allows you to examine which functions have been called during the profile run. Here is how it works.
For each function, F, a call-graph shows the functions that called it (parents of F) and the functions that it called (children of F). Roughly speaking, a call-graph is the flat-profile display expanded to show parents and children. The display for function F is surrounded by dashed lines. F's display shows first information about the parents of F, then information about F itself, then information about F's children. (See the example later in this section.)
The following algorithm is used to find the parents of a function F. Each profiler sample is examined and if F appears in the control stack, the function calling F is recorded as the parent. If F appears more than once in the control stack, all the different parents (callers) are recorded, but if the same caller calls F several times in the stack, only the oldest instance is recorded. The time spent in (or the space used by) F and its descendents due to calls by a particular parent is proportional to the ratio of the number of samples where this particular parent of F was found to the total number of samples.
The algorithm for finding children is as follows. Each profiler sample is examined and if F is in the control stack, the function it calls is recorded as its child. If F appears more than once in the stack, each different child is recorded but if the same child appears more than once, only the oldest instance is recorded. The time spent in (or the space used by) a particular child and its descendents due to calls by F is proportional to the ratio of the number of samples where this particular child of the function F was found to the number of samples.
If there is a recursive entry in a call-graph, then the local % entry for that function will have an r after it. If there is a recursive entry, then time attributed to this entry may be counted twice because of recursion. For example, if bar calls baz calls foo calls baz, then foo and bar are parents of baz, and there will be two entries for parents of baz, corresponding to only 1 clock tick (sample). So the report shows two lines (one for foo and one for bar), but the time is double counted. Similarly for a child case: bar calls foo calls bar calls baz. Then foo and baz are both children of bar (reported on 2 separate lines in the profiler report) corresponding to only 1 sample.
The output of a flat profile and particularly of a call-graph may be voluminous. In order to reduce the amount of data to a manageable amount, the default display of either a flat profile or a call-graph suppresses some internal functions and packages. When a package is hidden, functions named by symbols in the package are not displayed. The functions and packages suppressed are in the lists prof:*hidden-functions* and prof:*hidden-packages*, respectively. You can suppress display of extra functions by adding them to the list that is the value of *hidden-functions*. You can suppress the display of extra packages (that is, of functions named by symbols in the packages) by adding them to the list that is the value of *hidden-packages*.
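For example, a minimal sketch; my-helper-function is a hypothetical function of your own, and the entries pushed should take the same form (names versus objects) as those already on the lists:

(push :excl prof:*hidden-packages*)                 ; hide functions named in the excl package
(push 'my-helper-function prof:*hidden-functions*)  ; hide one additional function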
When functions are suppressed from the display, the profiler attributes any effects these functions might have had to the function that called them. So, assuming the calling function is itself not suppressed from the display, the calling function in the profiler display can be thought of as a kind of composite, made up of itself and the suppressed function(s) it called. Functions which the profiler records as calling hidden functions are indicated in the profiler display by prepending "..." to the function name.
In a call graph all directly called undisplayed functions are attributed to the caller. Such undisplayed functions may eventually directly or indirectly call displayed functions, which will be shown in the call graph as children of the original caller. I.e., if foo calls bar calls baz, and bar is suppressed, then foo will be displayed as "... foo" and baz will be shown as a child of foo. foo gets dots because it can be thought of as a composite of foo and bar (as we explained above).
Sometimes a called function and all its descendants may be suppressed (e.g., none of them are displayable), in which case the profiler display just uses "..." to indicate that there are completely suppressed cases.
You can further control the amount of analysis you see with the variables *significance-threshold* and *fractional-significance-threshold*.
The variable prof:*hidden-space-functions* lists functions (mostly kernel functions named with strings rather than symbols) whose suppression makes the space profile output more understandable. These functions mostly do the actual allocation of space requested (directly or implicitly) by other functions that are presumably of more interest.
Note several things about the display control variables we have just discussed.
You can always reset the values of the five display control variables, prof:*hidden-packages*, prof:*hidden-functions*, prof:*hidden-space-functions*, prof:*significance-threshold*, and prof:*fractional-significance-threshold*, with the function prof:reset-profiler-display-defaults.
The function prof:disassemble-profile will print a disassembly of a specific function indicating the estimated time spent for each instruction. Calls to other functions are identified as jumps to subroutines (using hardware-specific terminology). We remind you that the time profiler data is statistical, so the times attributed to particular instructions are estimates. The less time spent in a function overall, the worse the estimate.
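For example, after the time profile in the first example below, one might evaluate the following minimal sketch (the argument convention, a function name, is assumed here):

(prof:disassemble-profile 'fact)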
The first example uses the time profiler. We will profile the following functions, which we assume have already been defined and compiled within Lisp.
(defun fact (x) (if (= x 1) 1 (* x (fact (- x 1)))))

(defun bar (x) (dotimes (i 2000) (make-array x)))

(defun foo (x) (fact x) (bar x))
First we generate the profile by evaluating the following form:
(prof:with-profiling (:type :time) (foo 2000))
We display the flat profile by evaluating the following form:

(prof:show-flat-profile)
That causes the following to be printed (we have used a wide display format to prevent line wrapping):
Sample represents .9 seconds of processor time (out of a total of .9)
Times below 1.0% will be suppressed.

  %     %    self   total           self     total
 Time  Cum.  secs    secs   calls  ms/call  ms/call  Function name
 46.2  46.2   0.4     0.4                            "bignum_multiply"
 22.6  68.8   0.2     0.2                            "make_svector_common"
 22.6  91.4   0.2     0.2                            "sigprocmask"
  7.5  98.9   0.1     0.1                            "write"
The first question you might have is where are fact, foo, and bar (the functions we defined at the beginning of the example)? A flat profile displays entries ordered by the (estimated) time a function is at the top of the stack, not on the stack. The three functions we defined call the low-level functions listed to do their work and so themselves spend little time at the top of the stack, so little that it falls below the default value of *significance-threshold* (0.01, equivalent to 1%). If you printed this display with the value of *significance-threshold* 0.0, you would see foo, bar, and fact (and quite a few more functions) listed.
We can see from the output that most of the time is spent in the Lisp runtime-system functions bignum_multiply and make_svector_common. (We know they are runtime-system functions because they are named by strings.)
Now let us look at the call graph. We display it by evaluating the form:

(prof:show-call-graph)
Here is the call graph that is printed:
Time profile of sampled pc values by function, children, and parents.
Total times below 1.0% will be suppressed.
Parent and child times less 2.0% of the node will be suppressed.
Sample represents .9 seconds of processor time (out of a total of .9)

  %      %                           Parent
 self   total    total   local       Function
 Time    Time     secs     %           Child

  0.0   100.0      0.9    0.0    ... "start"
                   0.7   73.1          ... "_exit"
                   0.2   19.4          "sigprocmask"
                   0.1    7.5          "write"
-----------------------------------------------------
                   0.7  100.0r         ... "start"
                   0.0    4.4r         "make_svector_common"
  0.0    73.1      0.7    0.0    ... "_exit"
                   0.7  100.0r         ... EVAL
                   0.0    4.4r         "sigprocmask"
-----------------------------------------------------
                   0.7  100.0          ... "_exit"
  0.0    73.1      0.7    0.0    ... EVAL
                   0.7   98.5          ... FOO
-----------------------------------------------------
                   0.7  100.0          ... EVAL
  0.0    72.0      0.7    0.0    FOO
                   0.4   64.2          FACT
                   0.2   35.8          BAR
-----------------------------------------------------
                   0.4  100.0r         FACT
                   0.4  100.0r         FOO
  0.0    46.2      0.4    0.0    FACT
                   0.4  100.0r         "integer_multiply"
                   0.4  100.0r         FACT
-----------------------------------------------------
                   0.4  100.0          FACT
  0.0    46.2      0.4    0.0    "integer_multiply"
                   0.4  100.0          "bignum_multiply"
-----------------------------------------------------
                   0.4  100.0          "integer_multiply"
 46.2    46.2      0.4  100.0    "bignum_multiply"
-----------------------------------------------------
                   0.2  100.0          FOO
  0.0    25.8      0.2    0.0    BAR
                   0.2  100.0          "make_svector"
-----------------------------------------------------
                   0.2  100.0          BAR
  0.0    25.8      0.2    0.0    "make_svector"
                   0.2  100.0          "make_svector_common"
-----------------------------------------------------
                   0.2  100.0          "make_svector"
 22.6    25.8      0.2   87.5    "make_svector_common"
                   0.0   12.5          "_exit"
-----------------------------------------------------
                   0.2   85.7          "start"
                   0.0   14.3          "_exit"
 22.6    22.6      0.2  100.0    "sigprocmask"
-----------------------------------------------------
                   0.1  100.0          "start"
  7.5     7.5      0.1  100.0    "write"
-----------------------------------------------------
                   0.0  100.0          ... EVAL
  1.1     1.1      0.0  100.0    PROF:STOP-PROFILER
-----------------------------------------------------
Look at the entry for foo (fourth down). It tells us that foo spends much more time in fact (64.2%) than it does in bar (35.8%). Looking at the entry for fact, the "r" next to the local percent numbers indicates that this function has been called recursively.
To show call-counting, we call for a time profile again but specify :count 100 and :count-list (list #'fact #'bar #'make-array), so the call to with-profiling is
(prof:with-profiling (:type :time :count 100
                      :count-list (list #'fact #'bar #'make-array))
  (foo 2000))
Here is the flat profile produced by prof:show-flat-profile, followed by the printout of prof:show-call-counts.
Sample represents .9 seconds of processor time (out of a total of .9)
Times below 1.0% will be suppressed.

  %     %    self   total           self     total
 Time  Cum.  secs    secs   calls  ms/call  ms/call  Function name
 27.9  27.9   0.3     0.3                            "sigprocmask"
 25.0  52.9   0.2     0.6                            "make_svector_common"
 25.0  77.9   0.2     0.3                            "bignum_multiply"
 17.6  95.6   0.2     0.2                            "write"
  1.5  97.1   0.0     0.3                            "integer_multiply"
  1.5  98.5   0.0     0.0    3999      0        0    EXCL::*_2OP
  1.5 100.0   0.0     0.0                            EXCL::DEFAULT-GC-AFTER-HOOK

USER(18): (prof:show-call-counts)
3999 #<Function *_2OP>
2000 #<Function CALC-ARRAY-SIZE>
  36 #<Function CLASS-OF>
  19 #<Function SPEC-CLOSURE-OBJECT-P>
  18 #<Function GC-AFTER>
  18 #<Function ARGLIST>
  18 #<Function AUTOLOADP>
  18 #<Function LIST-POSITION-EQ>
  18 #<Function TYPEP-CLASS>
  18 #<Function ENCAPSULATED-BASIC-DEFINITION>
  18 #<Function RESIDENT-FUNCTION>
  18 #<Function MACRO-CLOSURE-OBJECT-P>
  18 #<Function SLOT-VALUE--SI>
  18 #<Function CHECK-WRAPPER-VALIDITY-1>
  18 #<Function INTERNAL-CLASS-OF>
  18 #<Function PROCESS-FINALIZATIONS>
  18 #<Function POP-FINALIZATION>
  18 #<Function SLOT-VALUE>
  18 #<Function LENGTH>
   3 #<Function FBOUNDP>
   3 #<Function %EVAL>
   2 #<Function GET_2OP>
   1 #<Function FOO>
   1 #<Function BAR>
   1 #<Function FACT>
   1 #<Function INSTANCE-WRITE-11>
   1 #<Function STOP-SAMPLING>
   1 #<Function STOP-PROFILER>
   1 #<Function SET-TRAMP>
   1 #<Function INCREMENT-CALL-COUNTS>
   1 #<STANDARD-GENERIC-FUNCTION PROFILER::PROFILE-STATUS>
   1 #<Function EVAL-AS-PROGN>
   0 #<Function MAKE-ARRAY>
NIL
Note that the call counts are by function object, not function name, that the call-counting overhead in this case is not very significant -- see the comparable result without call-counting above -- (but in other cases it might be), and that runtime-system functions (named by strings in the flat profile) are not counted.
We end with an example of a space profile. We use code which (1) uses a lot of space and (2) uses different types of space. Consider the following functions:
(defun foo (n)
  (let ((array (bar n)))
    (dotimes (i (* n 8000))
      (setf (aref array i) (cons nil nil)))))

(defun bar (n) (make-array (* n 8000)))
We assume these functions have been defined and compiled within Lisp. Note that bar creates an array (which will trigger calls to _other_from_new) and foo does a lot of consing (which will trigger calls to _page_from_new).
We call these functions in a with-profiling form:
(prof:with-profiling (:type :space) (foo 1))
Before looking at the results, we set *hidden-functions* to nil, so we can see what system functions are actually allocating the space.
USER(26): (setf prof:*hidden-functions* nil)
We print the results with show-flat-profile. At first, we see foo and bar but no internal functions (the suspension points `...' mean that functions called by foo and bar are using the space). Then we set *hidden-space-functions* to nil and we see the runtime-system space functions actually called.
Sample represents 88.0 Kbytes of space allocated (out of a total of 88.0)
Mem.s below 1.0% will be suppressed.

  %     %    self   total           self     total
 Mem.  Cum.  Kbyte  Kbyte   calls  by/call  by/call  Function name
 63.6  63.6    56     88                             ... FOO
 36.4 100.0    32     32                             ... BAR

USER(18): (setq *hidden-space-functions* nil)
NIL
USER(19): (show-flat-profile)
Sample represents 88.0 Kbytes of space allocated (out of a total of 88.0)
Mem.s below 1.0% will be suppressed.

  %     %    self   total           self     total
 Mem.  Cum.  Kbyte  Kbyte   calls  by/call  by/call  Function name
 63.6  63.6    64     64                             "_page_from_new"
 36.4 100.0    32     32                             "new_other"
The first thing to notice is that _page_from_new and _other_from_new use all the space. This is not surprising since they actually allocate the space. Note that _page_from_new uses 64K (8,000 conses at eight bytes per cons) and _other_from_new uses 32K (8,000 entries in an array using four bytes each). No other functions appear since all the space is allocated by these functions.
Now we look at a call graph by calling show-call-graph.
Parent and child mem.s less 2.0% of the node will be suppressed.
Sample represents 96.0 Kbytes of space allocated (out of a total of 96.0)

  %      %                           Parent
 self   total    total   local       Function
 Mem.    Mem.    Kbyte     %           Child

  0.0   100.0       96    0.0    ... "start"
                    96  100.0          ... EVAL
-----------------------------------------------------
                    96  100.0          ... "start"
  0.0   100.0       96    0.0    ... EVAL
                    96  100.0          ... FOO
-----------------------------------------------------
                    96  100.0          ... EVAL
  0.0   100.0       96    0.0    FOO
                    64   66.7          "cons"
                    32   33.3          BAR
-----------------------------------------------------
                    64  100.0          FOO
  0.0    66.7       64    0.0    "cons"
                    64  100.0          "_cons_from_new_page"
-----------------------------------------------------
                    64  100.0          "cons"
  0.0    66.7       64    0.0    "_cons_from_new_page"
                    64  100.0          "_conspage_from_new"
-----------------------------------------------------
                    64  100.0          "_cons_from_new_page"
  0.0    66.7       64    0.0    "_conspage_from_new"
                    64  100.0          "_page_from_new"
-----------------------------------------------------
                    64  100.0          "_conspage_from_new"
 66.7    66.7       64  100.0    "_page_from_new"
-----------------------------------------------------
                    32  100.0          FOO
  0.0    33.3       32    0.0    BAR
                    32  100.0          "make_svector"
-----------------------------------------------------
                    32  100.0          BAR
  0.0    33.3       32    0.0    "make_svector"
                    32  100.0          "make_svector_common"
-----------------------------------------------------
                    32  100.0          "make_svector"
  0.0    33.3       32    0.0    "make_svector_common"
                    32  100.0          "new_other"
-----------------------------------------------------
                    32  100.0          "make_svector_common"
 33.3    33.3       32  100.0    "new_other"
-----------------------------------------------------
Copyright (C) 1998-1999, Franz Inc., Berkeley, CA. All Rights Reserved.