Homework 5: Extending Postgres


CS186: Introduction to Database Systems
UC Berkeley

November 18, 2003
Due: Monday, December 8, 2003, at 8PM PST

Note: this will be a group project!

1 Introduction

As with most modern database systems, Postgres provides a number of ways to extend it, from within SQL as well as with external C code.  The purpose of this exercise is to familiarize you with C language extensions to Postgres, as well as to give you some practice with external sorting.  You will be writing some simple C functions that will be accessible from within Postgres, and also writing some more complicated aggregation functions that will be added to Postgres.

2 Background: C language extensions to Postgres

Postgres permits a programmer to define a C language function as accessible from SQL.  I gave a demonstration of this in class.  The Postgres online documentation contains the PostgreSQL Programmer's Guide.  Of relevance to this assignment is the section on Server Programming, especially chapter 12 (Extending SQL: Functions) and chapter 15 (Extending SQL: Aggregates).  Read and understand these sections.

3 Sample Code Provided With This Assignment

Makefile
For postgres to access a function written in C, the source code must be compiled into a shared library that postgres loads at runtime.  I have provided a simple makefile that will take any .c file and compile it into a shared library with the .so suffix.  For example, if you have a file called 'test.c', you can make a shared library 'test.so' by typing 'make test.so'.  The compiler arguments are correct for running on Pentagon or Rhombus.  If you want to run on some other architecture, you'll have to figure out the correct method for compiling on that architecture.

loadDB.sh
This script creates a database called 'hw5' and loads some sample data into a table called numtest.

schema.sql
This script is used by loadDB.sh to create and load the table of test data.

testData
This file contains data for the numtest table.

addOne.c
This file contains two sample functions that you can add to postgres.  These functions add one to either an integer or a floating-point number.

arrayFunc.c
This file contains a stub for Step 1, a function for working with array parameters.

simpleAggr.c
This file contains a stub for Step 2, which involves writing a simple aggregate operator.

medianAggr.c
This file contains a stub for Step 3, which involves writing a median aggregate operator.

medianExtAggr.c
This file contains a stub for Step 4, which involves writing a median aggregate operator that uses external sorting in files.

Step 0: Set up your environment, load the schema, try adding some functions

For this assignment you will be using the standard installed version of postgres.  You will not need to compile postgres itself.  The only things you will be compiling are shared libraries that postgres will load at runtime.  The steps to set up your environment are as follows:
gtar -xvzf ~cs186/fa03/Hw5/Hw5.tar.gz
cd Hw5
pg_ctl start
./loadDB.sh
gmake addOne.so
psql hw5
cs186-xx=# CREATE FUNCTION add_one(int4) RETURNS int4
cs186-xx-#  AS '/home/cc/cs186/fa03/class/cs186-xx/Hw5/addOne' LANGUAGE C
cs186-xx-#  WITH (isStrict);
cs186-xx=#
cs186-xx=# CREATE FUNCTION add_one(float8) RETURNS float8
cs186-xx-#  AS '/home/cc/cs186/fa03/class/cs186-xx/Hw5/addOne','add_one_float8'
cs186-xx-#  LANGUAGE C WITH (isStrict);
cs186-xx=#
cs186-xx=# select add_one(19);
 add_one
---------
      20
(1 row)
cs186-xx=#
cs186-xx=# select add_one(1.9);
 add_one
---------
     2.9
(1 row)

cs186-xx=#
If you make it through these steps, you are ready to try writing your own functions to extend postgres.
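
For reference, the functions in addOne.c follow the version-1 calling convention described in the Programmer's Guide.  A minimal sketch of the integer version (the actual file may differ in its details) looks like this:

#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(add_one);

Datum
add_one(PG_FUNCTION_ARGS)
{
    int32   arg = PG_GETARG_INT32(0);   /* fetch the int4 argument */

    PG_RETURN_INT32(arg + 1);
}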

Step 1: Write a Simple Function Using Arrays

Postgres supports the use of arrays as data types.  This data type will prove useful in steps 2, 3, and 4.  This step lets you see how parameters of type array work.

The file arrayFunc.c contains a stub of a function that accepts a single parameter that is an array of integers, and returns a single integer result.  To see how parameters of type array work:
gmake arrayFunc.so
psql hw5
cs186-ab=#   create function arrayFunc(int4[]) returns int4
cs186-ab-#     as '/home/cc/cs186/fa03/class/cs186-ab/Hw5/arrayFunc','arrayFunc'
cs186-ab-#    language C with (isStrict);
CREATE
cs186-ab=# select arrayFunc('{1,2,3}');
 arrayfunc
-----------
         0
(1 row)

cs186-ab=#
For this part of the assignment, you need to change the code in arrayFunc.c so that it looks through the array that was passed to it and returns the highest value.  The code shows how to find out from the array parameter the number of dimensions (postgres supports multi-dimensional arrays), the size of each dimension, and the actual data.  Given the definition of the function in postgres, you may assume the array has no more than one dimension.  You must make sure to handle the case where the array is empty, '{}', in which case the number of dimensions is zero.  Hint: the macro PG_RETURN_NULL() returns null, which is what you should return as the largest value in an empty array.
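
To make the pieces concrete, here is a minimal sketch of what such a function might look like, using the same macros the stub demonstrates (this is illustrative, not the required implementation):

#include "postgres.h"
#include "fmgr.h"
#include "utils/array.h"

PG_FUNCTION_INFO_V1(arrayFunc);

Datum
arrayFunc(PG_FUNCTION_ARGS)
{
    ArrayType  *arr = PG_GETARG_ARRAYTYPE_P(0);
    int         ndims = ARR_NDIM(arr);              /* 0 for '{}' */
    int        *dims = ARR_DIMS(arr);               /* size of each dimension */
    int4       *data = (int4 *) ARR_DATA_PTR(arr);  /* the element values */
    int4        maxval;
    int         i;

    if (ndims == 0)
        PG_RETURN_NULL();           /* empty array: no largest value */

    maxval = data[0];
    for (i = 1; i < dims[0]; i++)   /* at most one dimension, per the spec */
        if (data[i] > maxval)
            maxval = data[i];

    PG_RETURN_INT32(maxval);
}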

Step 2: Write a Simple Aggregation Function

Once you have read the postgres documentation on aggregate functions, you will know that an aggregate is defined in terms of two functions: a state-transition function, which is called once for each input value and carries the running state along, and a final function, which turns the final state into the result.

The file simpleAggr.c contains stubs of a transition and final function for an aggregation operator.  Your task is to turn these stubs into an aggregation operator that finds the 5 greatest values in the set of aggregated values.  As such, the transition function will need to pass along an array giving the 5 greatest values seen so far.  If fewer than 5 values have been seen so far, it should pass along a shorter array.  The values should be listed from greatest to smallest, and you should ignore duplicate values.  Thus, if the data set contains {1,2,3,3,4,5,6}, your result should be {6,5,4,3,2}.  To define the aggregate function, you will do the following:
gmake simpleAggr.so
psql hw5
cs186-ab=#  create function simple_aggr_transition(int4[], int4) returns int4[]
cs186-ab-#    as '/home/cc/cs186/fa03/class/cs186-xx/Hw5/simpleAggr',
cs186-ab-#    'simple_aggr_transition_i4' language C with (isStrict);
cs186-ab=#
cs186-ab=#  create function simple_aggr_final(int4[]) returns int4[]
cs186-ab-#    as '/home/cc/cs186/fa03/class/cs186-xx/Hw5/simpleAggr',
cs186-ab-#    'simple_aggr_final_i4' language C with (isStrict);
cs186-ab=#
cs186-ab=#  create aggregate simpleAggr(basetype = int4,
cs186-ab-#                              sfunc = simple_aggr_transition,
cs186-ab-#                              stype = int4[],
cs186-ab-#                              finalfunc = simple_aggr_final,
cs186-ab-#                              initcond = '{}');
cs186-ab=#
cs186-ab=# select max(num1), avg(num1), simpleAggr(num1) from numtest;

For your convenience, I have included a function called 'makeOneDInt4Array(int count, int *elements)', which takes a count and a regular C array of integers, and returns the complicated data structure that postgres requires for an array.

In this example, the final function is actually not necessary.  All it needs to do is return the array of the 5 largest values seen so far.  But I've left it in, since you'll need to understand the final function in the next step.
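
To make the shape of a transition function concrete, here is one possible sketch (makeOneDInt4Array() is the helper provided with the assignment; the body is illustrative, not the required implementation):

#include "postgres.h"
#include "fmgr.h"
#include "utils/array.h"

Datum
simple_aggr_transition_i4(PG_FUNCTION_ARGS)
{
    ArrayType  *state = PG_GETARG_ARRAYTYPE_P(0);
    int4        newval = PG_GETARG_INT32(1);
    int         n = (ARR_NDIM(state) == 0) ? 0 : ARR_DIMS(state)[0];
    int4       *old = (int4 *) ARR_DATA_PTR(state);
    int4        vals[6];        /* at most 5 old values plus the new one */
    int         i, count = 0;

    for (i = 0; i < n; i++)
    {
        if (old[i] == newval)
            PG_RETURN_ARRAYTYPE_P(state);   /* duplicate: state unchanged */
        vals[count++] = old[i];
    }
    vals[count++] = newval;

    /* insertion sort, descending */
    for (i = 1; i < count; i++)
    {
        int4    v = vals[i];
        int     j = i - 1;

        while (j >= 0 && vals[j] < v)
        {
            vals[j + 1] = vals[j];
            j--;
        }
        vals[j + 1] = v;
    }

    if (count > 5)
        count = 5;              /* keep only the 5 greatest */

    PG_RETURN_ARRAYTYPE_P(makeOneDInt4Array(count, vals));
}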

Step 3: Write an In-Memory Median Function

One aggregate operator that is not part of postgres is median.  The median value of a set of numbers is defined as the value for which there are an equal number of values above and below.  (If the number of values is even, the median is defined as the average of the two values on either side of the middle.)  With the median function, you must include duplicate values (unlike step 2).  The easiest way to compute the median is to sort all the values and find the middle one (or average the two surrounding the middle, if the number of values is even).  Your task in step 3 is to use the stubs in medianAggr.c to create an aggregate function that computes the median.  You are to do this by creating an in-memory array containing all the values, sorting it, and finding the middle value (or values).  The transition function's job is to take each new value and create a new array containing all previous values plus the new value.  The final function is passed the array of all values and must use it to compute the median.  Note that even though the aggregate applies to integers, the return value from the final function should be a float8, since the median might be the average of two numbers.
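
A sketch of the final function's logic, assuming the values have been accumulated into the state array (the function name here is illustrative; do not rename the stubs in medianAggr.c):

#include <stdlib.h>

#include "postgres.h"
#include "fmgr.h"
#include "utils/array.h"

/* qsort() comparator: ascending int4 order */
static int
cmp_int4(const void *a, const void *b)
{
    int4    x = *(const int4 *) a;
    int4    y = *(const int4 *) b;

    return (x > y) - (x < y);
}

Datum
median_final_sketch(PG_FUNCTION_ARGS)
{
    ArrayType  *state = PG_GETARG_ARRAYTYPE_P(0);
    int         n = (ARR_NDIM(state) == 0) ? 0 : ARR_DIMS(state)[0];
    int4       *vals = (int4 *) ARR_DATA_PTR(state);
    float8      median;

    if (n == 0)
        PG_RETURN_NULL();       /* no values: median undefined */

    qsort(vals, n, sizeof(int4), cmp_int4);

    if (n % 2 == 1)
        median = (float8) vals[n / 2];
    else
        median = ((float8) vals[n / 2 - 1] + (float8) vals[n / 2]) / 2.0;

    PG_RETURN_FLOAT8(median);
}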

Step 4: Write a Median Function with External Sorting

One problem with the approach in step 3 is that it keeps all the members of the array in memory at once.  For a very large data set this could cause thrashing and bring the computer to its knees.  A more general solution is to write the values out to temporary files whenever the array gets too large, then use merge sort to put the values in order and compute the median that way.  Your task in step 4 is to fill in medianExtAggr.c with code that computes the median using external sorting.

The way to add an external sort is to start accumulating values in an array in memory, as in step 3.  When the array gets too big, you will create a temp file using the Postgres temporary file functions described below and write out the contents of the in-memory array.  How can you know, when you get to the final function, whether you've written out values to a file?  When you create a temporary file, the BufFileCreateTemp() function returns a BufFile pointer.  This pointer, luckily, is the same size as an int4, so it is possible to store the BufFile pointer in the same array that the transition function uses for keeping integers.  I recommend using the first element of the array to store the BufFile pointer, initialized to 0 to indicate that you haven't written anything to a file yet.  You might also want to use the second element of the array to record how many values have been written to the file so far.  Note that the "CREATE AGGREGATE" command allows you to specify the initial value to be passed to the transition function the first time.  You could specify '{0,0}' as the initial value if you want to store a file pointer and value count in the first two slots of the array.

(Note: later versions of Postgres support complex data types, so with Postgres 7.5 we could keep a file pointer as a separate field of the data structure passed from one transition function to the next, but with Postgres 7.2 we're stuck with this hack.)

You must also read the notes below about Postgres memory contexts.  Any memory allocated in the default context in the transition function will only last until the subsequent call of the transition function.  As a result, you do not want to allocate a temporary file in the default context, because it will be gone two transition calls later.  The context that lasts until the end of the transaction is "TransactionCommandContext", so you should switch to this context before creating the temp file, then switch back.
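
In code, the switch around the file creation looks something like this (TransactionCommandContext and the BufFile calls are standard 7.2 interfaces; the variable names are illustrative):

MemoryContext   oldCxt;
BufFile        *file;

oldCxt = MemoryContextSwitchTo(TransactionCommandContext);
file = BufFileCreateTemp();         /* now survives until end of transaction */
MemoryContextSwitchTo(oldCxt);      /* restore the previous context */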

How many values should you accumulate before creating a temporary file?  For now, use the value 1024, but make it a #define in your code so you can change it to a different value.  Similarly, when you are sorting runs of integers in the first pass, sort 1024 at a time.  When you are merging temp files together, postgres takes care of buffering the file I/O, so just read one int4 at a time from each temp file, and postgres will be smart about reads from disk.  Given this buffering, to make sure that the merges don't require too much memory, don't merge more than 64 files at once.  (Again, the value 64 should be set with a #define so you can change it if needed.)
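
As a sketch of the merge phase, the loop below repeatedly emits the smallest value among the open runs, reading one int4 at a time (merge_runs, MAX_MERGE_FILES, and the runs[] bookkeeping are illustrative names, not part of the stub):

#define MAX_MERGE_FILES 64      /* maximum runs merged in one pass */

/* Merge up to MAX_MERGE_FILES sorted runs of int4s into one output run. */
static void
merge_runs(BufFile **runs, int nruns, BufFile *out)
{
    int4    head[MAX_MERGE_FILES];  /* current head value of each run */
    bool    live[MAX_MERGE_FILES];  /* whether each run still has data */
    int     i, best;

    for (i = 0; i < nruns; i++)
        live[i] = (BufFileRead(runs[i], &head[i], sizeof(int4)) == sizeof(int4));

    for (;;)
    {
        best = -1;
        for (i = 0; i < nruns; i++)
            if (live[i] && (best < 0 || head[i] < head[best]))
                best = i;
        if (best < 0)
            break;                  /* all runs exhausted */

        BufFileWrite(out, &head[best], sizeof(int4));
        live[best] = (BufFileRead(runs[best], &head[best],
                                  sizeof(int4)) == sizeof(int4));
    }
}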

Step 5: Performance Testing

You will want to test your code on large numbers of values.  I have created a number of sample data files containing a single column of random integers, though with some skew added to ensure that the median and mean are not the same.   These files are in ~cs186/fa03/Hw5/sampleData-* where "*" is the number of values in the file.  I recommend working with about 50,000 values, though you probably have enough disk space for up to 500,000 values.  You will want to do the following:
initdb -D /home/tmp2/cs186-xx
pg_ctl start -D /home/tmp2/cs186-xx
createdb hw5
psql hw5
hw5=# create table numtest(num1 integer);
CREATE
hw5=# copy numtest from '/home/ff/cs186/fa03/Hw5/sampleData-50001' using DELIMITERS '|';
COPY
hw5=#
pg_ctl stop -D /home/tmp2/cs186-xx
~cs186/pgsql-7.2.2/src/backend/postgres -s -D /home/tmp2/cs186-xx hw5
backend> select median(num1) from numtest;
blank
         1: median      (typeid = 701, len = 8, typmod = -1, byval = f)
        ----
Got #values: 10001
         1: median = "548380742"        (typeid = 701, len = 8, typmod = -1, byval = f)
        ----
DEBUG:  QUERY STATISTICS
! system usage stats:
!       1.629695 elapsed 1.000000 user 0.030000 system sec
!       [1.000000 user 0.400000 sys total]
!       29/0 [65/66] filesystem blocks in/out
!       29/0 [65/0] page faults/reclaims, 0 [0] swaps
!       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!       75/142 [295/158] voluntary/involuntary context switches
! postgres usage stats:
!       Shared blocks:         75 read,          0 written, buffer hit rate = 81.97%
!       Local  blocks:          0 read,          0 written, buffer hit rate = 0.00%
!       Direct blocks:          0 read,          0 written

I have found that with 50,000 values, the external-sort version of median takes about 2-3 seconds of user and system time, whereas the in-memory version takes about 60-70 seconds.  Your mileage will vary, of course.  Also note how the external version has more voluntary context switches (because the code is waiting on I/O requests), but fewer involuntary ones.  Yet with only 2001 values, the in-memory version is about four times faster than the external-sorting one.  This is not at all surprising: with a small number of values, it's better not to incur the overhead of writing to files, but with many values the swapping caused by allocating lots of memory slows things down a lot.  It is possible to find a compromise between in-memory and external sorting by changing the threshold you #defined for the maximum number of in-memory values (1024 above): the larger that value, the longer the code waits before writing values out to a file.

Your mission for step 5 is to measure how performance varies as you change that threshold: for each value of the #defined maximum (1024, 4096, and so on), time the median query and record a line of results in RESULTS.TXT, like this:
1024   3.685   0.940   0.969
4096   5.533   4.334   0.999
...etc


Notes on Postgres Temporary Files

To write the integers to disk, you will need to be able to open, read, write, and close files.  Rather than using the standard C file I/O library, we require you to use functions within PostgreSQL: specifically, the BufFile interfaces declared in postgresql-7.2.2/src/include/storage/buffile.h.  These functions are attractive because you don't have to deal with file names and you get buffered I/O for free.

The interface is almost identical to the standard C library functions (i.e., tmpfile(), fclose(), fseek()).  Instead of FILE * you will work with BufFile * pointers, but everything else remains essentially the same.  Note: you have to be careful about the "memory context" in which BufFile structures are allocated; see the notes on memory contexts below.  You should only need to use a handful of BufFile functions.
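The relevant declarations, reproduced here from buffile.h in PostgreSQL 7.2.2 (consult the header itself for the authoritative versions):

extern BufFile *BufFileCreateTemp(void);
extern void     BufFileClose(BufFile *file);
extern size_t   BufFileRead(BufFile *file, void *ptr, size_t size);
extern size_t   BufFileWrite(BufFile *file, void *ptr, size_t size);
extern int      BufFileSeek(BufFile *file, int fileno, long offset, int whence);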

Managing memory with contexts

PostgreSQL uses its own memory manager, which performs functions similar to the malloc and free you are familiar with.  However, instead of allocating memory from a single global heap, the PostgreSQL allocators work off an abstraction called memory contexts, for convenience and performance.  Essentially, each allocated chunk of memory belongs to a particular context, and all the chunks in a context can be deallocated in one shot.  Imagine that at some point you allocate a relatively complex data structure (e.g., a linked list, a tree, or a hashtable!).  Normally, in order to free the memory it uses, you would have to traverse the structure and free each node individually.  This traversal is both tedious to write and expensive to perform.  Instead, PostgreSQL allows you to do the following:

treeCxt = AllocSetContextCreate(...);   /* create a new context, assign it to treeCxt */
while (...)
{
    /* allocate sizeof(TreeNode) bytes from treeCxt */
    newNode = MemoryContextAlloc(treeCxt, sizeof(TreeNode));
    ...
}
MemoryContextDelete(treeCxt);           /* free the entire treeCxt context */


For convenience, PostgreSQL provides the functions palloc and pfree, which operate on a default memory context (called the current context and pointed to by the CurrentMemoryContext global variable).  Thus

ptr = MemoryContextAlloc(cxt, n); /* Alloc n bytes from the cxt memory context */

is exactly equivalent to

oldCxt = MemoryContextSwitchTo(cxt);   /* save the current context in oldCxt, switch to cxt */
ptr = palloc(n);                       /* allocate n bytes from the current (cxt) context */
MemoryContextSwitchTo(oldCxt);         /* restore the current context to oldCxt */

If you are performing several allocations, this can save you some typing.  You can change the current context using MemoryContextSwitchTo(cxt).  It is your code's responsibility to properly manage (i.e., set and restore) the current memory context.  Be careful to avoid insidious bugs in which you forget to reset the memory context to what it was and thereby confuse another piece of the code!

Finally, memory contexts are organized in a hierarchy.  When creating a new context with AllocSetContextCreate(), the first argument is the parent context.  Deleting a context also deletes all of its child contexts.  The file postgresql-7.2.2/src/backend/utils/mmgr/README contains a detailed description of the PostgreSQL memory manager, should you need it, although the information above should suffice.
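
For instance, a child of the current context can be created like this (the name string is just a debugging label; the ALLOCSET_DEFAULT_* constants come from utils/memutils.h, and "median scratch space" is an arbitrary example):

MemoryContext   myCxt;

myCxt = AllocSetContextCreate(CurrentMemoryContext,
                              "median scratch space",
                              ALLOCSET_DEFAULT_MINSIZE,
                              ALLOCSET_DEFAULT_INITSIZE,
                              ALLOCSET_DEFAULT_MAXSIZE);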


Notes on debugging postgres

One problem with extending postgres with your own code is that oftentimes mistakes in your code simply cause the postgres backend to crash.  The way to debug this is to run postgres in backend mode:
pg_ctl stop
gdb ~cs186/pgsql-7.2.2/src/backend/postgres
run -s hw5
Also, you can do printf-style debugging: my experience is that 'fprintf(stderr, "Hello World");' works when running either in backend mode or in regular mode.

What to turn in

You should submit your versions of arrayFunc.c, simpleAggr.c, medianAggr.c, and medianExtAggr.c.  You should *not* have changed the names of any functions in those files; otherwise the autograder may not work.  You should also submit RESULTS.TXT and a README file containing the names of the people in your group.