CS61C Project 1-- Fall 2006

61C-FS - An in-memory file system

Due: 11:59pm on Saturday, September 23rd
Project TA
: Aaron Staley

THE SIMPLER (NO SOFTLINKS) PROJECT IS HERE!
VERY IMPORTANT: BE SURE TO PUT IN YOUR README WHETHER YOUR PROJECT SUPPORTS SOFTLINKS!

This is an individual assignment; your code must be your own.

FINAL UPDATE: It seems that the fix for the memory leak in mv was incorrect. Please see the blue update for the correct fix (basically place the return; statement back in and add two free commands). Because this is a bug that was in the project spec, error checking will no longer be tested on mv.

FINAL FINAL UPDATE: It seems that the mv spec for moving to directories is bad (can you see why? You can overwrite files!). Consequently we are not testing the mv with a path#5 path argument.

Read the entire project description before beginning this assignment.  While you may be able to understand the expected behavior of the project from the examples, you must refer to this document to ensure that you handle all commands correctly (especially corner cases).  Much of this project involves conforming to specifications.

Update September 14 at 10:45pm:  Comments in the base directory.c  (relating to the locateItem() and resolvePath() ) have been updated to correct typos and to improve clarity about expected behavior.  Furthermore, the path and softlink sections of this document have been updated to further clarify how paths work (and provide more examples).  Requirements involving errors have been explained further, and requirements about modifications to directory.c are slightly looser.  Finally, as per a student's recommendation, I added a "to-do" summary. No nontrivial changes in the project behavioral requirements have been made, however.  All added text for this update is written in RED.

Update: September 15 at 12:30am: I neglected to mention the default WRITABLE and READABLE attributes of a file, softlink, and directory after being created.  They are now present in the description of the relevant commands.  I further clarified other command behavior.  All changes are in BROWN.

Update: September 15 at 8:15pm: The code now actually compiles on Solaris (I had made the fatal mistake of only testing on linux).  To do this, I had to modify directory.c and dirmain.c slightly.  The list of changes (in case you already started) are listed on my newsgroup post.  I have further clarified additional parts of the spec (in ORANGE).  I also made a slight error in the path description that suggested that "/" is not a valid path (it is).

Update: September 16 at 7:45pm: I had made an error with the PATH examples; these are corrected now. The directory where the project was located has also been corrected. More importantly,
YOU NOW HAVE A REFERENCE SOLUTION FOR THE PROJECT.  Run
'~cs61c/lib/proj1/refproj1'.  You should rely mostly on this spec though for project information - and not this reference oracle.
Also I have added a new softlink example to the examples.  Be careful to understand how softlink directories should work; their behavior is somewhat nonintuitive.
All changes to this spec are in GREEN.  

Update: September 20 at 3:15am:  The reference solution had a small error where the rmdir command wasn't recognized. This has now been fixed.  A comment suggesting an additional behavior for ls in directory.c was also removed.
I have added a few more hints about the project to this spec - and further clarified details about syntax checking, alphabetical ordering, effects of directory removal, and paths.
No behavioral changes have been made.
All changes to this spec are in PURPLE.

Update September 21 at 12:45am:

Aaron will have extra office hours tomorrow in the lab from 5-6:30pm.  Come there for last minute project questions.

Whoops! A student caught two bugs in the stub code.  I have updated the stub directory.c

To fix them manually, do the following:
In the moveEntry function of directory.c there is:    
else if (*
to == '\0' || strcmp(to,".") == 0 || strcmp(to,"..") == 0){

Change this to:

else if (strcmp(to,".") == 0 || strcmp(to,"..") == 0){


Furthermore, there is a memory leak in that same function:
                if (newParent != NULL){
                     printf ("%s: %s: %s\n", mv, oldTo, file_exists);
                    return;
                }
Add the following between the printf and the return statement:
free(oldFrom);
free(oldTo);

Since these are bugs in the stub code, we will avoid tests that would trigger either bug.


Also, there was a slight error in dirmain.c that would cause a segfault if you hit <ENTER> on program start-up.  dirmain.c has been updated.

Continuing on:
As mentioned in class today, the maximum score on this project is now a 25/20 (25% extra credit possible).  A variation, with modification described HERE, has been written - with the notable feature being the lack of softlinks (this makes the working directory processing much simpler).  If  you do not implement softlink support, you will receive 20/20 (note that nearly all softlink tests will involve softlinked directories with paths including '..' - meaning it is pointless to have only limited softlink support (for instance only supporting softlinks to text files)).

I highly recommend you take on this challenge.  It will teach you much about data structures and pointers.  And in terms of grades, 5% extra credit on a project is 1.25% extra credit in the course.  That is equal to about 4 labs - and equal to the maximum possible lab extra credit.

I have added yet another explanation of the path system, this time in terms of a stack.  I also clarified moving to softlinked directories.  Note that
this only applies to the extra credit portion.
More importantly, some details about commands have been clarified.
All changes to this spec are in TEAL (I'm close to running out of colors!).
I also fixed a very minor error about path commands, which probably no one noticed.

Update September 21 at 9:45pm
Some things I wanted to clarify:

Due to slight ambiguity in the spec, we will not be testing the behavior of any command (other than cd) where the path ends in ".." or ".".

Furthermore, we are not testing the behavior of commands other than cd and ln where the pathname ends with a '/' (path#5).  And for ln, we will only test pathnames ending with a '/' for the target (you do not need to worry about the linkname)
.

Also (for softlink support only), I wanted to clarify that ln /a/b/ foo creates a softlink with a target to the directory a/b/. Your project must be able to support this softlink.
Also, I fixed a small error in the diagram for the simple project.  Also fixed some errors in some formal definitions and enhanced the stack explaination of paths.
Oh ya, since I'm out of colors, this should be the last update. =p

UPDATE AGAIN: YOU WILL NOW SUBMIT EITHER PROJECT WITH "SUBMIT PROJ1". BUT BE SURE TO NOTE IN YOUR README WHETHER YOU SUPPORT SOFTLINKS!  PLEASE RE-READ THE ABOVE UPDATE!

Goals

This project is intended to give you substantial practice with C pointers, linked structures, C strings, bit masks, and the use of malloc and free.

Background

Most of you have had experience working with a UNIX-like operating system. Such operating systems provide a tree-structured file system and operations for navigating around it. In this project, you will work with a command interpreter, implementing commands that create, delete, and traverse nodes in a tree that vaguely simulates a UNIX directory.1

As you complete the commands, be sure that you consult this project specification - and only this project specification - to determine what correct behavior is.  Most of the commands you will implement behave somewhat differently than UNIX shells do.

File System Representation

The nodes in the file-system tree represent text files, directory files, and softlinks. A directory file contains zero or more text files and zero or more directory files. A text file contains characters, and corresponds to a leaf in the tree. (An empty directory also corresponds to a leaf.) A soft link is much like a windows shortcut; it is also a leaf in a tree, but rather than containing content, it contains a path that describes where its "target" is.  The root of the tree represents the root directory in the file system. 
The nodes are implemented by the entryNode structure.  This structure is defined (in directory.c) as follows:

struct entryNode {
    char * name;
    struct entryNode * next;    
    uint16_t attributes;

    union {
      char * contents; /*for files*/
      struct entryNode * entryList; /*for directories*/
      char * shortcutReference;  /*for softlinks*/
    } entry;
};

The name of the corresponding file (that is, a pointer to the file name's first character) is stored in the name element.  File names can only consist of alphanumeric and the '.' characters.
A pointer to another file in the same directory is stored in the next element (of course this can be NULL as it is used to create a linked list of entryNodes).  
The attributes field specifies various attributes2 that the node possesses.  Actual attributes are indicated by individual bits being high (1) or low (0).  
The attributes are (bit 0 is least significant bit; bit 15 is most significant bit):
Bit Attribute Name Description
0 READABLE Only applies to text files.
If this bit is high, the file may be read from; if low, the file cannot be read.
1 WRITABLE Only applies to text files.
If this bit is high, the file may be written to or deleted; if low, the file cannot be written to or deleted.
14 SOFTLINK If this bit is high, this node is a softlink.
15 DIRECTORY If this bit is high, this node is a directory.
Note: If both bits 14 and 15 are low, the node is a text file.  It is illegal for both bits to be high.  Bits 2 through 13 are unused.
Finally, there is the entry field.  If the file is a text file, entry.contents refers to the contents of a file (implemented as a C string).  If the file is a directory, entry.entryList is a pointer to the parent node of a linked list of files within the directory.  If the file is a soft link, entry.shortcutReference will be a C string that specifies a path (see the Navigation description below) to the soft link's target.

A sample directory (without softlinks):

Its implementation (NOTE: The 0 and 1 indicates bit 15 of attributes being low or high, respectively):

Drawing of Project

Navigation

Just as in UNIX (as well as MSDOS), the cd (change directory) command is used to move through the directory structure.  The cd command accepts a single argument, a path.  If the path specifies a valid directory, the working directory (the current directory one is navigating) will change.

Paths

A path is defined as a C string which consists of a list of entryNode names delimited by the '/' character.  If the first character of the path is the '/' character, the path is an absolute path; otherwise it is a relative path.

Paths are nothing more than a way to refer to nodes (most of you should be familiar with shell paths).  For instance, in the above sample directory, "/d/f" will refer to file f and "/b" refers to directory b.  These two examples are both absolute paths - in that they specify a location "absolutely" (from the root directory); they refer to the same place regardless of where the current working directory is.  Relative paths, however, refer to a location "relative" to the working directory.  For instance, if the working directory is the root, "d/f" refers to file f and "/b" refers to directory b.  But if the working directory is "d", then "f" refers to file f and "../b" refers to b.

Valid paths thus can take the following forms:
1) node1   (relative path to single node)  EXAMPLE: "b"
2) /node1  (absolute path to a single node) EXAMPLE: "/b"
3) node1/node2/.../nodeI/.../nodeN (relative path to a nested node; N>=2) EXAMPLES: "d/f" and "d/e/i"
4) /node1/node2/.../nodeI/.../nodeN (absolute path to a nested node; N>=2) EXAMPLE: "/d/f" and "/d/e/i"
5) Any of the above with a "/" appended to the end.
6) / - This (/) is the path which references the root directory.  It is an absolute path.

Let node0 be the root directory in the case of an absolute path and the working directory in the case of a relative path.

NodeI (node1 in paths #1 and #2) is valid if it is:
1) A valid file name contained within the directory specified by Node(I-1). (This implies that Node(I-1) must be a directory.)  File names are case-sensitive.
2) It is the string ".".  The string "." refers to the same directory as Node(I-1) (That is "/./b/." refers to the same directory as "/b")
3) It is the string "..".  The string ".." refers to the same directory as Node(I-2).  NOTE: Node(-1) is the root directory for absolute paths and the parent of the working directory for relative paths (see below).

Bear in mind that the above "validity" definition is recursive.  To help clarify it, here is a sample of valid and invalid absolute paths given the above directory structure.

Valid:
/c                         Refers to file c
/d/f/../e                Refers to directory e
/.././..                    Refers to the root
/d/e/./i/..                Refers to directory e
/                                    Refers to the root directory

Invalid:

/e
                        e does not exist in the root
/d/../f/e                 f does not exist in the root
/d/h                     h does not exist within d
/d/e/h/../f             f does not exist within e

Undefined:
//                          You are free to decide what two '/' characters in a row means (but your program should not crash!).  Our solution treats it as a single '/'.

Small definition: The suffix of a path refers to nodeN (the last node listed).  If a path ends with a '/', it possesses no suffix.

IN A PATH,, EVERYTHING BEFORE A '/' MUST BE A DIRECTORY.  As an example, for the path /e/f/, e AND  f must BOTH be directories!

REMEMBER, ALL PATH NAMES ARE CASE SENSTIVE.  THAT IS "/A" AND "/a" refer to different nodes.

SoftLinks

The entry.shortcutReference property of a softlink node is an absolute path to the "target" -the actual node to access when using certain commands (described below) - of the softlink.  Note that the existance of softlinks complicates path processing: If a node is a softlink, it must be "resolved" - that is if node(I-1) is a softlink, the shortcutReference of node(I-1) must be used to test for the presence of nodeI.  Warning: This is recursive.  If the shortcutRefence refers to a softlink, that softlink must also be resolved.  This process continues until a non-softlink is found - or there is an error.  
If the shortcutReference does not refer to a valid location, then the softlink itself should be considered an invalid path (again, recursion is wonderful).

Very Important: It is the name of the softlink - not its target - must be displayed with the pwd command (see command descriptions below).

Note that ".." still refers to node(I-2).  When changing directories, cd .. changes the working directory to the directory listed before the current working directory (node I-2), as indicated by the pwd command.  In particular, it does NOT always refer to the parent directory of the current working directory.

As an example, let us place a softlink name LINK in the root directory with a shortcutReference to /d/e
Immmediately note that that /LINK/h and /d/e/h refer to the same files.

Consider the following command sequence:

cd /d/e
The working directory is now the "/d/e" directory
cd ..
The working directory is now the "d" directory.

But consider an alternative sequence:

cd /LINK
The working directory will be printed as "/LINK" by pwd, but the working directory is effectively /d/f (that is "g" is a valid filename path at this point)
cd ..
The working directory is now the root directory. 3


Be sure to check the examples at the end of this specification for more information on how soft links work.

VERY IMPORTANT:
Consequently, ".." may refer to different directories, depending on how the user arrivied at the current directory.  Thus to resolve ".." you cannot simply jump to the entryNode that is the parent of the current directories.  Rather, you will need to have some sort of stack data structure that you push to and pop from as directories are traversed.

An Algorithmic (and less formal) Definition/Example of the Paths and Softlinks

For any command defined below that takes a single path argument, e.x. cmd [/]dir1/dir2/.../dirN/file where n>=0
(Note: file may be "")

The behavior of that cmd is the same as the sequence of commands:

oldWorkingDirectory := Current Working Directory
cd [/]dir1
cd dir2
....
cd dirN
cmd file
(If any error occurs after a cd (e.g. a directory does not exist), goto end)

end:
Current Working Directory := oldWorkingDirectory (This step does not happen obviously if cmd is "cd")

For instance create TAs/CS61C/David would create a file named David in the TAs/CS61C directory, provided that the TAs and CS61C directories already exist.

If the Current Working Directory is dir1/dir2/dir3/../dirN, entering cd .. will change the working directory to:
dir1/dir2/dir3/.../dirN-1

"cd ." has no effect.
cd .. while the working directory is the root directory will have no effect.

Finally, note that dir1, dir2, dir3, etc. may be the names of softlinks, rather than directory nodes. (see the pwd command for clarification).

Yet another Explanation of Paths and working directories and softlinks

Hopefully, this helps clear up some confusion.

View your working directory as a stack.  For instance, lets say you are in /d

Then the stack would be:

d

If you do a cd directory command, you push that directory name onto that stack.  
For instance, cd e will produce the following stack:

e
d

Now the "pwd" command is nothing more than a printout from the bottom to the top of the stack.
So with the above stack, pwd would return /d/e

cd . does nothing to the stack.  In fact that command in general does nothing.

cd.. pops the top element off of the stack.  So with the above stack, cd.. would produce:

d

and another cd .. would empty the stack (and pwd would be just "/").

Incidently if you use an absolute path, you clear the entire stack and then push items.  For instance, "cd /b"
produces the following stack

b

Regarldess of what the stack was before that command.


Now keeping this stack definition in mind, along with the pathname to CD algorithm above, you hopefully understand how paths work! :)


One more thing to consider:

Let's say the stack again is

e
d

And let's say within e is a softlink LINK which points to /b (a directory).

Then if we do cd LINK, the stack will become

LINK
e
d

(pwd is /d/e/LINK)

HOWEVER, bear in mind that the working directory is ACTUALLY /b!  Why? Because that is what LINK's target is!

And if we do cd .. again, the stack returns to:

e
d

Huzzah!

Commands

The commands to be supported by the interpreter are now presented. They are simpler than their counterparts in the standard UNIX shells (bash, tcsh, etc.); as noted above, there are only three kinds of files and no command options (normally specified using "-") are allowed.

Some terms used in the table:
Path#5 refers to the 5th path format described above, where a path ends in the '/' character.
Alphabetical order is defined as the lexicographic ordering that the strcmp function uses.
File is synonomous with node.  File does not necessarily mean "text file".

Be sure to resolve all path expressions (see SoftLink section).

The table also lists errors strings that that must be outputted if an error occurs.   They are (copied from directory.c):
const char * file_exists = "File already exists";
const char * no_such_file = "No such file or directory.";
const char * not_a_directory = "Not a directory.";
const char * is_a_directory = "Is a directory.";
const char * directory_not_empty = "Directory not empty.";
const char * operation_not_permitted = "Operation not permitted.";
const char * links_must_be_absolute = "soft link targets must be absolute";

To print the above errors, you must use the following C code:
printf ("%s: %s: %s\n", cmd, failed_argument, error_message);
Example:
printf ("%s: %s: %s\n", rm, fileName, no_such_file);

Be sure to see the definitions of the cmd names in dirmain.c  Most of the variables have standard names; unfortunately, since mkdir and rmdir exist in C libraries already, they have been called "mk_dir" and "rm_dir", respectively.

You may not use equivalent code that will output the same message.  The autograder will be intercepting all calls to printf() and will verify the correctness of the forth argument (the error_message).

The autograder will verify that the correct error was outputted.  What you put in "failed_argument" is up to you, but it should be something reasonable.

For errors listed as "other", you are free use any error_message (but you must use that printf code).  Also you can output any error_message when the path argument after being stripped of its suffix, is invalid, unless stated otherwise by the command description.

A command may never create a file named "..", ".", or "".  Think about where this can occur - and stop it.  (again, you can use whatever error message you wish).  For instance, the command "mv file .." must not generate a file named "..".  It may error or it can try to move file to "..".

Finally, to allow for flexibility, you may use any error_message if there is an error and the path argument ends in "..", ".", or "/" (unless stated otherwise).

 command

# of arguments

 Behavior

SoftLink considerations Errors to output
pwd

none

Prints the full path name of the working directory.  What is printed out must be a valid absolute path.  The path must NOT end with a "/" character.

ex:  
/dir1/dir2

If the user cd's into a softlink (call it LINK) while in some working directory (call it x), pwd should display:
x/LINK
That is softlinks in the path  that is printed out should not show the resolved path.
N/A
cd

1 (path)

Changes the working directory to the path (absolute or relative) to that of the argument.

Recall that ".." from the root directory is still the root directory.

If path refers to a softlink, the working directory must become the target of the softlink.  However, as noted above, pwd will list the softlink's name, not the target's. no_such_file:
-
path does not refer to an existing node.
not_a_directory:
-
path refers to a valid node, but that node (or the node's target) is a text file.

mkdir

1 (path)

Creates a directory with the suffix of path inside the directory specified by the path with the suffix removed.  This can create files inside other directories; mkdir dir1/dir2 is valid - and will create dir2 - if and only if dir1 already exists.

The new directory's READABLE and WRITABLE bit attributes are irrelevant, but because the ls command must show both being set, you may want to set both bits to be true.

NOTE: mkdir dir1/dir2/ is technically always invalid (since it would create the directory "" inside dir1/dir2).  However we are not testing this behavior.

No special consideration, other than path resolving. no_such_file:
-
path implies that a directory should be created inside a non-existant directory.
file_exists:
-
directory exists
ls

0 or 1 (wildcard_exp)

If given without arguments, prints the names of files in the working directory in alphabetical order.

If given an argument, the argument is treated as a "wildcard_exp".  A wildcard_exp is much like an ordinary filename, except it is allowed to have the character '?' in it.  If this type of argument is used, the files within the working directory should be listed in alphabetical order, provided that their name matches wildcard_exp.  A file name matches wildcard_exp if it is the same length and character i (0<=i<length) of the filename is the same as character i of the wildcard_exp OR character i of the wildcard_exp is a '?'.  Thus, if we have the wildcard_exp of "fo?", the files "foo" and "fos" match, but "fun" does not.

A text file is printed as follows:
-rw filename
Note that r and w refer to readable and writable respectively.
So if the file is readable, but not writable, the following should be printed:
-r- filename

A directory is printed as follows:
drw directoryname
(The drw is always present when showing directories.  One is not allowed to change the READABLE and WRITABLE attributes of directories.)

See the next column for how softlinks should be printed.

If no files by wildcard_exp are matched the error "no_such_file" must be printed.  Only print this error when a wildcard_exp is given in the arguments.

Note: This is the only function which cannot accept path arguments. dirmain.c will filter these out.

NOTE2: It may be easier to keep directories' entryLists always in alphabetical order than to sort lists every time ls is invoked.

Note3: The alphabetical ordering used by strcmp may be non-intuitive.  Specifically any capital letter comes before a lower case one.  Example: a proper sorting is "A", "B", "a", "b".  

Softlinks are printed as follows:

atr softlink -> target

Replace atr with the attributes that the target possesses (if the target is a softlink, resolve targets until a file is found).

For instance,

drw softlink -> directory
--w softlink -> /afile

If the link cannot be resolved, print

INV softlink -> /afile
no_such_file:
-
No file matched the wildcard expression

rmdir

1 (path)

Removes the directory specified by path. It is an error if the directory is not present.

Do not remove the softlink or the softlink's target.  It is an error to execute this command on a softlink. no_such_file:
-
path does not refer to an existing node.
not_a_directory:
-
path refers to a valid node, but that node is a text file or softlink.
directory_not_empty:
-
The directory contains files
rm

1 (path)

Removes the text file specified by path. It is an error if the text/softlink file is not present.

Remove the softlink, NOT the softlink's target.  Remove the softlink regardless of what its target is. no_such_file:
-
path does not refer to an existing node.
is_a_directory:
-
path refers to a valid node, but that node is a directory
operation_not_permitted:
-
text file is not writable
create

1 (path)

Creates a "text file" with the suffix of path inside the directory specified by the path with the suffix removed. It then reads the contents of the file from standard input. The working directory must not already contain a file with the given name.

Standard input is indicated as completed by two consecutive carriage returns ('\n').  The first carriage return will go in the text file; the second will not.

Do not assume any bound on the size of input.

The newly created file will be both READABLE and WRITABLE.

e.x. create TAs/Sameer
creates a file named "Sameer" in the directory TAs if and only if the TAs directory exists.

NOTE: create TAs/Sameer/ is technically always invalid (since it would create the file "" inside TAs/Sameer).  However we are not testing this behavior.

N/A no_such_file:
-
path implies that the file should be created inside a non-existant directory.
file_exists:
-
node with the same name already exists.
append

1 (path)

Appends to the contents of the text file specified by path with data read from standard input. 

Standard input is indicated as completed by two consecutive carriage returns ('\n').  The first carriage return will go in the text file; the second will not.

Do not assume any bound on the size of input.

If the path refers to a softlink, append to the target of the softlink.
no_such_file:
-
path refers to a nonexistent file
is_a_directory:
-
path refers to a directory
operation_not_permitted:
-
text file is not writable
cat

≥ 1 (path)

Prints the contents of the text files named by the path arguments.

NOTE: You will be implementing a function that only takes 1 argument (see the code in dirmain.c).  The behavior when an error occurs with 2+ arguments is undefined; we will only be testing cat with 1 argument.

If the path refers to a softlink, read from the target of the softlink. no_such_file:
-
path refers to a nonexistent file
is_a_directory:
-
path refers to a directory
operation_not_permitted:
-
text file is not readable.
ln

2 (path target, ansolute path linkname)

Creates a softlink named the suffix of "linkname" (same path rules as create) with a shortcutReference of "target".  In particular, the target should not be validated or resolved (other than checking that it is an absolute path): It is allowed to create an invalid link.

The new softlink's READABLE and WRITABLE bit attributes are irrelevant, but because rm be able to delete the file, you may want to set at least WRITABLE to true.  (NOTE: The code we gave you sets READABLE and WRITABLE both to true.)

Other than path resolving, none. no_such_file:
-
path implies that the link should be created inside a non-existent directory.
file_exists:
-
node with the same name already exists.
links_must_be_absolute:
-
linkname is a relative path.
(Note: As indicated in the "experts" section, you are allowed to add relative softlink support  - consequently we will not be testing this error)
mv 2 (path source, path destination) There are two possible behaviors depending on the format of destination:
If destination is a path#5 (you can optionally use this behavior if the path suffix is "." or ".."):
-Moves the text file OR softlink  referenced by source into the directory referred to by the destination.
Otherwise:
-Moves the text file OR softlink referenced by source into the directory referred to by the path with its suffix stripped. Then it renames the file to be the suffix of the path.
If the source path refers to a softlink, move the softlink itself.  Do not update the softlink's shortcutReference variable (the string or the actual target entryNode).

If the destination is a softlink to a directory, move to the file to that directory.
no_such_file:
-
path refers to a nonexistent file
file_exists:
-
the destination already exists (only possible if the destination is not a path #5)
is_a_directory:
-
source refers to a directory (Note: As indicated in the "experts" section, you are allowed to add directory moving functionality  - consequently we will not be testing this error)
setread 2 (int, path) Set the READABLE attribute bit of the "text file" referred to by path to be the integer argument (which will only be 1 or 0).
If the path refers to a softlink, do this operation on the target of the softlink (again, this is recursive) no_such_file:
-
path refers to a nonexistent file
is_a_directory:
-
path refers to a directory
setwrite 2 (int, path) Set the WRITABLE attribute bit of the "text file" referred to by path to be the integer argument (which will only be 1 or 0).
If the path refers to a softlink, do this operation on the target of the softlink (again, this is recursive) no_such_file:
-
path refers to a nonexistent file
is_a_directory:
-
path refers to a directory

Table of commands to be supported by the project 1 command interpreter

The Project

The directory ~cs61c/lib/proj1 contains four files that you should copy to your own directory: a Makefile, dirmain.c , directory.h , and directory.c .  The main program in dirmain.c repeatedly reads a command from standard input, checks that it is syntactically correct, and if so, calls one of the functions in directory.c to execute the command. The file directory.h contains declarations for functions that access and modify the directory tree. The file directory.c contains the actual definitions of these functions, as well as a struct entryNode declaration that represents a file (directory or text file). It also provides near-complete implementations for cat, ln, and mv.  Your job is to provide an implementation of all commands (that is fill in the YOUR CODE HERE sections).  You need only submit your directory.c file (as well as a README which is described below). Changes to other files will not be accepted.

Additionally, you must initialize the file system.  Set root to be a pointer to a directory entryNode with name = "/", no siblings, and no children.

Note: What this project is NOT is doing syntax checking.  dirmain.c ensures that all path inputs are valid path names - and that numeric arguments are within the proper range.  You simply do "semantic checking" (e.g. does this path refer to a valid node?)

Helper Functions

Note that we have provided you with basic code in all commands, using two helper functions (that have not yet been implemented): locateItem and resolvePath.  It is highly recommended that you implement these functions to behave in the way described by the comments above them.  The two are naturally recursive (in our solution, they actually call each other).  However, bear in mind that you can do whatever you want to directory.c, except that you must use the entryNode structure (and not modify) to represent the directory tree.   If you would rather not use these helper functions, feel free to delete them and the code that invokes them.

UPDATE: You are permitted to modify the entryNode structure (or even not use it), provided that you email the Project TA in charge explaining what you desire doing and why you feel that change is advantageous (there must be a significant speed or space advantage) - and the TA must approve your change.  This requirement is necessary because the autograder will be tracking your program's memory usage (to verify that memory leaks to not exist); If your data structures change, your program's potentially different memory usage will need to be accommodated.


Memory Requirements
All memory allocated, other than memory allocated in the initialFileSystem() function, must be free'd once it is no longer accessible.  In other words, if the following command sequence is done a few hundred million times in a row, your program should not crash.
mkdir dir1
cd dir1
create file1
cd ..
rm dir1/file1
rmdir dir1

Furthermore, it is always possible that malloc() and friends will return NULL.  If this occurs, you must print the out_of_memory error that is defined in directory.c and exit(1).  We have provided a SafeMalloc() function within directory.c that "wraps" the malloc() command; be sure to write something similar if you decide to use realloc() or calloc()!

In Summary - A list of what you need to do

We have provided a great deal of the base code.  Your task is to edit directory.c by doing the following (in no particular order):

-Finish the code for the following commands:
    create - in createFile
    append - in appendfile
    mkdir - in createDir
    cd - in newWorkingDir
    ln (almost completed already) - in createSoftLink
    rm - in removeFile
    rmdir - in removeDir
    mv (almost completed already) - in moveEntry
    pwd - in printWorkingDir
    ls - in listWorkingDir and listWithinWorkingDir
    setRead - in modifyReadable
    setWrite - in modifyWritable
-Design the data structure to keep track of working directories (needed for locateItem, resolvePath, cd, and pwd)
-Add any helper functions you may need to successfully implement the before mentioned commands.
-Correctly initialize the file system, as well as your working Directory structure - in initialFileSystem
-Implement the helper functions:

    locateItem
    resolvePath

Also be sure to write your README file
and note whether you support softlinks in it.
     

Hints - and common mistakes 4

1) Bear in mind that all char * arguments being passed to your functions are coming off the stack.  Needless to say, it is a very bad idea to make your dynamically allocated entryNodes have any pointers that point to memory that is on the stack.
2) Calling free() on a pointer that was not returned by malloc() results in rather strange crashes that appear untraceable.  Using resolvePath() carelessly can cause this (why?).
3) The strcmp function doesn't like null strings being passed as arguments.
4) A doubly linked list may be a good way to store the working directory stack.  Why doubly linked? Because it can act as a queue as well (where would this be helpful?).
5) Start early.  Write a function a day (at least).  The TA solution features about 350 lines of rather hairy C code (and hairy code takes longer to write than shaven code).
6) When you think you are done, you probably aren't.  Many bugs were found in the TA solution well after the TA thought he had finished the project. TEST WELL.
7)  A hint about resolvePATH: You are allowed to modify **pathName. In other words, the function can modify both the pointer to the character array and the character array itself.  You can utilize such modification to split strings.  For instance, if you have the string
char * f = "foo-bar"
you can set f[3] = '\0';
and you will have two strings:
f is now "foo"
and (f+4) is "bar"
8) That being said, don't let resolvePATH bottleneck you.  You can still much partial credit even if softlinks do not work correctly and paths do not work with a '/' in them.  You may want to implement an incomplete resolvePath first, complete all other functions, and then finish up resolvePath.

Project Subtleties - Read carefully

Because we tried to keep this project from becoming insanely difficult (we preferred just good 'ol difficult), there are certain "bizarre" behaviors present if you follow this specification correctly:

1) Behavior of circular soft links is undefined.  Do not worry about this.
It is possible to create circular soft links with the following command sequence:
ln /link1 /link2
ln /link2 /link1
All sorts of commands in this project require you to resolve those links.  Needless you say, there is a problem with attempting this.  
We will not test circular soft links.  You are even allowed to crash if this happens.

2) It is possible to delete a softlink - or directory - that is part of the current working directory

Consider the following:

cd /
mkdir D
cd D
rmdir ../D
Your working directory is still /D even though /D no longer exists.


Also Consider the following:
ln . link
cd link
rm link
The working directory will still be /link even though link no longer exists.  

We will not test your behavior in either case, meaning crashing is allowed.


Are you an Expert?

Finished this project in only a few hours?  Want to do more work? Well, you can!
Some ideas on what you can do:

1) Make circular links not crash.  This is pretty hard, since you are not allow to verify this at softlink creation (why?).
2) Support mv for directories.  This can get rather hairy as well, because it risks affecting the current directory.
3) Support relative path soft link targets.  The project does not require you to support these because of the large amount of ambiguities generated (not to mention significantly harder code).  But if you think you have what it takes, go for it (this is the hardest of all 3).

If you do any of the above, be sure to note it in your README.

Examples

Click here to view annotated example commands - and expected output.  Be sure your code has the same output as the example.

Be sure to test your program against the reference solution, which you can run by typing ~cs61c/lib/proj1/refproj1 at the command prompt.  You must be on a SPARC machine (nova, h50, h30, quasar, pulsar, etc.) to run the program.  Also beware that your program does not have to perfectly conform to the reference example (e.g. error messages can differ where the spec allows) - so be sure to rely on this spec and NOT the reference solution.

Submission

Submit your solution with the submit proj1 command.
You must include directory.c, which contains your modifications to directory.c, as well as a README file, which should discuss design decisions you made, explain your choice of data structures, and if you like, provide brief comments on any problems you encountered and how you overcame them.  ALSO BE SURE TO NOTE WHETHER YOUR PROGRAM SUPPORTS SOFTLINKS!

Disclaimer

This is the first time this project has ever been given.  While the author has tried to ensure that there are no ambiguities in the specification, he may (probably) have made an oversight.  If you are confused about anything, be sure to ask on the newsgroup.

GOOD LUCK! 


1. An actual UNIX file system is organized somewhat differently; see The UNIX Programming Environment , by Brian Kernighan and Rob Pike (Prentice-Hall, 1984), for further information.

2. See UNIX Unleashed, System Administrator's Edition to view information about how file attributes actually work in UNIX-like operating systems

3. In case you were wondering, this is how the bash shell works.  The tcsh shell (the default on the instructional machines), however, will make the working directory "d" after the cd..  Other shells probably even have different behavior.  Everyone loves UNIX's amazing standardization.

4. Strictly speaking, since this is the first time this project has ever been given, no mistake can yet be called "common" and no question could have been asked "frequently".