CS 162 Lecture Notes for Monday 4/4/05

Announcement: Midterm2 on Wednesday 4/6/05 (closed books, closed notes, open mind)

+ File Descriptor: A file descriptor is a data structure or record that describes the file. The information in the file descriptor has to be kept on permanent storage (like disk) so that it is not lost when the computer system is shut down.
+ Descriptors in Unix:
    + Descriptors are stored in a fixed-size array on disk. A special area of the disk is reserved for the descriptor array to make lookup simple.
    + The size of the descriptor array is determined when the disk is initialized, and cannot be changed.
    + A descriptor is called an inode (index node) in Unix, and its index in the array is called its i-number. Internally, the OS uses the i-number to refer to the file.
    + Inode fields:
        + reference count (number of times the file is open at a given time)
        + number of (hard) links to the file
        + owner's user id, owner's group id
        + number of bytes in the file (size of file)
        + time last accessed, time last modified, time the inode was last changed
        + disk block addresses
        + flags: (inode is locked, file has been modified, some process is waiting on the lock)
        + file mode: (type of file: character special, directory, block special, regular, symbolic link, socket)
            + Socket: A socket is an endpoint of a communication, referred to by a descriptor, just like a file or a pipe. Two processes can each create a socket and then connect those two endpoints to produce a reliable byte stream.
            + Difference between a pipe and a socket: a pipe requires a common parent process. A socket does not, and the processes may be on different machines.
        + protection info: (set user id on execution, set group id on execution, read, write, execute permissions)
        + count of shared locks on the inode
        + count of exclusive locks on the inode
        + unique identifier
        + file system associated with this inode
        + quota structure controlling this file
    + When a file is open, its descriptor information is kept in main memory in the form of tables.
        + Per Process Open File Table: The integer index of an entry in this table is the handle for that open of the file. Multiple opens of the same file get multiple entries.
        + System Open File Table: Unix also has a system open file table, whose entries point to the inode for the file in the inode table described below. It is a system-wide table and maps names to files. It is used to quickly locate the inodes of recently used files in the inode table.
        + Inode Table: There is also the inode table, a system-wide table holding active and recently used inodes. It serves as a cache for the descriptor array and makes access to recently used inodes fast.
+ Directories: Users want to use text names to refer to files. Special disk structures called directories record which descriptor indices correspond to which names. We now discuss several ways of organizing the directory structure.
    + Approach #1: have a single directory for the whole disk. Use a special area of disk to hold the directory. The directory contains (name, index) pairs.

                       (/)
                      /   \
                     /     \
        (cs162officialSol.txt) (MySol.txt)

        + Problems:
            + If one user uses a name, no one else can.
            + All the files are in the same directory, so if you can't remember the name of a file, there is a long list of files to search through.
            + Security problem - people can see your file names. For example, drugDealers.txt
    + Approach #2: have a separate directory for each user. Every user has his/her own directory, which contains all the files belonging to that user.

                    (/)
                   /   \……
              (/smith)
               /    \
              /      \
        (research.txt) (cs162-lec1.txt)

        + Problems
            + Users have cluttered home directories.
            + No way for a user to organize the files.
            + Searching for a file still takes a long time if a user has a decent number of files.
    + Approach #3 - Unix approach: generalize the directory structure to a tree and store directories on disk just like regular files. Each directory contains (name, index) pairs. The file pointed to by the index can be another directory. Hence, we get a hierarchical tree structure. Names have slashes separating the levels of the tree.

                    (/)
                   /   \ ….
              (/smith)    .
               /    \
              /      \
        (/Research)  (/cs162)
                      /    \
                     /      \
            (README.txt)  (/notes)

        + Problems
            + Relatively complex directory structure.
            + Multiple references to the descriptor array are required to move down the hierarchy.
        + Root: We need a place to start, so we make a base directory and call it 'root'. This directory has no text name, and is always the file with i-number 2, to make lookup fast.
        + Hard Links: Each pointer from a directory to a file is called a hard link. A file is removed by removing all the hard links to it.
        + Symbolic Links: Instead of pointing directly to the file or directory, a symbolic link holds a symbolic name for that file or directory.
        + Working directory: It is cumbersome to constantly have to specify the full path name for all files. Therefore in Unix there is one directory per process, called the working directory, which the system remembers.
            + How to see the current working directory: 'pwd'
        + Search Path: Every user has a search path, which is a list of directories in which to look to resolve a file name.
            + How to see the value of the search path: 'echo $PATH'
+ Operations on Files
    + Open - put a file descriptor into your table of open files. Those are the files that you can use. May require that locks be set and a user count be incremented.
    + Close - inverse of open.
    + Create a file - sometimes done automatically by open.
    + Remove (rm) or erase - drop the link to the file. Put the blocks back on the free list if this is the last hard link.
    + Read - read a record from the file. (This usually means that there is an "access method", i.e. I/O code, which deals with the user in terms of records and with the device in terms of physical blocks.)
    + Write - like read, but may also require disk space allocation.
    + Rename ("mv" or "move") - rename the file. Unix combines two different operations here. Rename strictly involves changing the file name within the same directory; "move" moves the file from one directory to another. Unix does both with one command, which can be harmful because 'mv' destroys the old file if there is already one with the new name.
    + Seek - move to a given location in the file.
    + Sync - write the blocks of the file from the disk cache back to disk.
    + Change properties (e.g. protection info, owner)
    + Link - add a link to a file.
    + Lock & Unlock - lock/unlock the file.
    + Partial Erase (truncate)
    + Commands like "copy", "cat", etc., are built out of the simpler commands listed above.
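
The operations above can be tried out with a short sketch. This is only an illustration, not part of the lecture: it assumes a POSIX-like system and uses Python's standard `os` module, which wraps the corresponding system calls. The helper names `copy_file` and `demo` and the file names `a.txt`, `b.txt`, `c.txt` are made up for the example.

```python
import os
import tempfile


def copy_file(src, dst, chunk=4096):
    """Hypothetical 'copy' built only from open, read, write, and close."""
    fd_in = os.open(src, os.O_RDONLY)
    fd_out = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        while True:
            block = os.read(fd_in, chunk)
            if not block:  # an empty read means end of file
                break
            while block:  # os.write may write fewer bytes than requested
                written = os.write(fd_out, block)
                block = block[written:]
    finally:
        os.close(fd_in)
        os.close(fd_out)


def demo():
    """Exercise the basic file operations in a scratch directory."""
    results = {}
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.txt")
        b = os.path.join(d, "b.txt")
        c = os.path.join(d, "c.txt")

        # Create + open: O_CREAT makes the file if it does not exist.
        # The returned integer is the handle in the per-process table.
        fd = os.open(a, os.O_RDWR | os.O_CREAT, 0o644)
        os.write(fd, b"hello world")           # write (allocates space)
        os.lseek(fd, 6, os.SEEK_SET)           # seek to byte offset 6
        results["after_seek"] = os.read(fd, 5)  # read 5 bytes from there
        os.fsync(fd)                           # sync cached blocks to disk
        os.ftruncate(fd, 5)                    # partial erase (truncate)
        os.close(fd)                           # close
        results["size_after_truncate"] = os.stat(a).st_size

        # Link: a second directory entry for the same inode.
        os.link(a, b)
        results["same_inode"] = os.stat(a).st_ino == os.stat(b).st_ino
        results["links_after_link"] = os.stat(a).st_nlink

        # Remove drops one hard link; the data survives via the other.
        os.remove(a)
        results["links_after_remove"] = os.stat(b).st_nlink
        with open(b, "rb") as f:
            results["data_via_second_link"] = f.read()

        # Rename: like 'mv', this replaces any existing file named c.
        os.rename(b, c)
        results["old_name_exists"] = os.path.exists(b)

        # "copy" composed from the simpler operations.
        copy_file(c, a)
        with open(a, "rb") as f:
            results["copy"] = f.read()
    return results


if __name__ == "__main__":
    for key, value in demo().items():
        print(key, value)
```

Calling `demo()` returns a dictionary of observations (link counts, sizes, data read back), so the effect of each operation on the inode can be inspected directly. The link counts assume the scratch directory lives on a file system that supports hard links.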