CS162 Spring 2005 Lecture Notes
2005-03-28
prepared by Radu Damian

Topic: Finish talking about I/O Systems
---------------------------------------

Device Interconnection

Design differences between small systems and mainframes result in
different types of device interconnections.

- In small systems (PCs), everything is attached to the bus. The CPU
  connects directly to the device controller, which drives the devices.
  The controller may also be built into the device.
    - Multiple devices can be attached to a SCSI bus.
- Mainframes have a more complicated architecture, designed for data
  storage.
    - In IBM mainframes there are channels, which are independent I/O
      processors. Channels execute commands and connect to storage
      control units. Storage controllers connect to string controllers.
      String controllers have a number of disks on them. This tree
      structure resulted in bottlenecks, so multiple paths were built
      from CPU to disk.
    - Devices can be shared among CPUs at the level of the storage
      controller or string controller.

Interconnection scheme of Mainframe
MCU = Multidevice Control Unit

[Figure: interconnection scheme of a mainframe. CPU1 contains Channel A,
Channel B and a channel-to-channel adapter; CPU2 contains Channel C and
Channel D. The channels connect over multiple paths to three MCUs, which
sit above a shared switch; the shared switch fans out to five I/O
devices.]

NAS and SAN

NAS = network attached storage.
SAN = storage area network.
NAS
- storage attached to a local area network (e.g. Ethernet).
- provides a "file" interface.
- low to midrange product.

Question: Is NAS just a dumb file server?
Answer: Yes.

SAN
- separate network containing storage.
- mid to high end product.

    ----------   ----------   ----------
    | Server |   | Server |   | Server |
    ----------   ----------   ----------
         \            |            /
      ----------------------------------
      |                                |
      ----------------------------------
                      |
                  --------
                  | Disk |
                  --------

Storage Networking Industry Association (SNIA) works on standards like
NAS and SAN.

Storage Service Providers - 3rd parties that provide storage space,
either through the internet or via dedicated cables to providers.
Expensive.

Direct attach storage - storage unit directly attached to a server:

    Server <--> Disk
    Server <--> Disk
    Server <--> Disk

Other types of storage: SCSI over IP

Question: Why do mainframe and PC architectures differ?
Answer: Evolution. Each was designed to fill a need, with the different
technology available at the time. PC design is driven by low cost, with
no emphasis on data processing or reliability.

See the Reader for more illustrations and information on "The Evolution
of storage systems", page 162.

Joke: Professor made a joke about IBM and GM. IBM has the best
mechanical engineers. That is why GM makes bad cars. They couldn't hire
any of them.

Flash memory
- flash memory can retain its contents indefinitely
- slower than DRAM and SRAM
- it has a limited number of write/erase cycles (about 50,000); the
  wear comes from writing and erasing, not from reading

Question: Why so many standards of flash memory?
Answer:
  1. Different form factors for different devices (digital cameras,
     music players, etc.)
  2. Each manufacturer wants its own technology so it can charge more.

Question: Why does flash memory have a limited number of write/erase
cycles?
Answer: Not sure.

Aside: Digital camera batteries also come in different sizes from each
manufacturer so they can have a big markup on them. Another example: HP
and their printers. They are really in the ink business.
Disk drives in old computers had no buffering. Now every disk has a
cache buffer.

New Topic: File Structure, I/O Optimization
-------------------------------------------

File - a named collection of bits (normally stored on disk). A file has
different semantics from the point of view of the programmer and from
that of the operating system.
- from the OS's point of view, the file is a bunch of blocks stored on
  disk.
- from the programmer's point of view, one may see a different
  interface (bytes or records), but this doesn't matter to the file
  system.

Files have attributes and properties like:
- name(s)
- protection
- type (numeric, alphabetic, binary, Java program, data, etc.)
- time
    - time of creation
    - time of last use
    - time of last modification
- owner
- length
- link count
- there may also be additional properties.

How do we use a file? There are 3 ways to use a file:
- Sequential
- Random Access
- Keyed

Sequential
- information is processed in order, one piece after the other.
- most used
- examples: editor writes out a new file, compiler compiles the file,
  etc.

Random Access
- we can address any block in the file directly without passing through
  its predecessors.
- we need to know what block we want, so we must have some sort of
  index.
- the index may come from the OS or from a database.

Keyed
- look up a block (record) based on a key (hash table, associative
  database, dictionary).
- usually not provided by the operating system (only in some IBM
  systems like IBM OS/360, which has keys associated with disk blocks;
  the channels read the keys).
- can be considered a form of random access.

Modern file and I/O systems must address four general problems:

1. Disk Management (i.e. layout and access). This consists of:
    - Efficient use of disk space
    - Fast access to files:
        - file structures must be kept
        - device use optimization
    - User has a hardware-independent view of the disk.
2. Naming (i.e. how do users refer to files?).
    - this concerns directories, links, etc.
3. Protection (since all users are not equal)
    - want to protect users from each other.
    - want to have files from various users on the same disk.
    - want to permit controlled sharing.
4. Reliability
    - Information must last safely for long periods of time.

Disk Management
- disk management needs to answer a few questions: how should the
  blocks of the file be placed on disk? And what kind of map do we need
  to find and access the blocks we are looking for?
- the answer is to have file descriptors.

File Descriptor - a data structure that gives the file attributes and
contains the map which tells you where the blocks of your file are.
- file descriptors are stored on disk, along with the files.

Some system, user and file characteristics:
- most files are small.
- in Unix most files are very small (for example, lots of files with a
  few commands in them)
- much of the disk is allocated to large files
- many of the I/O operations are made to large files
- most of the I/Os are reads (60-85%)
- most I/Os are sequential

This means that disk management design should keep the per-file cost
low, but large files must still have good performance.

Aside: It is misleading to think that we know what I/O is happening on
our own computers. Usually only 10-20% of I/O is user generated.

Question: Why is there an upper limit to the size of a file?
Answer: Because of the limited number of bits in the file descriptor.

File Block Layout and Access (standard data structures, but on disk)
- Contiguous
- Linked
- Indexed or tree structured

Contiguous file allocation:
- allocate the file in a contiguous set of blocks or tracks.
- keep a free list of unused areas of the disk. When creating a file,
  make the user specify its length and allocate all the space at once.
  The file descriptor contains the location and size of the file.
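
A minimal sketch of the contiguous scheme just described, in Python.
It is an illustration only, not code from the lecture: the names
FileDescriptor, ContiguousDisk and physical_block are hypothetical, the
free list is searched first fit (the notes do not say which policy to
use), and freed areas are not merged. The descriptor holds just a start
block and a length, so finding any block of the file is one addition.

```python
from dataclasses import dataclass


@dataclass
class FileDescriptor:
    """Hypothetical descriptor for a contiguously allocated file."""
    start: int   # first disk block of the file
    length: int  # file length in blocks, declared at creation time


class ContiguousDisk:
    """Toy disk that hands out contiguous runs of blocks (first fit)."""

    def __init__(self, total_blocks: int):
        # Free list of unused areas, each a (start, length) pair.
        self.free = [(0, total_blocks)]

    def create(self, length: int) -> FileDescriptor:
        """Allocate all of a file's space at once from one free area."""
        for i, (start, size) in enumerate(self.free):
            if size >= length:
                if size == length:
                    del self.free[i]
                else:
                    # Shrink the free area from the front.
                    self.free[i] = (start + length, size - length)
                return FileDescriptor(start, length)
        raise MemoryError("no contiguous free area is large enough")

    def delete(self, fd: FileDescriptor) -> None:
        """Return the file's blocks to the free list (no coalescing)."""
        self.free.append((fd.start, fd.length))


def physical_block(fd: FileDescriptor, logical: int) -> int:
    """Map a block number within the file to a disk block number."""
    if not 0 <= logical < fd.length:
        raise IndexError("block number is outside the file")
    return fd.start + logical


if __name__ == "__main__":
    disk = ContiguousDisk(100)
    a = disk.create(30)
    b = disk.create(50)
    print(physical_block(b, 4))
    disk.delete(a)
    # 50 blocks are free in total, but in two separate areas (30 and
    # 20), so a 40-block file cannot be created.
    try:
        disk.create(40)
    except MemoryError as err:
        print("create(40) failed:", err)
```

The demo at the bottom also shows why the free list matters: free space
split into separate areas cannot satisfy a request larger than the
biggest single area.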

Advantages of Contiguous file allocation:
- easy access, both sequential and random
- low overhead
- simple
- few seeks
- very good performance for sequential access

Drawbacks of Contiguous file allocation:
- bad fragmentation will make large files impossible to store
- hard to predict the needs of a file at creation time
- may over-allocate
- hard to enlarge files

Contiguous allocation can be improved by permitting files to be
allocated in extents. This means asking for a contiguous block and, if
it isn't enough, getting another contiguous block and using a pointer
from the end of the first block to the next contiguous block.
Example: IBM OS/360 permits up to 16 extents. Extra space in the last
extent can be released after the file is written.

Linked file allocation:
- link the blocks of the file together as a linked list
- store a pointer to the first block in the file descriptor
- each block of the file keeps a pointer to the next block

Advantages of Linked file allocation:
- files can be extended
- no external fragmentation
- sequential access is easy (follow the links)

Drawbacks of Linked file allocation:
- random access requires sequential access through the list.
- lots of seeking, even in sequential access
- overhead in each block for the link

Examples of systems that use(d) Linked file allocation: TOPS-10, Xerox
Alto, both in a modified form.

Indexed file allocation (simple)
- the simplest approach is to keep an array of block pointers for each
  file.
- the maximum length of the file must be declared when it is created.
- allocate an array to hold pointers to all of the blocks, but don't
  allocate the blocks.
  then fill in the pointers dynamically using a free list

Advantages of Indexed file allocation:
- not as much space wasted by overpredicting
- both sequential and random access are easy
- only waste space in the index

Drawbacks of Indexed file allocation:
- may still have to set a maximum file size (would have to have an
  overflow scheme if the file is larger than the predicted maximum)
- blocks are probably allocated randomly over the disk surface, so
  there will be lots of seeks
- the index array may be large and may require a large file descriptor.

Multi-level index file allocation
- this is the VAX Unix solution
- 4.3 BSD Unix uses a multi-level tree structure (see image below).
- file descriptors:
    - contain 15 block pointers
    - the first 12 point to data blocks
    - the next 3 point to indirect, doubly-indirect and triply-indirect
      blocks
    - each indirect block has 256 pointers
- maximum file length is thus fixed, but large.
- this also means that descriptor space isn't allocated until it is
  needed.

Advantages of multi-level index:
- simple
- easy to implement
- incremental expansion
- easy access to small files
- good random access to blocks
- easy to insert a block in the middle of a file
- easy to append to a file
- small file map

Drawbacks of multi-level index:
- the indirect mechanism doesn't provide very efficient access to large
  files: in the triply-indirect range it means 3 descriptor operations
  for each real operation.
- the file isn't generally allocated contiguously, so we have to seek
  between blocks.

Unix multi-level indexed file

[Figure: Unix multi-level indexed file. The file descriptor is drawn as
a row of 15 pointers, numbered 1 to 15. Pointer 13 leads to an indirect
block with entries 1 ... 256. Pointer 14 leads to an indirect block
whose entries each lead to a further indirect block. Pointer 15 leads
through three levels of indirect blocks.]

Digression: The DEC PDP-11 had 16-bit addresses. As a result, file
sizes were fixed and there was no way to make them bigger, since every
bit of the address space had already been used.

To be covered later: where do we start from? Something has to be at a
defined location and not found through an index.

Block Allocation
- if blocks are the same size, then we can use a bit map solution:
    - one bit per disk block
    - cache parts of the bit map in memory. Select a block at random
      (or not) from the bitmap.
- if blocks are of variable size, then we can use free lists (lists of
  block groups, so you can grab multiple blocks at a time)
    - this requires free storage area management: fragmentation and
      compaction.
- in Unix, free blocks are kept in groups for efficiency:
    - each block on the free list contains pointers to many free blocks
      plus a pointer to the next list block. This means that there
      aren't many references involved in allocation or deallocation.
    - block-by-block organization of the free list means that file data
      gets spread around the disk.
- a more efficient solution (which was used in the DEMOS system built
  at Los Alamos):
    - allocate groups of sequential blocks. Use the multi-level index
      scheme described above, but each pointer isn't to one block, but
      to a sequence of blocks.
    - when we need another block for a file, we attempt to allocate the
      next physical block on the track (or cylinder)
    - if we can't allocate it sequentially, we try to do it nearby
    - if we have detected a pattern of sequential writing, then we grab
      a bunch of blocks at a time (and release them if unused). The
      size of the bunch depends on how many sequential writes have
      occurred so far.
    - always keep part of the disk unallocated (as Unix does now); then
      the probability that we can find sequential blocks to allocate is
      high.

End of Lecture
--------------

Next time: I/O Optimization.
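
Worked example for the multi-level index scheme covered above. This is
a rough Python sketch, not code from the lecture. It uses only the
numbers given in the notes (12 direct pointers, then one indirect, one
doubly-indirect and one triply-indirect pointer, 256 pointers per
indirect block). The function names are made up for illustration, and
the 1024-byte block size is an assumption, since the notes do not state
a block size.

```python
DIRECT = 12          # direct pointers in the descriptor (from the notes)
PER_INDIRECT = 256   # pointers per indirect block (from the notes)
BLOCK_SIZE = 1024    # ASSUMPTION: bytes per block, not given in the notes


def max_file_blocks() -> int:
    """Largest file, in blocks, that the 15 pointers can describe."""
    return DIRECT + PER_INDIRECT + PER_INDIRECT**2 + PER_INDIRECT**3


def locate(block: int):
    """Find the path to a logical block number (counted from 0).

    Returns (level, indices): level 0 means a direct pointer, and
    levels 1, 2 and 3 mean the indirect, doubly-indirect and
    triply-indirect pointer. indices lists the slot to follow in each
    indirect block on the way down, outermost first.
    """
    if block < 0:
        raise IndexError("negative block number")
    if block < DIRECT:
        return 0, [block]
    block -= DIRECT
    span = PER_INDIRECT  # number of data blocks reachable at this level
    for level in (1, 2, 3):
        if block < span:
            indices = []
            for _ in range(level):
                indices.append(block % PER_INDIRECT)
                block //= PER_INDIRECT
            return level, indices[::-1]
        block -= span
        span *= PER_INDIRECT
    raise IndexError("block number exceeds the maximum file length")


if __name__ == "__main__":
    print(max_file_blocks(), "blocks maximum")
    print(max_file_blocks() * BLOCK_SIZE, "bytes with the assumed block size")
    for n in (0, 11, 12, 267, 268, 65804):
        print(n, locate(n))
```

The level returned by locate is also the number of extra index-block
reads needed before the data block itself can be read, which is the
cost listed under the drawbacks of the multi-level index.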