UNIVERSITY OF CALIFORNIA
College of Engineering
Department of Electrical Engineering and Computer Sciences
Computer Science Division

CS 162
Alan Jay Smith

Topic: Networks and Communication Protocols

Two trends:
    Lots of small machines.
    Lots of computers everywhere, and a need to communicate.

Problem: communication and cooperation are difficult.
    How do people on the same project share files?
    How does new software get distributed to all users?
    How is electronic mail handled?

Solution: tie machines together with networks, and develop message protocols that allow communication and cooperation again.

Goal: ideally, we would like all computers to look like one very large, unified system. We could share files, communicate, etc. as if it were a timesharing system. But we will always be able to tell the difference, due to performance.

Wide area networks (WANs): networks that connect sites that are geographically far apart.

Local area networks (LANs): Developed mid-70's to hook together personal computers. Most popular interconnection for LANs is Ethernet. LANs are used very differently than wide-area networks.

Examples of networks:

    ARPAnet: (Defense Advanced Research Projects Agency) - 1st and most famous network, developed early 70's but still in use. Connected together large timesharing systems all over the country using leased phone lines. Provided mail, file transfer, remote login. ARPAnet used IMPs (Interface Message Processors) as routers and TIPs (Terminal Interface Processors) to connect from a terminal.

    Usenet: Developed late 70's, early 80's. Unix systems phone each other up to send mail and transfer files.

    CSnet: developed to be a less expensive clone of ARPAnet, and to tie together CS departments.

    BITNET: ties together mostly sites using IBM equipment, including a lot of physics laboratories.

    VNET - IBM's internal corporate network. Has highly secure gateways to connect to CSnet.

    DECNET - DEC's network system.
        The name of the product; also refers to DEC's corporate communication system.

    Misc. commercial nets, such as America Online, Prodigy, Compuserve.

Internetworks: mechanisms for tying together many existing networks, such as ARPAnet, Usenet, and LANs. The Internet is the combination of most of the above; they are now widely interconnected.

Network Hardware

    LAN: Usually Ethernet, which uses either a shared cable, or wires from each machine to a hub or switch.

    WAN: Point-to-point links (used by most early networks). Examples are leased phone lines (50 kbit/s, ARPAnet), RS232 connections, T1 and T3 lines, regular phone calls, satellite links, etc.

Network Topologies:

    Fully connected: every site can talk directly to any other site (e.g. Usenet).

    Partially connected: star and ring are most popular. Intermediate nodes must forward messages.

    Multi-access bus / broadcast - (used by most LANs today). A single cable or group of cables connects many machines together. Best example is Ethernet (one wire). Alternative is radio broadcast.

Network Performance Parameters

    Networks are usually characterized in terms of two performance parameters:

    Latency: the minimum time to get the minimum amount of information between two sites.
        Note the difference between transmission latency, which is the time for a given bit to get from one end to the other after the connection is set up, and set-up latency, which is the time to get the first bit there.

    Bandwidth: once information is flowing, how many bits per second can be transmitted (i.e. the marginal cost per bit).

    Note also cost.

Protocols: These are the key to networks. A protocol is just an agreement between the parties on the network about how information will be transmitted between them and what the information format is. There are many different protocols to do different things (e.g. mail, file transfer, remote login). Typically, protocols are built up in layers. Section 15.6 of the book lists the 7 ISO protocol layers.
    Hierarchical protocols - relate layers in a given system. Can be network system, operating system, etc.

    Peer to Peer Protocols - relate the same layer of different systems. Rely on lower layers to actually communicate across machines.

ISO Protocols

    1. Lowest protocol layer: physical layer. Determines the electrical mechanisms for transmitting bits: voltages, delays, currents, etc.

    2. Data Link protocol layer: how to get packets between two directly-connected components. Includes error detection and recovery from physical layer errors. (Frames?)

    3. Network layer - Responsible for providing connections (to nodes that are not directly connected) and routing packets. Takes care of addresses. (Takes care of route changes due to changing loads.)

    4. Transport layer - low level access to network. Breaks messages into packets, keeps packets in order, flow control, physical address generation. (Takes care of retransmission of lost or destroyed packets.)

    5. Session layer - process to process protocols.

    6. Presentation layer - resolves differences between sites in formats (e.g. character types, number representation, full/half duplex, etc.)

    7. Application layer - interacts with users. Supports electronic mail, distributed data bases, etc.

Wide area networks are usually built up of Local Area Networks (LANs), which are interconnected. Local area networks are usually some type of broadcast network.

Broadcast networks: single shared communication medium, no central controller to allocate access to it.

    Simplest scheme is the Aloha mechanism: just broadcast blindly, use recovery protocols if packet doesn't get through.

    This system has stability problems: can't get more than 18% utilization of the channel (1/(2e)), and the system completely falls apart under heavy loads.

    Aloha system uses satellite - no choice. Can't listen - delays too long (1/4 second).

    Can be improved with "slotted Aloha" - messages occupy fixed time slots. Maximum utilization becomes 1/e (about 37%), which doubles the usable bandwidth.
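The 1/(2e) and 1/e limits quoted above can be checked numerically. The sketch below is not from the lecture; it assumes the usual textbook model in which the total offered load G (attempts per packet time, including retransmissions) is Poisson, giving throughput S = G*exp(-2G) for pure Aloha and S = G*exp(-G) for slotted Aloha. The function names are illustrative only.

```python
import math
from typing import Callable


def pure_aloha_throughput(g: float) -> float:
    """Throughput S for offered load g, assuming Poisson arrivals.

    A packet survives only if no other packet starts within a window
    of two packet times, hence the factor exp(-2g).
    """
    return g * math.exp(-2.0 * g)


def slotted_aloha_throughput(g: float) -> float:
    """Throughput S for offered load g with slotted transmissions.

    Slots shrink the vulnerable window to one packet time: exp(-g).
    """
    return g * math.exp(-g)


def peak_load(throughput: Callable[[float], float]) -> float:
    """Return the offered load in (0, 3] that maximizes throughput.

    Uses a simple grid search with step 0.001 (an arbitrary choice).
    """
    loads = [i / 1000 for i in range(1, 3001)]
    return max(loads, key=throughput)


if __name__ == "__main__":
    for name, fn in (("pure", pure_aloha_throughput),
                     ("slotted", slotted_aloha_throughput)):
        g = peak_load(fn)
        print(f"{name:8s} peak at G={g:.3f}, utilization={fn(g):.3f}")
```

Under these assumptions the peak for pure Aloha is at G = 0.5 and for slotted Aloha at G = 1. Past the peak, throughput falls as load rises, which is one way to see the instability under heavy load noted above.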
Ethernet (using a physical coax cable) adds two things:

    First is carrier sense: listen before broadcasting, defer until the channel is clear, then broadcast.

    Second is collision detection: also listen while broadcasting. A collision can still happen if two stations start up at nearly the same time (within one cable propagation delay of each other). If a collision is detected, jam the network so that everyone will know about the collision (don't waste time transmitting junk). Then wait a "random" interval and retry. If there are repeated collisions, wait longer and longer intervals.

    This is called CSMA/CD (carrier sense multiple access, with collision detection).

    Ethernet Frame: destination address (6 bytes), source address (6 bytes), type (2 bytes), data (46-1500 bytes), frame check sequence (4 bytes).

    Problems with Basic (original) Ethernet:

        Reliability: if any station jams the network, nobody can do anything, and you can't even figure out who's doing it.

        Fairness: there's no guarantee against starvation. People with real-time needs don't like this.

        Bandwidth is limited to that of the cable (10 Mbit/s).

        Original Ethernet is limited to a site which can be connected by cable (4000 feet). Longer connections with switch (I think?)

        Security - relatively easy to listen to all traffic, and/or tap the cable.

    More Recent Ethernet Designs:

        Use a switch to route, rather than a shared cable.

        Rates of 10 Mbit/s, 100 Mbit/s, 1 Gbit/s. 10 Gbit/s under development.

        Wireless Ethernet (802.11a,b,g).

        Continues to use most of the Ethernet protocol - frame format, timeouts, collision detect, software stuff.

Ring networks: these are a type of broadcast network. An additional protocol built on top of a ring-structured set of point-to-point links.

    Normally, an electronic token (special packet) circulates at high speed around the ring. If a station doesn't have anything to broadcast, it just retransmits everything it receives.

    When ready to broadcast, a station waits until the token passes by. Instead of retransmitting the token, it sends its packet.
    When the packet has been transmitted, put the token back on the ring for the next station to use.

    The packet loops all the way around and gets swallowed by the sender when it comes back again (the sender recognizes the packet as its own).

    Problems with ring system:

        If any station dies, the token can't circulate, so the ring dies.

        If the token is missing, the system dies.

        Starvation is possible.

        If a second token is created, the system can get messed up.

We can use Ethernet or a ring network for the local net. Need some way of constructing (structuring) a wide area net, i.e. building an "internetwork".

Three methods for a link between two machines:

    Circuit switching - like the telephone system - you have a circuit between the source and destination machines.

    Packet switching - communications are broken into packets and sent piece by piece. Note that packet switching can be used to build a virtual circuit - looks like a circuit, but is actually packets on a shared medium.

    Message switching - a virtual circuit exists for long enough to complete a message, and then the circuit is dropped. Or can use a physical link.

Getting stuff where you want it.

    Packets must be forwarded from machine to machine until they reach the destination. Machines that forward between networks are called gateways. Problem is how to get stuff where you want it.

    Names vs. addresses vs. routes:

        Name: a symbolic term for something: "Robert", or "ucbcory". Good for people to remember.

        Address: where the thing is: in an internetwork situation, usually consists of the number of the network, the number of the site on the network, the id of the host at the site, and sometimes a more specific host (e.g. a workstation). E.g. jones@chaos.netnode.berkeley.edu

        Route: directions for how to get there from here (a sequence of hosts and links to pass through to reach the destination).

    Sometimes the sender has to provide the route, e.g. in UUCP: hplabs!hp-pcd!hpcvc0!cliff. All each machine has to be able to do is remember its neighbors and forward messages.
    This is clumsy for users. It's better if the hosts of the internetwork can figure out the routing stuff for themselves. This involves a special protocol between the hosts to build routing tables. E.g. in the Internet, hosts send messages to nearest neighbors, and build up tables of the most direct paths from each host to each other host (fewest hops).

    Difficulty with routing tables is that they get to be very large.

    Note that routes can change dynamically. There can be more than one way to get from A to B.

    Note the problem of instability in routing if routes are changed for performance reasons.

    In LANs, only gateways have to worry about routing: all the other hosts just ship packets to the gateway unless the packet is for a host on the local net.

Communications Problems:

    Packets can get lost:

        Transmission errors. Address is corrupted, and packet circulates forever. Contents of packet are corrupted.

        A host has all its packet buffers full so it has no place to put another incoming packet.
            Can happen at an intermediate host if packets are arriving on a fast network but have to be forwarded onto a much slower network. This is called network congestion.
            Can happen at the destination if the user process can't work fast enough to process all the packets as they arrive.

        Receiver is down, and sender sends anyway.

    Packets can arrive out of order: if some hosts suddenly go down, or if routing tables change, packets might wander off into the network and come back much later. Most protocols include a time-to-live mechanism: after a certain time, packets are killed so that they don't wander endlessly.

Datagram protocols: used to deliver individual packets; the packets are not guaranteed to get through or to arrive in any particular order. This is useful for some applications, but not very many.

Most applications would like guarantees about delivery and order. This is called a connection, and the protocols to implement it are called virtual circuit or transport protocols.
To do this, the sender and receiver must remember state about what has been happening.

Simple acknowledgement-based protocol:

    Store a serial number in each packet. Sender assigns serial numbers, incrementing for each packet.

    Sender sends one or more packets.

    Receiver sends an acknowledgement (ack) packet for each packet or group of packets.

    Sender waits for the acknowledgement before sending the next (group of) packets (must also save old packets!).

    If the sender doesn't receive an acknowledgement within a reasonable time, it assumes that the packet got lost and retransmits it.

    Retransmission could result in the receiver getting two packets with the same serial number: it checks serial numbers and throws away duplicates and out-of-order packets.

    Sender and receiver must negotiate about how far ahead the sender can send: otherwise the receiver might run out of buffer space and have to discard packets. This is called the flow control problem.

No matter what the virtual circuit mechanism, setting up the connection is complex and time-consuming. It's tricky to get two hosts to agree to communicate with each other and get their state initialized correctly.

TCP/IP

    TCP/IP is a collection of network protocols making up the Internet Protocol Suite.

    History:

        1969 - ARPAnet with 4 nodes (UCLA, SRI, UCSB, Utah).
        1972 - ARPAnet demo (50 hosts).
        mid-1970s - TCP developed, running on Unix (DEC PDP-11).
        early 80s - Berkeley Unix. Runs TCP.
        1983 - ARPAnet converts to TCP/IP. In use by Sun.

    ISO Levels:

    Level 3 - Network Layer:

        IP - Internet Protocol - provides host-to-host datagram delivery. Provides packet routing, and insulates higher levels from network-specific characteristics (e.g. packet size).
            Fields of the IP packet header include: version, header length, total length, ID (same for all fragments of a datagram), time-to-live, checksum, source address, destination address.
            IP address is 32 bits (the newer version, IPv6, uses 128 bits). Broken into four 8-bit segments. Addresses are allocated in blocks.
        ICMP - Internet Control Message Protocol - used by gateways and hosts to apprise other hosts of conditions related to their IP services (e.g. routing, congestion).

        ARP - Address Resolution Protocol - maps an IP address to an associated Ethernet address (32 bits -> 48 bits).

        RARP - Reverse ARP - maps an Ethernet address to an associated IP address.

    Level 4 - Transport Layer:

        TCP - Transmission Control Protocol: connection oriented, reliable, byte-stream protocol.
            TCP packet header includes: source port (identifies process or service in sender), destination port, sequence number (32 bits), acknowledgement number, control flags (SYN (connection request), ACK, RST (reset), FIN (end)), window (window size - number of bytes that will be accepted), checksum.
            Provides means to connect with a socket [IP address, port number].
            Takes care of timeouts, retransmissions, flow control.
            Some well known ports: 20, 21 (FTP), 23 (Telnet), 25 (SMTP).

        UDP - User Datagram Protocol - unacknowledged transaction-oriented protocol parallel to TCP.

    Levels 5-7 - Session, Presentation and Application Layers:

        SMTP - Simple Mail Transfer Protocol.

        DNS - Domain Name System - maps names to addresses. Top level is Network Information Center (NIC) computers.

        FTP - File Transfer Protocol.

        Telnet - provides virtual terminal services.

The mechanisms described above form the basis for tying together distributed systems. So far, though, they've only been used for loose coupling:

    Each machine is completely autonomous: separate accounting, separate file system, separate password file, etc.

    Can send mail between machines.

    Can transfer files between machines (but only with special commands).

    Can execute commands remotely.

    Can login remotely.

Loose coupling like this is OK for a network with only a few machines spread all over the country, but not for a high-performance LAN where every user has a private machine.

What would we like in a distributed computer system?
    Unified, transparent file system.

    Unified, transparent computation - from any terminal, you can run on any machine, transparently. (You actually shouldn't care which machine you're running on.)

    Load balancing, process migration, file migration.

Local area networks can more or less provide that now. Wide area networks cannot provide this transparently, due to performance problems. May be possible in future?

Distributed File Systems

    Remote files appear to be local (except for performance).

    Issues:

        Failures - what happens when the remote system crashes.

        Performance - remote is not the same as local. Can do some caching.

Sun's NFS (Network File System)

    NFS permits the mounting of a remote file system as if it were local. Therefore, by using mount commands, can set up a transparent distributed file system.

    Caches file blocks and descriptors at both clients and servers.

    Write-through caching. When a file is closed, all modified blocks are sent immediately to the server disk. "Close" doesn't return until all bytes are stored on disk.

    Consistency is weak. Polls periodically for changes to a file; may use the old version until polling. May have simultaneous conflicting updates.

    Server keeps no state about the client (except hints, for performance). Each read gives enough info to do the entire operation (i.e. Read(I#, position)).

    When the server crashes and restarts, it can start processing requests again immediately.

    All requests are "idempotent" - i.e. can be repeated with no ill effects. So if a message may have been lost, can resend (and possibly redo).

Cache Consistency

    Issue is that in a system with caching, there can be many copies of a given piece of data. If any of those copies is written, it becomes inconsistent with the other copies.

    Problem can occur in distributed systems (as well as CPU cache memories).

    Goal: distributed system (or shared file system) yields the same results as a unified, uniprocessor system.
    Caching can occur in disk controller, in SAN or NAS, in main memory disk cache, etc.

    Solutions:

        Only one cached copy at a time.
            Very inefficient, if file is read only or read mostly.

        Approaches: (a) Many read copies (b) One write copy (c) Update many write copies.

        Several solutions:

            i. "Open for write" causes all other cached copies to be deleted.

            ii. Write to a block deletes all other cached copies of block. ("write invalidate") ("write through" - update backing store copy also.)
                Need to lock block, delete other copies and then update, to avoid crossing updates.

            iii. Write to block is broadcast to all other cached copies (and backing store?) ("write update")
                Need to lock all blocks before updating any.

        Optimistic approaches

            Don't do the locks - hope it works out.

            Don't do the locks, but keep old copy. Backout if necessary.

        Leave it to the users to worry about. (Unix, Database Systems.)

    Note: need to know where/how to find all of the copies.

        In distributed system, there is no way to do "snoopy" coherence. I.e. can't see all block and file requests.

        System "owning" file (usually the processor in the system where the file lives) must keep track of who has copies.

        If broadcast-writes is used, can give list of cached copies to any system writing the data.

    "False Sharing" - a block is shared, but the data in it isn't shared. I.e. process A is using subset X of the block, and process B is using subset Y of the block, but X and Y don't overlap.
        This problem can have a big effect on the performance of a cache consistency scheme.

    "Write Merge" - if there are several writes to a block, while it is in cache and before it is written back to disk, can "merge" the writes, so that only cumulative update is written back.

Process Migration

    Want to take running process and move it to another computer.

    Why is this hard?

        Need to save and transfer state.
        Need to maintain all connections (to OS, to network, to other systems, to I/O devices, to file system).
            Connections can be forwarded.
            Connections can be reconnected.

        Obviously doesn't work if CPU architecture is different. (Object code won't run.)

        Very hard if OS is different - they have to be interoperable - all the same system calls, etc.

Parallel Programming and Amdahl's Law

    N processors will never get you an N-times speedup.

    Part of the computation is not parallelizable. (At least the code that distributes and gathers the computation is not parallelizable.)

    Amdahl's Law - speedup is limited by sequential component.

    Parallel programming also requires communication and sharing, both of which impede performance.
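To make the limit concrete, here is a minimal sketch of Amdahl's Law in its usual form, speedup = 1 / (s + (1 - s)/N), where s is the fraction of the work that must run sequentially and N is the number of processors. The notes state the law only in words, so the formula is the standard one rather than something taken from the lecture, and the function name is illustrative. Communication and sharing overheads are ignored, so real speedups will be lower.

```python
def amdahl_speedup(sequential_fraction: float, processors: int) -> float:
    """Ideal speedup on `processors` CPUs when `sequential_fraction`
    of the work cannot be parallelized (Amdahl's Law).

    Ignores communication and sharing costs, so this is an upper bound.
    """
    if not 0.0 <= sequential_fraction <= 1.0:
        raise ValueError("sequential_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    parallel_fraction = 1.0 - sequential_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Assumed example: 10% of the work is sequential.
    for n in (1, 2, 10, 100, 1000):
        print(f"N={n:5d}  speedup={amdahl_speedup(0.1, n):.2f}")
```

With a 10% sequential component the speedup can never reach 10 (that is, 1/s), no matter how many processors are added.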