Lecture Notes for April 20, 2005
***************************************************************
ANNOUNCEMENTS:
* Professor Smith still has the midterms that weren't picked up.
* Karl will give part of the lecture on security next week.
* HKN passed out the evaluation forms for CS162.
****************************************************************

COUNTER MEASURES FOR SECURITY PROBLEMS:
+ There is no perfect solution to the problem of security. Here are a few things you can do to reduce vulnerability.

LOGGING:
+ All commands get logged: what happened, who the users were, and which account was used. All important actions and uses of privilege are recorded in an indelible file.
+ Keep these logs in a safe place, such as a hardcopy from a printer, or a machine on the network that ordinary users cannot access. This type of logging can be used to catch impostors during their initial attempts and failures. For example, every attempt to log in with an incorrect login name or password can be recorded.

AUDIT TRAIL:
+ An audit trail is basically the same thing as logging. It is a chronological record of system resource usage. It includes user logins, file accesses, various other activities, and whether any actual or attempted security violations occurred. Both legitimate and unauthorized activity are recorded.
+ Just like a log, an audit trail stored on the machine can be edited or removed by the super user. Therefore, keep a hardcopy such as a printout, or write it to write-once media.
+ At UC Berkeley, the audit trail is stored on a file server that is separate from the user machines. If the system has been broken into, the audit trail itself may be vulnerable.
+ Even better is to get humans involved at key steps. For example, two people are required to open a lock or safe. This is what is done for Electronic Fund Transfers.

PRINCIPLE OF MINIMUM PRIVILEGE:
Also known as the "need-to-know" principle, the principle of minimum privilege states that users should be given only the access they need to perform the necessary task.
Give them that access only for the minimum amount of time. This helps a lot in minimizing accidental or intentional errors. For example, the file system should not be able to access the memory map, and the memory manager should not be able to access the disk allocation tables.

Capabilities are an implementation of this idea.
+ Capabilities indicate which objects may be accessed, and in what ways, by each user. A list of (object, access rights) pairs is stored with each user, called a capability list.
+ The user process typically cannot access its capability list directly. The OS manages the list, which makes it difficult for a process to forge a capability.
+ In access-list systems, the default is usually for everyone to be able to access an object. In capability-based systems, the default is for no one to be able to access an object unless they have been given a capability for it. There is no way of even naming an object without a capability.
+ Capabilities have sometimes been used in systems that need to be very secure. However, capabilities can make it difficult to share information: nobody can get access to your stuff unless you explicitly give it to them.
+ Capabilities are difficult to revoke.
+ Example of a simple capability-based protection scheme: file descriptors.

CONFINEMENT PROBLEM:
This is the problem of assuring that a borrowed program does not steal, for its author, the information that it processes for a borrower. One approach is to prove that the operating system enforces confinement by preventing borrowed programs from writing information into storage in violation of a formally stated security policy. It is very hard to provide fool-proof information containment. For example, a Trojan horse could write characters to a tty, or take page faults, in Morse code, as a signal to another process.

CORRECTNESS PROOF:
A correctness proof is a mathematical proof of consistency between a specification and its implementation. These proofs are very hard to do. Even so, a proof only shows that the system works according to its specification.
It doesn't mean that the specification is necessarily right, and it doesn't deal with Trojan horses. Proofs work only for small algorithms, so they are not very useful in practice.

CALLBACK USED TO AVOID ABUSE OF ACCOUNT:
The basic idea is that you dial in to the machine and tell it who you are, and the machine then calls you back. If you are not at the number it calls back, it can be determined that you are not the right person. IBM used to use this feature: when an employee called in, the computer would disconnect and call back. You could only log in from a given home number. This requires an additional network extension, though.

CONSISTENCY OR PLAUSIBILITY CHECK:
Application systems, for example those of credit card companies, do consistency or plausibility checks. These systems look for suspicious activity. If anything unusual is seen, the company confirms the activity with the customer. For example:
+ The user suddenly spends $10,000 when his usual purchases are under $100.
+ The user lives in the United States, and there is account activity in Hong Kong.

INFERENCE CONTROLS:
+ You have a statistical database. For example, the US Census collects data about people, and researchers then run queries on this database, e.g., what is the average salary of people aged 25?
+ But you do not want to provide information about individuals. These databases are designed not to provide individual data.
+ You can still design queries in such a way that they give you information about individuals.
+ E.g., (a) average salary of all of X; (b) average salary of X-delta, where delta describes only one individual; (c) size of X. These three queries permit us to deduce delta's salary, since salary(delta) = size(X) * avg(X) - (size(X) - 1) * avg(X-delta). This is a problem: how do you prevent the system from giving away an individual's information while still allowing statistical information?
+ There is no good solution to this problem, but you can do a few things:
+ Randomize the data (slightly): if you start randomizing numbers, you won't be able to give answers with good precision.
+ Limit the queries: you can limit the class of queries that can be made, and offer only a set of predefined queries.
+ Allow only certain group aggregations.

THE CONFINEMENT PROBLEM:
You want to make sure that the time sharing system does not pass your data to an unauthorized user.
+ This is the problem of a mutually suspicious customer and service.
+ We want to make sure that the service can only access the data provided by the user, and that the service is protected from the user as well.
+ The idea is the concept of an information utility. The idea is currently resurfacing as server-based software.
+ TWO PROBLEMS:
+ A program might not behave exactly as it is intended to. It might steal information, for example by transmitting confidential data.
+ LIST OF POSSIBLE LEAKS: there are still ways to leak information.
+ If the program has access to memory, it can collect unauthorized data as well.
+ It can write that data to a file or to a remote server.
+ The program can even write the data to a temporary file, which can later be read by a spy program.
+ The service can signal another process controlled by its owner. If the file system has interlocks, the service can lock and unlock a file, and the spy program can watch whether the file is locked. The sequence of locked and unlocked states carries the information.
+ The service can encode information in paging rates. It can intentionally vary its paging rate to signal the spy program: a high rate could mean 1 and a low rate could mean 0.

VIRUSES:
+ Most virus problems appear on PCs, which mostly run Windows. In Windows, everything is executable: the first thing a program does with data is try to execute it. PCs transfer executable files and code, which programs then execute. For example, a lot of machines get infected by email viruses. In Unix, code is code and data is data. Nothing gets executed unless you execute it explicitly.
+ Once the virus is in your machine, it replicates itself into different parts of the machine and does unpleasant things to it.
+ The general idea behind searching for viruses is that you look for their object code.
+ Anti-virus programs have a list of the object code of commonly known viruses.
+ When you check a file for viruses, the software compares the binary image of the file's segments with the known images of viruses.
+ Some viruses are smart and encrypt themselves, so the anti-virus program cannot compare them directly.
+ The solution is to decrypt the code before you try to compare it.
+ Some viruses even change their decryption code.
+ The solution then is to execute the suspected virus code in isolation for a small amount of time, and see whether the code decrypts itself into something that is recognized as a known virus.
+ There is no good defense against viruses. If a virus has a completely new code pattern that the anti-virus software doesn't know about, then the software won't be able to catch it.

****************************************************************************************
ENCRYPTION
+ Books on encryption recommended by Prof. Smith: "The Codebreakers" by David Kahn, which gives the history of cryptography all the way back to the old days, and Whitfield Diffie and Martin Hellman, "Privacy and Authentication: An Introduction to Cryptography", Proc. IEEE, 67, 3, March 1979, pp. 397-427.
+ Encryption is a popular approach to security in computer systems.
+ Definition: You start with a clear text. You encrypt it with a key, and it becomes cipher text. The idea is that other people shouldn't be able to read this cipher text. The text gets to its destination. The intended user decrypts it with the decryption key. It becomes clear text again and can be used.
+ You only send the data in encoded form. Cryptography makes the data useless to one's opponents. The idea of encryption is not new; it has been used since Roman times (the "Caesar Cipher").
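The encrypt/decrypt round trip in the definition above can be sketched with the Caesar cipher just mentioned. This is only an illustrative sketch: the function names (caesar, encrypt, decrypt), the restriction to lowercase letters, and the use of an integer shift as the "key" are assumptions made for the example, not something taken from the lecture.

```python
import string

ALPHABET = string.ascii_lowercase  # assumption: only lowercase letters are shifted


def caesar(text: str, shift: int) -> str:
    """Shift each lowercase letter by `shift` positions, wrapping around the alphabet."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)  # spaces and other characters pass through unchanged
    return "".join(out)


def encrypt(clear_text: str, key: int) -> str:
    """Clear text -> cipher text; the key is the shift amount."""
    return caesar(clear_text, key)


def decrypt(cipher_text: str, key: int) -> str:
    """Cipher text -> clear text; undo the shift with the same key."""
    return caesar(cipher_text, -key)


cipher_text = encrypt("attack at dawn", 3)
recovered = decrypt(cipher_text, 3)
print(cipher_text)
print(recovered)
```

Note that the same key is used in both directions, so leaking it exposes every message encrypted with it.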
THE BASIC MECHANISM:
+ Start with the initially readable text, called clear text. Encode it to make it cipher text.
+ A listener can see the cipher text, but it won't make any sense to him.
+ The encryption is controlled by the secret key.
+ Decode the cipher text with the key to get the clear text back.
+ The encrypted text can be stored in a readable file, or transmitted over unprotected channels.

DIAGRAM:

               Key1                          Key2
                |                             |
                V                             V
 clear text -> encrypt -> cipher text -> decrypt -> clear text
                               |
                               V
                            listener

ALL ENCRYPTION WORKS UNDER THREE CONDITIONS:
+ The encryption function shouldn't be easily invertible. You should not be able to transform cipher text into clear text without the decryption key.
+ The encryption and decryption must be done in some safe place, so the clear text can't be stolen.
+ The keys must be protected. In most systems, the encryption and decryption keys are the same, so you cannot afford to leak either of them.

TYPES OF CRYPTOGRAPHIC SYSTEMS:
+ Substitution: There is a function f(x) which maps each letter x of the plaintext (or group of letters) into f(x). f(x) must be 1-1 or one-to-many. If f(x) = x+1 (wrapping around at the end of the alphabet), then it is called a Caesar Cipher.
+ Example: the quick brown fox jumps over the lazy dog >>> uif rvjdl cspxo gpy kvnqt pwfs uif mbaz eph
+ This type of encryption can be broken by using tables of frequencies of letters, doubles, triples, etc.
+ You use the frequency table to spot very common letters such as e. This works for a one-to-one map.
+ If f(x) is one-to-many, then the frequency tables get messy and are not helpful.
+ Transposition: Permute (or transpose) the input in blocks to obtain the output.
|---|---|---|---|---|
| T | H | E |   | Q |
|---|---|---|---|---|
| U | I | C | K |   |
|---|---|---|---|---|
| B | R | O | W | N |
|---|---|---|---|---|
| F | O | X |   |   |
|---|---|---|---|---|
The clear text is read horizontally (THE QUICK BROWN FOX), and the text read down the columns can be treated as the cipher text (TUBF HIRO ECOX KW Q N).
+ You can break it by trying various common permutations.
+ Look for permutations that rejoin commonly used letter pairs, such as "th".

POLYALPHABETIC CIPHERS:
+ A monoalphabetic cipher is not very secure, because frequency analysis makes it possible to break it. A better solution is a polyalphabetic cipher.
+ A polyalphabetic cipher is a substitution cipher where f(i,x) is also a function of i, the sequence number of the letter in the text. It is typically periodic in i. You can get long periods by using two functions with relatively prime periods.
+ Polyalphabetic ciphers can be broken in two steps. If we can find the length of the block (the length of the key), we can apply frequency analysis to the letters that are one block length apart.
+ We look for repeated letter sequences and count the number of letters between them.
+ The greatest common divisor of the distances between the repeated strings is the likely period.
+ Alternatively, we can look at the frequencies of letters K apart, until they look ok; then K is the period of the cipher. Then solve each of the K ciphers separately, using frequency methods.
+ Old fashioned coding machines (e.g. Hagelin machines) worked as polyalphabetic ciphers. They had rotating wheels with relatively prime numbers of cogs. The code was the product of the path through the wheels.

RUNNING KEY CIPHER:
+ The running key cipher is a type of polyalphabetic substitution cipher in which a text, typically from a book, is used to provide a very long key stream. Usually, the book to be used is agreed on ahead of time, while the passage to use is chosen randomly for each message and secretly indicated somewhere in the message.
+ Solve: use a probable word; substitute it everywhere (i.e.
XOR it with the cipher text) and see if a recognizable word pops out. If so, work backward and forward by context. Or, use frequency methods, but the frequencies are now products of key and message frequencies, so this is quite hard.

CODES:
+ In codes you work with linguistic units of the input, so you have code books for words. You map words to output using the book. It is very hard to break a code without the code book. Codes can also encode phrases.
+ Typical approaches to breaking such encryption are: frequency counts, probable words, known plaintext.

OTHER APPROACHES TO BREAK CODES:
+ Traffic Analysis: Traffic analysis is the process of intercepting and examining messages in order to deduce information from patterns in communication. It can be performed even when the messages are encrypted and cannot be decrypted. In general, the greater the number of messages observed, or even intercepted and stored, the more can be inferred from the traffic.
+ Playback: A listener plays back previously recorded encrypted messages, which confuses the people who are exchanging messages.
+ Sometimes operator errors can help break the encryption. If the operator encrypts a message with the wrong key, he might then encrypt the same message again with a different key. Having two encrypted versions of the same message can help to decrypt it.
+ In most encryption systems, distributing the keys safely is the biggest problem.
+ In any encryption system, if the effort needed to break the encryption is greater than the actual value of the information, then the encryption is considered successful.
+ Example: for high school gossip, a simple monoalphabetic cipher is good enough.
+ The most secure way to encrypt is to use a random key as long as the message.
+ It is the hardest to maintain, though.
+ All the other systems can be broken given enough messages.
+ Error control is a major problem too. If bits get dropped or corrupted, it becomes very hard to recover the message.
+ PROBLEM: how do you distribute keys securely in the first place?
+ One method for key distribution:
+ Let KS be the key server. Key KX is used to communicate between user X and the key server KS. A and B are users. Let "**" mean encryption. Let ID be the message ID. Let KAB be the new key.
+ 1. A to KS: {A,(ID,B)**KA} [A asks the server for a key to communicate with B.]
+ 2. KS to A: {(ID,KAB,(KAB,A)**KB)**KA} [The server gives the key to A, with the unique ID, so that the message is identifiable.]
+ 3. A to B: {(KAB,A)**KB} [A sends the key to B, encrypted so that only B can read it.]
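The three messages above can be traced with a small sketch. Everything named here (seal, unseal, KeyServer, register, handle_request) is hypothetical and introduced only for illustration. In particular, seal/unseal stand in for "**" using a toy XOR keystream built from SHA-256; that toy cipher is an assumption of this sketch, is not secure, and is not the cipher used in the lecture.

```python
import hashlib
import json
import secrets


def _keystream(key: bytes, n: int) -> bytes:
    """Toy keystream: SHA-256 of key plus a counter (illustration only, not secure)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def seal(key: bytes, obj) -> str:
    """Stand-in for (obj)**key: JSON-encode, XOR with the keystream, hex-encode."""
    plain = json.dumps(obj).encode()
    return bytes(p ^ k for p, k in zip(plain, _keystream(key, len(plain)))).hex()


def unseal(key: bytes, blob: str):
    """Inverse of seal when the same key is used."""
    data = bytes.fromhex(blob)
    plain = bytes(c ^ k for c, k in zip(data, _keystream(key, len(data))))
    return json.loads(plain.decode())


class KeyServer:
    """Hypothetical key server KS; it shares one key KX with each user X."""

    def __init__(self):
        self.user_keys = {}

    def register(self, name: str) -> bytes:
        key = secrets.token_bytes(16)
        self.user_keys[name] = key
        return key

    def handle_request(self, sender: str, sealed_request: str) -> str:
        ka = self.user_keys[sender]
        msg_id, peer = unseal(ka, sealed_request)              # (ID,B)**KA
        kab = secrets.token_hex(16)                            # new key KAB
        ticket = seal(self.user_keys[peer], [kab, sender])     # (KAB,A)**KB
        return seal(ka, [msg_id, kab, ticket])                 # (ID,KAB,ticket)**KA


ks = KeyServer()
ka = ks.register("A")
kb = ks.register("B")

# Step 1. A to KS: {A,(ID,B)**KA}
msg_id = secrets.token_hex(4)
request = seal(ka, [msg_id, "B"])

# Step 2. KS to A: {(ID,KAB,(KAB,A)**KB)**KA}
reply = ks.handle_request("A", request)
reply_id, kab_at_a, ticket = unseal(ka, reply)

# Step 3. A to B: {(KAB,A)**KB}; A forwards the ticket without being able to read it
kab_at_b, claimed_sender = unseal(kb, ticket)

print(reply_id == msg_id, kab_at_a == kab_at_b, claimed_sender)
```

A checks that the ID in the reply matches the one it sent, which ties the reply to its own request. B learns both KAB and the name of the party it will share that key with, and A never needs to know KB.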