Lecture Notes - 4/25/2005


*************************************************************************
*			ANNOUNCEMENTS:					*
*   Professor Smith still has the midterms that weren't picked up.	*
*   Karl will present on Security using PowerPoint			*
*									*
*************************************************************************

Main Topics to be discussed today:

1. DES
2. Pretty Good Privacy  
3. Public key encryption  
4. Safe Mail 
5. Digital signatures
6. CLIPPER Chip   
7. Karl's Presentation
8. Introduction to Virtual Machine


 **************
       DES          
 **************
+   Federal data encryption standard (DES).  Can  be  implemented

         efficiently in hardware and appears to be relatively safe.

         +    A key consists of 64 binary digits of which 
         	
         		- 56 bits are randomly generated and used directly by the algorithm. 
         		- 8 bits are used for error detection.
         		
         +   Block cipher.  Encrypts one 64-bit block at a time, breaking it up into 4-bit pieces,

         	 and these 4-bit pieces interact with each other (substitution, permutation).

         	 If we place two blocks next to each other, we have 128 bits: a two-level DES encryption.

                  Hard but not impossible to break. Chips for breaking it are available on the market.

         +   NSA (National Security Agency), also known as "No Such Agency" (Prof's joke):

              the govt. doesn't want cheap and effective encryption, since it would no longer be able to read third world traffic.

         +   There are chips that encrypt/decrypt megabits per second.

         +   DES no longer considered safe enough by NSA.  The  latest

              standard  is  the  CLIPPER chip.  For practical purposes,
 
              DES more than adequate.

             +   Sufficient security is obtained by two  level encryption in pairs.

         +  Export or Import of Encryption chips like DES requires license.
         
         - DES is a kind of conventional encryption. Using conventional encryption alone as a means for
         
           transmitting secure data can be quite expensive simply due to  the difficulty of secure key distribution. 
           
           Thus, Public key encryption was introduced.
         
         
 *************************
       Pretty Good Privacy       
 *************************

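     +   PGP - pretty good privacy - public domain encryption  system. Combines conventional

         (symmetric) encryption of the message with public key encryption of the session key.

     	Encryption

     	   1. Compress the plaintext.

     	   2. Generate a one-time-only session key randomly from the random movement of the mouse and the keystrokes.

     	   3. Encrypt the compressed plaintext with the session key, using a conventional cipher.

     	   4. Encrypt the session key with the recipient's public key.

     	   5. Transmit the ciphertext along with the public-key-encrypted session key to the receiver.

       Decryption

         1. Receiver uses their private key to recover the temporary session key.

         2. The session key is used to decrypt the ciphertext.

         3. The plaintext is decompressed.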
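
The PGP flow above can be sketched roughly as follows. This is a toy illustration, not PGP's actual format or ciphers: the SHA-256 keystream stands in for the conventional cipher, the RSA key pair uses tiny made-up numbers, and the names `keystream_xor`, `pgp_encrypt` and `pgp_decrypt` are hypothetical.

```python
import hashlib
import secrets
import zlib

# Toy RSA key pair for the recipient (assumption: tiny numbers, illustration only).
N, E, D = 3233, 17, 2753          # N = 61 * 53; (N, E) is public, D is private

def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy conventional cipher: XOR data with SHA-256(key || offset) blocks."""
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out.extend(a ^ b for a, b in zip(data[i:i + 32], block))
    return bytes(out)

def pgp_encrypt(plaintext: bytes):
    compressed = zlib.compress(plaintext)                # compress the plaintext
    session_key = secrets.token_bytes(16)                # one-time session key
    ciphertext = keystream_xor(compressed, session_key)  # encrypt with session key
    # Encrypt the session key to the recipient's public key (byte by byte,
    # only because the toy modulus is so small).
    wrapped_key = [pow(b, E, N) for b in session_key]
    return ciphertext, wrapped_key                       # both are transmitted

def pgp_decrypt(ciphertext: bytes, wrapped_key):
    session_key = bytes(pow(c, D, N) for c in wrapped_key)  # private key recovers it
    return zlib.decompress(keystream_xor(ciphertext, session_key))

ciphertext, wrapped_key = pgp_encrypt(b"meet me after lecture")
assert pgp_decrypt(ciphertext, wrapped_key) == b"meet me after lecture"
```

Only the short session key goes through the (slow) public key operation; the bulk of the message is handled by the fast conventional cipher.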


 **************************
        Public key encryption        
 **************************
     +  new  mechanism  for  encryption  where

         knowing  the  encryption key doesn't help you to find the

         decryption key, or vice versa.

         +   Private Key and Public Key are inverses of each other.

         +   The two keys are not derivable from each other.
         
         +   Private key is the ONLY inverse of the public key, so senders' identity can be verified.

         +   Each user keeps one key  secret,  publicizes  the  other.

             Can't  derive  private  key from public key.  Public keys

             are made available to everyone, in a phone book for example.


     +   Specific scheme for public  key  encryption  (pages  471-472, chap 14, of Silberschatz and Galvin):

         Encode(m) : E(m) = (m^e) mod n = C

             where "e" is the public encryption exponent; "n" is the public encryption modulus, n > 0, 0 <= e <= n-1

         Decode(C) : D(C) = (C^d) mod n = m

             where "d" is the private decryption exponent, 0 <= d <= n-1

    
         +   Must derive e, d, and n such that the above decode is the

             inverse of encode.

             +   Let n=p*q (p, q large primes).

             +   d is a large integer relatively  prime  to  (p-1)*(q-1)

                 (i.e. GCD[d, (p-1)*(q-1)] == 1)

             +   e is chosen such that (e*d) mod ((p-1)*(q-1)) == 1

             +   ICBS (It Can Be Shown) that this makes E and D inverses. Proof requires Number Theory.

         +   This is safe because although n and e are known, p & q  are  not

             known,  and  so d cannot be derived.  (Factoring is believed

             to be hard.)
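
A worked example of the scheme above, as a minimal sketch with toy numbers. The primes p = 61 and q = 53 and the exponent d = 2753 are assumptions picked for illustration (real keys use primes hundreds of digits long), and `encode` / `decode` are names made up here.

```python
from math import gcd

# Toy parameters, chosen only for illustration (far too small to be secure).
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)      # (p-1)*(q-1), kept secret

d = 2753                     # private exponent, relatively prime to phi
assert gcd(d, phi) == 1

e = pow(d, -1, phi)          # public exponent with (e*d) mod phi == 1 (Python 3.8+)
assert (e * d) % phi == 1

def encode(m: int) -> int:
    """E(m) = (m^e) mod n"""
    return pow(m, e, n)

def decode(c: int) -> int:
    """D(C) = (C^d) mod n"""
    return pow(c, d, n)

message = 65                 # any integer with 0 <= message < n
ciphertext = encode(message)
assert decode(ciphertext) == message
```

Recovering d from the public pair (e, n) requires phi, which in turn requires factoring n into p and q.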

   
 ******************
 Safe Mail          
 ******************

         +   Every sender uses the same public key published by a destination user (receiver) to encrypt mail.

         +   Anybody can encrypt mail for this  user  and  be  certain

             that only the user will be able to decipher it, since nobody but the user keeps the private key.

         It's a nice scheme because the user only has to remember  one

         key, and all senders can use the same key.  However, the user  doesn't
         
         know for sure who he's getting mail from if every sender uses the same public key.
         
                          __________________________      Internet    _________User_______________         
   Sender A Mail  -----> |  Public key encryption   |  ------------> |   Private key decryption   |  --------> Mail 
   Sender B Mail  -----> |__________________________|                |___________________________ |
                                         	                                                                         

 ******************
 Digital signatures 
 ******************

     +   can also use public keys to certify identity:

         +   To certify your identity, use your private key to encrypt

             a  text  message, e.g. ``I agree to pay Mary Wallace $100

             per year for the duration of life.''

         +   You can give the encrypted message to anybody,  and  they

             can  certify  that  it  came from you by seeing if it de-

             crypts with your public key.  Anything that decrypts into

             readable  text  with  your public key must have come from

             you!  This can be made legally binding as a form of elec-

             tronic signature.

         +   Note that encrypting with only your private key lets

             anyone read the message, since your public key decrypts it.

             +   If you encrypt with your private key, and then with

                 someone else's public key, it can only be read by the

                 intended recipient.
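
The signing idea can be sketched with a toy RSA key pair. This is only an illustration under stated assumptions: the key numbers are tiny and made up, each byte is "encrypted" separately, and `sign` / `recover` are hypothetical names. Real systems sign a hash of the message rather than the message itself.

```python
from typing import List, Optional

# Toy RSA key pair for the signer (assumption: tiny numbers, illustration only).
N, E, D = 3233, 17, 2753      # N = 61 * 53; E is public, D is private

def sign(message: bytes) -> List[int]:
    """'Encrypt' each byte with the private key D."""
    return [pow(b, D, N) for b in message]

def recover(signed: List[int]) -> Optional[bytes]:
    """Decrypt with the public key E; None if the result is not byte-sized."""
    values = [pow(s, E, N) for s in signed]
    if any(v > 255 for v in values):
        return None
    return bytes(values)

contract = b"I agree to pay Mary Wallace $100 per year"
signed = sign(contract)

# Anyone holding the public key can read the text and see that it decrypts cleanly.
assert recover(signed) == contract
```

Something produced without D (for example, with a guessed exponent) does not decrypt back to the contract under the public key.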


 		             _____________________  Internet _____Receiver____________            
       Mail(Sender A)----> |Private key encryption | ----> |   Public  key decryption|--------> Mail by Sender A
                           |key kept by sender A   |   - > |key published by sender A|-----
                           |_______________________|  |	   |_________________________|	  |
                         			      |					  |			
                                               	      |                                   |                                 
                           ________________________   |                                   |
       Mail(Sender B)---> |Private key encryption |---                                    |                                 
                          | key kept by sender B  |                                       |__> non-readable text
                          |_______________________|                                                    


     +   One public key method believed to work:  Publish a large com-

         posite  number  (public  key).  Private key is factors of the

         number.  Factors hard to obtain.


     +   Encryption appears to be a great way to thwart listeners.  It

         doesn't help with Trojan Horses, though.


     +   One Way Encryption - use to encrypt password file. Don't have

         to be able to decrypt it - just compare encryption of submitted

         password with stored one.  Can't deduce what needs to  be

         submitted.   (I.e. encryption algorithm should not be invertible.)


     +   General problem:  how do we know that an encryption mechanism

         is  safe?   It's extremely hard to prove.  Example: a scheme based on

         the knapsack problem was broken after being widely accepted.

         This is a hot topic for research: theorists are trying to find

         provably hard problems, and use them for proving safety of encryption.


 ******************
   CLIPPER Chip     
 ******************


     +   Replacement for DES,  developed by NSA using Skipjack algorithm(SECRET)

     +   Chip Contains:

         +   64-bit block encryption (algorithm classified)

         +   Uses 80 bit keys.

         +   Skipjack algorithm can be more secure than DES, since it uses 80-bit keys and scrambles

             the data for 32 rounds; by contrast, DES uses 56-bit keys and scrambles the data for only 16 rounds.
                
         +   Uses the following numbers:

             +   F - 80-bit key used by all Clipper chips

             +   N - 30-bit serial number (per chip)

             +   U - 80-bit secret decryption key for this chip only.


         +   Secure conversation occurs this way:

             +   Session key K is negotiated (somehow).

             +   E(M;K) is encrypted message stream.

             +   E(E(K;U), N; F) is a "law enforcement  block".   With

                 F,  we  can  get  E(K;U),N.    From  N,  (the  serial

                 number), we can get U (held by federal agencies), and  then can get K.  From K, we can decrypt messages.


         +   Key U is xor of U1 and U2.  U1 and U2 held  by  different

             federal agencies.  Can get both U1 and U2 only with court

             ordered wiretap.
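
The U = U1 xor U2 split can be sketched as below. This is a minimal illustration of XOR key splitting in general, not the actual Clipper escrow procedure; `split_key` and `combine` are hypothetical names.

```python
import secrets

def split_key(u: bytes):
    """Split u into two shares; either share alone looks like random bytes."""
    u1 = secrets.token_bytes(len(u))              # share for agency 1 (random)
    u2 = bytes(a ^ b for a, b in zip(u, u1))      # share for agency 2
    return u1, u2

def combine(u1: bytes, u2: bytes) -> bytes:
    """Recover u = u1 xor u2; needs both shares."""
    return bytes(a ^ b for a, b in zip(u1, u2))

u = secrets.token_bytes(10)                       # 80-bit unit key
u1, u2 = split_key(u)
assert combine(u1, u2) == u
```

Because u1 is uniformly random, u2 on its own is also uniformly random, so neither agency learns anything about U without the other.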


	----------------------------------------------------------------------------------------------------------	
						Karl's Presentation on Security
								
								
 ******************
  Hash Functions    
 ******************

+ Store hashed password instead of password, so that we can verify a password, but not retrieve it.

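	+Dictionary attack:  pre-compute the hash of every dictionary word, then look up a stolen hash to recover the word.
	    				eg. 08b54e... ---> nachos

	   				Solution (salt): prepend a random prefix xxxxx to each password before hashing, and

	   				store the prefix along with the hash of "xxxxxnachos". Identical passwords then hash differently, and

	   				with n possible prefixes the required reverse dictionary is n times bigger.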

+md5("nachos") = 08b54e0e6795d86536b8a082b2e1c30f
	
	+No "reverse-md5" to get "nachos" from 08b54e...
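
A minimal sketch of storing salted password hashes. Assumptions: SHA-256 from `hashlib` stands in for MD5, the salt is 8 random bytes, and `hash_password` / `verify` are names made up for this example.

```python
import hashlib
import secrets
from typing import Optional, Tuple

def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, str]:
    """Return (salt, hex digest of salt + password); pick a fresh salt if none given."""
    if salt is None:
        salt = secrets.token_bytes(8)             # random prefix, stored in the clear
    digest = hashlib.sha256(salt + password.encode()).hexdigest()
    return salt, digest

def verify(password: str, salt: bytes, stored_digest: str) -> bool:
    """Re-hash the submitted password with the stored salt and compare."""
    return hash_password(password, salt)[1] == stored_digest

salt, stored = hash_password("nachos")
assert verify("nachos", salt, stored)
assert not verify("salsa", salt, stored)
```

The password file holds only (salt, digest) pairs, so a submitted password can be checked but the original cannot be read back.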
	
	
 ******************
       Authenticity       
 ******************

+ Don't want message to be

		1. altered or modified by attacker
		2. re-ordered by attacker
		3. replayed by attacker


	Solution to ordering & replay:
	- Sequence number
	- "I'm finished" message
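
A minimal sketch of how a receiver might use sequence numbers and a final message. It assumes each message has already been checked for tampering (e.g. by a signature or keyed hash); the `Receiver` class and the `"FIN"` marker are hypothetical.

```python
class Receiver:
    """Accepts messages only in order, once each, until the final message."""

    def __init__(self):
        self.expected = 0        # next sequence number we will accept
        self.finished = False    # set once the "I'm finished" message arrives

    def accept(self, seq: int, payload: str) -> bool:
        if self.finished or seq != self.expected:
            return False         # replayed, re-ordered, or sent after the end
        self.expected += 1
        if payload == "FIN":     # stands in for the "I'm finished" message
            self.finished = True
        return True

r = Receiver()
assert r.accept(0, "hello")
assert not r.accept(0, "hello")   # replay of message 0 is rejected
assert not r.accept(2, "later")   # out-of-order message is rejected
```

The final message lets the receiver detect truncation: if the stream ends without it, something was dropped.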
 
 ***********************************
  Symmetric vs Asymmetric  Encryption   
 ***********************************
 +Symmetric encryption (a.k.a. private-key encryption)
     - Encryption and decryption share the same key
     - Therefore encryption key must be secret

     Examples of ciphers:
      - Caesar cipher
      - Transposition cipher
      - DES, 3DES
      - AES, Blowfish, CAST, ARCFOUR

 +Asymmetric encryption (a.k.a. public-key encryption)
     - Encryption and decryption use different keys
     - e.g. private key is two large primes, public key is their product
     - Therefore encryption key can be public

     Examples of algorithms:
      - RSA by Ron Rivest, Adi Shamir, and Leonard Adleman
      - DSA (Digital Signature Algorithm)

     RSA depends on the difficulty of factoring large numbers; DSA depends on the difficulty of discrete logarithms.

 Implementations:
  - PGP, GPG (email & general-purpose)
  - SSL (Secure Sockets Layer) (https, imaps)
  - SSH


 ******************
     Certificates          
 ******************

Assume Bob knows Charlie and Charlie knows Alice, but Bob doesn't know Alice.

How does Bob learn Alice's public key?
- e.g. millions of websites exist. Does your browser know ALL of their public keys?

Charlie signs a message saying "Alice's key is 89fc76eb" and sends it to Bob.

Bob already has Charlie's public key, so he can check the signature and trust Alice's public key.  => Key Ring

Example.  Bob = your browser ; Alice = eBay.com; Charlie = Verisign.com
		
		-  Browser just knows Verisign's public key
		-  eBay gives you a message saying "ebay.com's public 
		   key is 89a13ef" signed by Verisign
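
The browser / eBay / Verisign example can be sketched with a toy signing key. Assumptions: the CA key is built from two known Mersenne primes purely for illustration, the statement is hashed with SHA-256 before signing (a common approach, not something stated in the notes), and `ca_sign` / `browser_verify` are hypothetical names.

```python
import hashlib

# Toy key pair for the certificate authority ("Charlie" / Verisign).
P, Q = 2**61 - 1, 2**89 - 1                  # two Mersenne primes (toy sized)
CA_N = P * Q                                 # public modulus
CA_E = 65537                                 # public exponent
CA_D = pow(CA_E, -1, (P - 1) * (Q - 1))      # private exponent, CA only (Python 3.8+)

def digest(statement: str) -> int:
    """Hash the statement and reduce it below the modulus."""
    h = hashlib.sha256(statement.encode()).digest()
    return int.from_bytes(h, "big") % CA_N

def ca_sign(statement: str) -> int:
    """Done by the CA, using its private key."""
    return pow(digest(statement), CA_D, CA_N)

def browser_verify(statement: str, signature: int) -> bool:
    """Done by the browser, which only knows the CA's public key."""
    return pow(signature, CA_E, CA_N) == digest(statement)

statement = "ebay.com's public key is 89a13ef"
signature = ca_sign(statement)               # eBay presents (statement, signature)
assert browser_verify(statement, signature)
```

The browser needs only one built-in public key (the CA's) to check statements about any number of sites.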


 ******************
   Attacks on TCP     
 ******************


+ Source spoofing


When we receive a SYN, we send a SYN/ACK to the source address specified in the SYN packet.

But how do we know that the SYN really came from that address?

We require the source to acknowledge our SYN/ACK before the connection is established.

Each SYN/ACK contains a randomly generated initial sequence number (ISN), and the reply must acknowledge ISN+1.

An attacker who spoofed the source address never sees the SYN/ACK, so the attacker cannot predict the ISN.


+ SYN flood

When we receive a SYN, we create state (in Nachos, add to the "pending connections" list).

In TCP, there is a limit to the state size (max 6 pending connections), and

unacknowledged SYN/ACKs expire after a minute. So, if an attacker sends 6 SYNs/min, they can prevent anyone

else from connecting (Denial of Service).

Sol. SYN cookies - Don't create state (add to "pending connections") until we receive the packet after the SYN/ACK.
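
A minimal sketch of the SYN cookie idea, not the real TCP encoding (actual implementations also fold a timestamp and the MSS into the sequence number). The server secret, the HMAC construction, and the function names are assumptions for illustration.

```python
import hashlib
import hmac
import secrets

SECRET = secrets.token_bytes(16)     # known only to the server

def make_cookie(src_ip: str, src_port: int, client_isn: int) -> int:
    """Derive our ISN from the SYN itself, so nothing has to be stored."""
    msg = f"{src_ip}:{src_port}:{client_isn}".encode()
    mac = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return int.from_bytes(mac[:4], "big")          # fits a 32-bit sequence number

def check_ack(src_ip: str, src_port: int, client_isn: int, ack_number: int) -> bool:
    """On the packet after SYN/ACK, recompute the cookie; only then create state."""
    expected = (make_cookie(src_ip, src_port, client_isn) + 1) % 2**32
    return ack_number == expected

cookie = make_cookie("10.0.0.1", 4321, 1000)       # sent as the ISN in our SYN/ACK
assert check_ack("10.0.0.1", 4321, 1000, (cookie + 1) % 2**32)
```

A flood of spoofed SYNs costs the server one hash each and no table entries, since the "pending connection" lives in the cookie rather than in memory.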


 ************
   Worm        
 ************

A worm, unlike a virus, doesn't require human interaction.

A worm can infect the entire Internet in an hour!
- Logistic curve: infects until no one else to infect
- Most of these worms written by amateurs
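
The logistic curve can be illustrated with a small simulation. The population size, starting count, and growth rate below are made-up values, and `simulate` is a hypothetical helper; this is a sketch of the shape of the curve, not a model of any real worm.

```python
def simulate(population: int, initial: int, rate: float, steps: int):
    """Discrete logistic growth: new infections need both infected and uninfected hosts."""
    infected = float(initial)
    history = [infected]
    for _ in range(steps):
        # Growth is proportional to infected hosts times the fraction still uninfected.
        infected += rate * infected * (1 - infected / population)
        history.append(infected)
    return history

# Hypothetical numbers: 1,000,000 vulnerable hosts, 1 initial infection.
history = simulate(population=1_000_000, initial=1, rate=0.5, steps=100)
```

Early on the count grows roughly exponentially (about 1.5x per step here); it then flattens out as the worm runs out of hosts to infect.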

Traditional defenses use signature-based scanners
- Require a human to analyze the worm and create the signature

New approaches:
- automatic signature generation
- machine learning


-----------------------------Karl's Research-----------------------------

 ******************************
   Format-String Vulnerabilities       
 ******************************

In C, we should write printf("%s", string)

Common error: printf(string)
- Security hole, because string can contain %s, %n, etc.

We can now analyze source code to find such bugs or verify there are none.

Do this on all of Debian Linux (8000 packages)

Research project w/ David Wagner


 ******************************
 Machine learning on novel worms 
 ******************************

Mine data on emails => features
- Number of To: addresses, type of attachments, etc.

Based on features, classify an email as viral or non-viral
- parametric classifier (Naive Bayes)
- novelty detector (Support Vector Machine)

Automatically detect previously unseen worms, block sender

Research w/ Steve Martin, Anil Sewani, Blaine Nelson, Anthony Joseph


Recommended 
+CS161 (CS194 next semester) , already full as of 4/25
- Wagner, Joseph, Tygar
- www.CS161.org

+Cuckoo's Egg by Cliff Stoll


 ****************
  Virtual Machine 
 ****************
 
 A self-contained operating environment that behaves as if it is a separate computer.
 
 					 _______________________
	                                |       Bare Machine    |
					| ______________________|
					|____  Privileged_______|

 				           Software Nucleus
 				                ^      ^
				                |      |
					        |      |
						|      |
						|      |
						|      |
    ___________________				|      |			_______________________
   |                  |                         |      |			|		      |	
   | User Program,U1  |-------------------------       -------------------------|   User Program, U2  |
   |__________________|								|_____________________|


End Of Lecture