Lecture Notes for Monday, April 18
Varun Patel  cs162-co
Alex Kan     cs162-cp

Topic: Protection and Security (aka the "How to Steal a Trillion Dollars" lecture)

Announcements
=============

+ Exams back today (available as of 3:45)
    Q: What was the average grade on the midterm?
    A: Too high!  Specifically (out of 108 points):
       - Average: 73.79
       - Median: 74
       - Standard deviation: 11.1
       - Max: 100
       - Histogram:
            0-50     *
            51-55    ***
            56-60    ***
            61-65    ********
            66-70    ***********
            71-75    *****************
            76-80    **********
            81-85    **********
            86-90    **
            91-95    ***
            96-100   ***
            101-108
       - Grading: problem 1 by Adrian, problems 2-3 by Karl, problems 4-5 by Adrian,
         problems 6-7 by Prof. Smith
+ Last midterm is 3 weeks from today (evening of May 9th)
+ Last real lecture is 2 weeks from Wednesday

Lecture Notes (continued from previous lecture)
===============================================

+ Forms of identification (see the previous lecture notes for more information on
  identification)
  - We must protect the authorizer -- how do we do this?
    + Smith's physical analogy: the authorizer is like a guard at a gate -- he must
      not be bribed, shot, or replaced
    + One of the easiest ways to break in is to replace the password mechanism --
      therefore, the program that checks passwords for matches or encrypts passwords
      must be incorruptible (a sketch of such a check appears at the end of this
      section)
  - Another form of identification: badge/key
    + properties of a badge/key
      - does not have to be kept secret
      - should not be forgeable or easily copied
        + for real-life keys, you need a limited-distribution key blank to really
          prevent copying (by unscrupulous locksmiths)
        + key paradox: ideally, keys should be cheap to make but hard to duplicate,
          so there needs to be some trick to their production
            Q: What do you mean by trick or secret?
            A: In a computer, it should be something you can't make in your basement,
               like a chip with a bit pattern, which requires either a foundry or an
               EEPROM programmer.
      - can be stolen, but the owner should know when it is
      - needs to be carried around, which is a pain
    + examples: credit cards
      - can be duplicated, but you need blank cards and an embosser (unless you're a
        foreign intelligence service)
      - for reference, the Chinese intelligence service is called the Ministry of
        State Security (MSS), or Guojia Anquan Bu (Guoanbu) in Chinese
  - biometrics -- important in the future (see Gattaca)
    + Smith: These are very likely to become the dominant thing in the next 10 years!
    + convenient, hard to forge, and reliable
    + Q: You mentioned this last lecture, didn't you?
      A: Yeah, I have a tendency to repeat myself.
    + right now: fingerprint readers (available on IBM laptops)
      - resistant to false negatives and false positives
      - didn't work well when Microsoft tried them as a secondary thing --
        burdensome to carry around an extra item
      - Q: Are fingerprint readers practical?
        A: They can't be more than a couple hundred extra dollars...
        Q: What happens if someone burns their finger?
        A: Use several fingers to match, or have some sort of override.
        Q: If someone has an imprint of your finger, what do you do?
        A: Hopefully, the reader checks temperature.
        Q: Is temperature a factor?
        A: I would hope so; otherwise they won't be reliable.
        Q: Don't your fingerprints wind up on the reader?
        A: ...
        Q: Will they come out?
        A: Yes they will -- and they will suck at first.  Then, they will either
           1) get better and make lots of money, 2) stay the same and make lots of
           money, or 3) people will ignore them.
    + later: retinal scanners? (see Minority Report)
    + alternate possibility: a chip embedded in your skin, like domestic animals
  - We still need some sort of backup method for authorization in case biometrics
    cannot be used (cut/lost fingers, broken scanners, etc.)
  - Q: Do biometric readers offer added security, or do they offer convenience?
    A: Both!
  - Once identification is complete, the system must protect the identity, since
    other parts of the system rely on it.
    + must make sure that the process associated with a given user stays associated
      with that user (e.g. no forging mail)
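
The point above about the password checker can be made concrete with a minimal
sketch: the checker stores and compares only a one-way hash, never the cleartext
password, so reading the stored value is not enough to log in.  The hash function
below is a made-up stand-in (it is NOT cryptographically one-way); a real UNIX
system would use something like crypt(3) or a modern password hash, and all of
the names here are illustrative.

    /* Sketch: verify a typed password against a stored one-way hash.
     * one_way_hash() is a hypothetical stand-in for a real function such
     * as crypt(3); it only exists to make the example self-contained. */
    #include <stdio.h>

    static unsigned long one_way_hash(const char *s) {
        unsigned long h = 5381;              /* toy hash, NOT secure */
        while (*s)
            h = h * 33 + (unsigned char)*s++;
        return h;
    }

    /* Returns 1 if the typed password matches the stored hash, else 0.
     * Only the hash ever needs to be stored on disk. */
    static int check_password(const char *typed, unsigned long stored_hash) {
        return one_way_hash(typed) == stored_hash;
    }

    int main(void) {
        unsigned long stored = one_way_hash("open-sesame");    /* set at enrollment */
        printf("%d\n", check_password("open-sesame", stored)); /* 1: accepted */
        printf("%d\n", check_password("guess", stored));       /* 0: rejected */
        return 0;
    }

Even with hashing, this routine is exactly the kind of code that must be
incorruptible: replace it (or its stored hashes) and the whole scheme is defeated.
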
+ Authorization determination
  - indicates who is allowed to do what with what
  - in general, represented as an access matrix: 1 row per user, 1 column per file
    + each entry indicates the user's privileges on that object
    + too bulky to do this in practice
      - Smith's estimate: the CS department has ~1500 user accounts and millions
        of files
  - better ways to do this?
  - Access lists
    + for each file, indicate which users are allowed to perform which operations
      - store as a list of user/privilege pairs (this is just a vector)
      - might require a bit of overhead to walk through the access list
        + 50-100 instructions if no I/O
        + millions of instructions if I/O is required -> milliseconds
        + To combat this, check only on an open, rather than on a read or write,
          operating on the assumption that permissions don't change that often.
            Q: So once I open a file, no one can revoke my access?
            A: I can still kill your process, or the superuser can change bits in
               the open file table or clobber your pointers.
      - to save space, group users into classes
        + in UNIX: self, group, anybody else (world); read, write, execute bits
          -> 9 bits per file (see the sketch below)
        + in NTFS: can set different permissions for as many users/groups as you want
    + pretty standard across operating systems
    + easy to determine and give/revoke access
    + hard to tell what a given user can access
    + equivalent to a guard at the gate -- checks your name against a list
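
A minimal sketch of the UNIX "self/group/world" check just described: 9 mode bits
per file, consulted when the file is opened.  The structure and field names below
are illustrative, not the real kernel data structures.

    /* Sketch: decide whether a process with (uid, gid) may perform the
     * requested operations on a file, using 9 permission bits. */
    #include <stdio.h>

    #define R 4   /* read    */
    #define W 2   /* write   */
    #define X 1   /* execute */

    struct file_meta {
        int owner_uid;
        int owner_gid;
        int mode;        /* 9 bits, octal: 0640 = rw- for owner, r-- for group */
    };

    /* want is a mask of R/W/X; returns 1 if every requested bit is granted. */
    static int allowed(const struct file_meta *f, int uid, int gid, int want) {
        int bits;
        if (uid == f->owner_uid)
            bits = (f->mode >> 6) & 7;   /* self  */
        else if (gid == f->owner_gid)
            bits = (f->mode >> 3) & 7;   /* group */
        else
            bits = f->mode & 7;          /* world */
        return (bits & want) == want;
    }

    int main(void) {
        struct file_meta f = { 100, 20, 0640 };
        printf("%d\n", allowed(&f, 100, 20, R | W)); /* owner: 1 */
        printf("%d\n", allowed(&f, 101, 20, R));     /* group member: 1 */
        printf("%d\n", allowed(&f, 102, 30, R));     /* everyone else: 0 */
        return 0;
    }

Doing this check once at open() and caching the result in the open file table is
what keeps the per-read and per-write cost near zero.
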
  - Capabilities
    + hard to envision -- like a keyring
      - object-privilege pairs (capability list, aka C-list)
      - may be able to use a capability as a name for something
        + mapped by the system to an object, so you can't name things you haven't
          been given a capability for
        + page table entries = capabilities for pages
        + open file table entries = capabilities for files
    + easy to tell what you are allowed to do
    + may or may not be hard to tell what has access to something
      - capabilities can usually be passed along via some system call
    + may or may not be hard to revoke access, depending on whether or not
      capabilities are segregated into C-lists
      - how do you revoke my capabilities if they're stored on a USB key in my
        pocket? (besides with a pickpocket)
      - someone's PhD thesis: make capabilities indirect through a well-known table,
        and zap the indirect pointer to revoke them (see the sketch at the end of
        this subsection)
    + implementation issues -- how to make sure they can't be forged?
      - tagged architecture
        + Each capability has a tag, which can only be set by the system.  The user
          can manipulate capabilities, but not the tag.
      - segregated architecture
        + Capabilities are segregated and are only touched by the system.  The user
          refers to them indirectly (e.g. via a C-list).
    + real-life examples
      - Intel 432 (research project)
      - Cambridge CAP system (research project)
      - IBM System/38 -- storytime!
        + IBM wanted to make a new system called FS (for Future System)
        + it wasn't compatible with past systems
        + the project was dropped, but the people who worked on FS turned it into a
          minicomputer (the System/38)
        + it became a turnkey system for small businesses
        + sold as a small-business line; it was the most profitable part of IBM's
          product line for many years, but not anymore
    + in general -- great security, slow/incompatible architectures
      - low overhead
        + can use a capability for every access
        + not zero overhead, however -- capability-based systems tend to be slow
      - nice failure properties in a pure capability system
      - secure, but difficult to share information, since no one has access unless
        it is explicitly given
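
A minimal sketch of a segregated C-list, including the revoke-by-indirection idea
mentioned above: the user holds only small integer indices, the system-side table
maps each index to an object and a set of rights, and revoking means zapping the
table entry.  All names and structures here are illustrative, not any particular
system's.

    /* Sketch: a system-maintained capability list.  User code never touches
     * the table directly; it only passes indices to the "system" routines. */
    #include <stdio.h>

    #define RIGHT_READ  1
    #define RIGHT_WRITE 2
    #define NCAPS       16

    struct capability {
        void *object;   /* NULL = empty slot or revoked capability */
        int   rights;   /* bitmask of RIGHT_* */
    };

    static struct capability clist[NCAPS];   /* system side only */

    /* Grant: install (object, rights) and hand back an index -- the "key". */
    static int cap_grant(void *object, int rights) {
        for (int i = 0; i < NCAPS; i++) {
            if (clist[i].object == NULL) {
                clist[i].object = object;
                clist[i].rights = rights;
                return i;
            }
        }
        return -1;   /* no free slot */
    }

    /* Use: the index names the object, but only while the entry is valid
     * and carries the requested rights. */
    static void *cap_use(int idx, int want) {
        if (idx < 0 || idx >= NCAPS) return NULL;
        if (clist[idx].object == NULL) return NULL;            /* revoked */
        if ((clist[idx].rights & want) != want) return NULL;   /* not enough rights */
        return clist[idx].object;
    }

    /* Revoke: zap the indirect entry; every holder of the index loses access. */
    static void cap_revoke(int idx) {
        if (idx >= 0 && idx < NCAPS)
            clist[idx].object = NULL;
    }

    int main(void) {
        int secret = 42;
        int cap = cap_grant(&secret, RIGHT_READ);
        printf("%s\n", cap_use(cap, RIGHT_READ)  ? "read ok"  : "read denied");
        printf("%s\n", cap_use(cap, RIGHT_WRITE) ? "write ok" : "write denied");
        cap_revoke(cap);
        printf("%s\n", cap_use(cap, RIGHT_READ)  ? "read ok"  : "read denied");
        return 0;
    }

Because the rights live in the system-side table, user code cannot manufacture new
rights by fiddling with its indices, and revocation does not require chasing down
every copy of the capability.
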
+ Multics: 8 levels (rings) of protection

    [diagram: eight concentric rings, numbered 0 (innermost, most privileged)
     through 7 (outermost)]

  - more storytime!
    + the idea of all-or-nothing security had previously been criticized
    + in the 60s, MIT thought it could do better with many levels of protection
    + originally meant to be 64 levels: 0 (highest/kernel) to 63 (can't touch
      anything at all)
    + the original consortium consisted of MIT, Bell Labs, and GE
      - Bell Labs bailed out (see UNIX, a tiny piece of Multics, used for
        development rather than data processing)
      - MIT chose GE's computer over the IBM 360 Model 67
      - IBM went on to do the Time Sharing System (TSS), but before that...
        + Time Sharing Option (TSO): a kluge that was created when people wanted
          timesharing -- not very good, but it was there when people needed it, so
          it became popular
  - each file and segment has a level associated with read, write, execute, and
    call (call is special, since it can only go to certain entry points)
    + on read/write/execute/call attempts: generate the effective address of the
      target, which has a level associated with it (the highest ring that could
      have modified the address); the protection level of the effective address is
      checked against the permitted access on every reference
    + since protection levels are associated with segments, they appear in the
      segment tables and can be accessed with little or no extra overhead
    + processes can execute at a higher level of authority if they were entered at
      a permitted entry point
  - worked reasonably well
  - good for setting up protected subsystems
  - x86 has a built-in 4-level ring scheme, so the Multics protection system is
    still in there to some degree
  - Q: What stopped people from writing programs that could be called at ring 7
       but run at ring 0?
    A: You still needed permission to run something at ring 0.
+ Access enforcement
  - some part of the system must be responsible for enforcing access controls and
    protecting the authorization and identification information
  - invoked by everybody, including those planning to cause trouble
    + should be small and hard to break into/corrupt (e.g. the code that modifies
      the page table)
  - the portion of the system that enforces protection is the security kernel
    + in most cases, the whole OS runs in root mode, which means that such systems
      aren't very secure
  - paradox: the more powerful the protection mechanism, the more complex the
    security kernel must be, and hence the more likely it is to have bugs!
- protecting a computer system is extremely difficult in general -- there are no
  completely secure civilian computer systems
- common problems
  + abuse of valid privileges
    - superusers can do anything
    - privileges aren't fine-grained enough
  + impostors/Trojan horses
    - put in something that looks like something else: fake shells (which can
      capture people's passwords and save them for their owner), fake C compilers,
      programs that steal information
    - example ("Superman III Scam"/"salami attack"/Office Space): a checking
      account system that credited fractional cents to the account of its creator
    - Trojan horse: get a legitimate user to unwittingly execute/utilize code set
      up by an intruder
  + listeners
    - eavesdrop on the terminal wire or local network traffic
    - becomes a bigger problem with wireless
    - you can also get information off of people's CRTs with an antenna
  + spoilers
    - use up all resources and make the system crash
    - DoS attack (SYN flood)
      + doesn't need to crash the computer, just needs it to start dropping packets
  + trap door
    - a doctored version of a standard program which gives special privileges to a
      given person
- examples of penetration
  + get on the permission lists for /dev files
    - gives access to raw I/O devices
  + fake shell
    - captures the passwords of users who try to log on
  + dial into a still-live dial-up line
  + walk up to a terminal that is still logged on
  + find an account with a null password (can tell from /etc/passwd)
    - like WiFi networks with no password
  + fake distributions (a version of the software with doctored code)
    - mail out your patch and get people to install your patch instead of IBM's
  + fake file system; have the system mount it
    - put a program there owned by the superuser, with the setuid bit set
    - run the program, become superuser
  + page-mode buffering
    - send commands to a terminal telling it to store the following characters and
      then send them back
    - the characters that are sent back are interpreted as coming from the user at
      that terminal
    - Q: Who thought of it?
      A: I don't know, but it's a very serious problem.
  + log on through UUCP as a remote system
    - smart terminals used to buffer characters; the same store-and-send-back
      trick can be used to log on through uucp as a remote system
  + buffer overflow
    - overflow argument buffers, since they don't test for length
    - the string might overflow onto your code (see the sketch at the end of
      these notes)
  + example break-in at Stanford
    - the guest account had password "guest"
    - guest was able to write a certain scratch file on root's search path
    - root executed this file and gave the person root access
    - they used root to take on the identity of other users, and found users with
      .rhosts files to get onto other systems
    - repeat to move from system to system
    - Q: Was the process automated, or was it manual?
      A: I assume that it was automated.
- Once a system has been penetrated, it may be impossible to secure it again
  + hard to figure out what the intruder did -- they might have left hooks around
    to regain control later
- may not always be possible to tell if the system has been broken into
  + villains can clean up their traces behind themselves
  + Western Electric denied a break-in even after someone admitted to it
- can't be sure that the system is secure
  + bugs can provide loopholes in protection mechanisms
- as of the writing of Smith's lecture notes, about $450 billion was transmitted
  per day through EFT (electronic funds transfer)
  + could be as much as 10 times that by now
  + a BIG target
+ The last 15 minutes of class were spent handing back midterms.
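
Finally, a minimal sketch of the buffer-overflow problem from the penetration
examples above: an argument copied with no length check runs past the end of a
fixed-size buffer and clobbers whatever lies beyond it (on a real stack, possibly
a return address).  The code is illustrative only.

    /* Sketch: the difference between an unchecked copy and a bounded one. */
    #include <stdio.h>
    #include <string.h>

    static void vulnerable(const char *arg) {
        char buf[16];
        strcpy(buf, arg);   /* no length check: writes past buf if arg is too long */
        printf("got: %s\n", buf);
    }

    static void safer(const char *arg) {
        char buf[16];
        strncpy(buf, arg, sizeof buf - 1);   /* bound the copy ...   */
        buf[sizeof buf - 1] = '\0';          /* ... and terminate it */
        printf("got: %s\n", buf);
    }

    int main(void) {
        vulnerable("short and safe");                       /* fits in 16 bytes */
        safer("a deliberately over-long argument string");  /* safely truncated */
        /* vulnerable("a deliberately over-long argument string");
           -- would write past the buffer: undefined behavior / smashed stack */
        return 0;
    }

A carefully crafted over-long argument can overwrite the return address and make
the program jump into attacker-supplied bytes, which is the "string overflows onto
your code" attack mentioned above.
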