==============================================================================
Lecture on 2005-05-02
Farhad Massoudi  cs162-db
James McBryan    cs162-bj
==============================================================================

Announcement: The last midterm is 7pm, May 9, 10 Evans. It will probably be
about 2 hours long. No class on Monday, May 9.

Topic: Performance Evaluation

Performance evaluation covers these areas:
  + Measurement
  + Analytic Modeling
  + Simulation Modeling
  + Tuning
  + Design Improvement

****************************************
Measurement (continued from last lecture)
****************************************

Hardware Counters
  * Modern CPUs like Pentium processors have counters built into the
    hardware.
  * Hardware counters can be used to count various things such as:
    branches, misses, cycles, instructions, etc.

Multics System
  * Employed timer interrupts.
  * The timer went off at random intervals to take samples.
  * Used a hardware counter to count memory cycles.
  * Had an external I/O channel which could be externally driven to do
    useful things like read certain regions of memory.
  * Used a remote terminal emulator to reproduce workloads (instead of
    humans).
  * Multics was initially believed to be able to timeshare 200 users.
    On first boot it took 9 minutes to produce the first echo, and the
    system never ran more than 20 users.
  - Smith: If one is to build a whole new system, it is very hard to
    predict its performance.

Diamond
  * Diamond was an internal DEC measurement tool.
    (Digital Equipment Corporation (DEC) was bought by Compaq, which was
    in turn bought by HP.)
  * Diamond is a hybrid monitor: hardware probes read the PC, CPU mode,
    channel and I/O device activity, and the system task ID, while
    software gets the user ID. A minicomputer reads the data and does
    real-time and later analysis.
  * Diamond can also generate traces.
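The timer-interrupt sampling used on Multics survives today as statistical profiling. As an illustration (not from the lecture), the sketch below periodically samples which function the main thread is executing; all names and the workload are made up, and `sys._current_frames` is a CPython-specific facility:

```python
import collections, sys, threading, time

# Statistical PC-sampling in the spirit of the Multics timer-interrupt
# measurements: a background thread periodically samples which function
# the main thread is executing. Workload and names are hypothetical.
samples = collections.Counter()

def sampler(stop, main_ident, interval=0.001):
    # Sample the main thread's current stack frame every ~1 ms.
    while not stop.is_set():
        frame = sys._current_frames().get(main_ident)
        if frame is not None:
            samples[frame.f_code.co_name] += 1
        time.sleep(interval)

def busy_loop():
    # A CPU-bound workload to be profiled.
    total = 0
    for i in range(3_000_000):
        total += i * i
    return total

stop = threading.Event()
t = threading.Thread(target=sampler,
                     args=(stop, threading.main_thread().ident))
t.start()
busy_loop()
stop.set()
t.join()
print(samples.most_common(3))   # most samples should land in busy_loop
```

Like the Multics sampler, this trades exactness for low overhead: the more often a function appears in the samples, the more CPU time it is consuming.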
IBM has two products: GTF & SMF.

- IBM's GTF (Generalized Trace Facility)
  * GTF was designed for debugging.
  * Involves no hardware counters.
  * Generates trace records of any system events - e.g. I/O interrupts,
    SIOs, opens, closes, traps, SVCs, dispatches, etc.
  * GTF generates lots of useless data, and not as much useful data as
    you would like.
  * Can take up to 1/3 of CPU time, which is very bad.
  * Smith: I don't know of any I/O device available on today's market
    that can absorb this data rate.

- IBM's SMF (System Management Facility)
  * SMF was designed for accounting and management.
  * Generates records for assorted events - e.g. jobs, tasks, opens,
    closes, etc.
  * SMF also generates lots of useless data, and not as much useful
    data as you would like.

- IBM's combination of GTF and SMF was the most useful.
- Some mainframes have console monitors which generate real-time load
  information and measurements (e.g. queue lengths, channel
  utilization, etc.)

*******************
How to use tracing
*******************

Workload Characterization
  * Designers always want to know what the workload of a computer is.
  * There are 3 types of workloads for performance experiments:
    1) Live workloads: real people doing real stuff at real times.
       * Good for measurement but poor for experiments, i.e.
         uncontrolled.
       * Not reproducible.
    2) Executable workloads consisting of real samples.
       * Can be replayed day after day.
       * This is very useful.
    3) Synthetic executable workloads.
       * Parameterized versions of a workload; you make up a set of
         data. If you guessed right you get good results; otherwise the
         results are meaningless.
       * A synthetic workload may be needed as a projection of a future
         workload.

To characterize a workload:
  * Decide what to characterize. (Not easy - as you can see, many
    published papers are not interesting.)
  * Figure out how to characterize those items.
  * Figure out how to get the data (!!!)
  * Get the data.
  * Exploratory data analysis.
  * Cluster analysis: cluster data into sets and analyze them.
    - Smith: For example, there was an experiment that tried to find
      human characteristics based on zip code!!!
    - Adrian: Was that done by the Census? :D

Statistical Methods
  * Things to do with data: means, variances, distributions.
  * Techniques such as linear regression and factor analysis are quite
    useful.
  * Can do statistical analysis on data to see if various models fit.
  * Seldom used - there is little intersection between the class of
    competent experimental performance analysts and the class of
    competent statisticians.

Analytic Performance Modeling
  * Build an analytic model of the system.
  * Calculate factors of interest as functions of parameters.
  * Much research along these lines.
  * Models tend to be queueing models or stochastic process models.

Pros/Cons of Analytic Modeling
  * Good for capacity planning, I/O subsystem modeling, and preliminary
    design aid.
  * Capacity planning is a big area of application: measure and analyze
    the current system, set up a validated model of the current system
    and workload, project changes in the workload, and see what sort of
    system design will handle it.
  * Analytic models do not capture the fine level of detail needed for
    some things such as hardware design and analysis - e.g. caches, CPU
    pipelines.
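As a tiny illustration of this kind of model (not from the lecture), the standard M/M/1 single-server queue gives closed-form answers to capacity-planning questions; the request rates below are made-up numbers:

```python
# Classic M/M/1 queue: Poisson arrivals at rate lam, exponential
# service at rate mu, one FCFS server. Requires lam < mu for stability.
def mm1(lam, mu):
    rho = lam / mu          # server utilization
    L = rho / (1 - rho)     # mean number of customers in the system
    W = 1 / (mu - lam)      # mean time in system (Little's law: L = lam * W)
    return rho, L, W

# Hypothetical capacity question: a disk serves 100 req/s; if load
# grows from 80 to 95 req/s, how much does latency grow?
rho, L, W = mm1(80.0, 100.0)
print(f"load 80/s:  util={rho:.2f}  in-system={L:.1f}  time={W*1000:.0f} ms")
rho, L, W = mm1(95.0, 100.0)
print(f"load 95/s:  util={rho:.2f}  in-system={L:.1f}  time={W*1000:.0f} ms")
```

The nonlinearity is the point: going from 80% to 95% utilization quadruples the mean response time (50 ms to 200 ms), which is exactly the kind of projection capacity planners need.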
Queueing network models
  * Queueing networks are a powerful technique in analytic modeling,
    used for:
    - Capacity planning
    - Modeling of I/O systems
    - Back-of-the-envelope designs
  * Major components of a queueing network are:
    - Servers
    - Customers
    - Routing

Diagram of a queueing network (customers enter server 1; from there
they are routed either to server 2, which leads out of the network, or
to server 3, which feeds back to server 1):

                  Q-1
                -----+
    ----------->|||||( )---+
           ^    -----+     |       Q-2
           |   server-1    |      -----+
           |               +----->|||||( )---------->
           |               |      -----+
           |               |     server-2
           |               |       Q-3
           |               |      -----+
           |               +----->|||||( )---+
           |                      -----+     |
           |                     server-3    |
           +---------------------------------+

  * Customers have a type t.
  * Routing is of the form p(i,t1,j,t2), where i and j are servers and
    t1 and t2 are types.
  * Servers are of type:
    - FCFS exponential, where the service rate is a function of queue
      length
    - Processor sharing (e.g. a time-shared CPU)
    - Infinite server - all customers served at the same rate
    - LCFSPR - last come, first served, preemptive resume
  * If the network is not of type BCMP, then other equations are
    needed. (Professor said this is likely not to be on the midterm.)

*************
SIMULATIONS
*************

Simulation types:
  * Discrete event simulation
    - Trace driven
    - Random number driven (stochastic): used for things like analyzing
      wind at Boeing
  * Continuous simulation (e.g. a differential equation)
    (not used for computer system modeling)
  * Monte Carlo methods (e.g. sampling) - throwing darts and using the
    fraction that land inside a region to determine its area

A simulation model has these components:
  * A model of the system, which has a state.
  * A set of events which cause changes in the state.
  * A method for generating such a sequence of events.
  * A measurement component, which records the statistics of interest.

Discrete event general simulation model:
  * Events come from the event list. (Events can come from a trace.)
  * The next event is taken off the list.
  * The system state is updated.
  * The event list is updated. (Events are added, deleted, or their
    times changed.)
  * Statistics are accumulated.
  * The next event is obtained from the event list.

Example: simple discrete event simulation of an M/G/1 queue.
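A minimal sketch of such a simulation is below (not from the lecture; the arrival rate, service distribution, and run length are made up). It exploits the single-queue structure rather than maintaining a general event list, processing arrival events in time order:

```python
import random

# Minimal discrete event simulation of an M/G/1 queue:
# Markovian (exponential) interarrival times, a General service time
# distribution (uniform here), and 1 FCFS server.
def simulate_mg1(lam, n_customers, seed=1):
    rng = random.Random(seed)
    t_arrive = 0.0          # arrival time of the current customer
    server_free_at = 0.0    # time the server finishes its current job
    total_wait = 0.0
    for _ in range(n_customers):
        t_arrive += rng.expovariate(lam)       # "M": Poisson arrivals
        service = rng.uniform(0.01, 0.09)      # "G": any distribution
        start = max(t_arrive, server_free_at)  # queue if server is busy
        total_wait += start - t_arrive         # statistic of interest
        server_free_at = start + service       # "1": a single server
    return total_wait / n_customers

print("mean wait:", simulate_mg1(lam=10.0, n_customers=10000))
```

With mean service time 0.05 here, utilization is about 0.5, and the simulated mean queueing delay can be checked against the analytic (Pollaczek-Khinchine) prediction of roughly 0.03.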
  * Events come from a trace and/or a random number generator.

M/G/1 queue:
  M - Markovian (exponential) arrival times
  G - General (arbitrary) service time distribution
  1 - one server

Special languages for simulation: GPSS, SIMSCRIPT, GASP, SLAM, SIMULA.
  * SIMULA has automatic queueing with states, which lets you write
    simulations efficiently, but it runs slowly.
There are also simulation modeling packages: RESQ (IBM), PAWS (UT
Austin), QNAP (INRIA - Potier).

Analysis of simulation output:
  * Regenerative simulation: find the regeneration points where the
    system returns to a common state.
  * Time series analysis, using statistical frequency-domain methods.
  * Repeat the entire process.
  * Run forever and look for convergence.
  * Batch means: stop periodically, take averages, and compute averages
    of the averages.

Easy to fool yourself:
  "Drinking wine is good for you" => "Drinking beer is good for you"...
  not! The studies did not include some demographics.

Back-of-the-envelope calculations (do the calculation on the back of an
envelope), e.g. how much water flows down the Mississippi? You must
redo the calculation in different ways with different parameters.

E.g. how much damage is there in a fire?
  Area of damage: 3 square miles
  Average replacement cost per house: $200k
  3 houses per acre
  640 acres per square mile
  Total = 3 * 640 * 3 * $200k ~= $1.2 billion
  The actual total was $1.5-2 billion.
  (Most likely to have an exam question on this; look at the reader.)

************************************************
Topic: Current Research in Operating Systems
************************************************

Most of what we have talked about in the area of operating systems is
not new, but goes back 20-30 years. What are people doing currently?

In a recent Operating Systems Conference Proceedings (Proc. 17th ACM
Symposium on Operating Systems Principles, December 1999), the
principal topics include:
  Manageability, availability, and performance in a mail service
  Performance of web proxy caching
  Performance of a stateless, thin-client architecture
  Energy-aware adaptation for mobile environments
  Active networks (customized programs are executed within the network)
  Building reliable, high-performance communication systems
  File system usage in Windows NT
  The Elephant file system
  File system security
  Integrating segmentation and paging protection
  Resource management on shared-memory multiprocessors
  A fast capability-based operating system
  A naming system for dynamic networks and mobile units
  A distributed virtual machine for networked computers
  A modular router
  Timer support for network processing
  CPU priority scheduling
  Scheduling for latency-sensitive threads
  A small, real-time microkernel

In a recent Operating Systems Conference Proceedings (Proc. 16th ACM
Symposium on Operating Systems Principles, October 1997), the principal
topics include:
  Performance analysis - profiling and distributed/parallel programs
  OS kernels
  Caching in computer networks
  Transactions on networks
  Security for Java
  Formal analysis of security
  Running commodity OSes on scalable multiprocessors
  Transparent distributed shared memory
  Scheduler for multimedia applications
  CPU scheduling
  Scalable distributed file system
  Log-structured file system
  Reducing I/O latency
  File caching and hoarding in mobile systems
  Update policies for mobile operation

Some titles from the 1993 SOSP:
  Distributed file systems
  RAID-type file systems
  Synchronization and its limitations in distributed systems
  Distributed system design
  Distributed programming using threads
  Memory management of an object-oriented language
  Relation between operating system structure and memory system
    performance (effects on the cache of OS code)
  Concurrent compacting garbage collection
  Improved IPC (interprocess communication)
  Improved fault isolation
  Audio and video in a distributed system
  Authentication
  Location info for distributed systems
From ASPLOS (Architectural Support for Programming Languages and
Operating Systems), 10/94, related to OS:
  Data and control transfer in distributed systems
  Scheduling and page migration for multiprocessor compute servers
  Synchronization algorithms for multiprocessors
  Software overhead in message passing
  Software support for exception handling
  Performance monitoring

In summary:
  Networks and the web
  Performance issues - memory, scheduling, networks
  Mobility
  Energy management
  File systems
  Protection and security
  Misc: virtual machines, kernels

Personal view of important issues:
  The world is becoming one large distributed computer system, with
  file migration, process migration, load balancing, distributed
  transparent file systems, etc. This suggests that the important
  issues are:
    Efficient ways to write reliable, high-performance OSes, with
      file migration algorithms
      load balancing
      distributed transparent file system implementation
    Wireless and mobile systems
      Supporting mobility
      Location and naming issues
      Energy management

Real-time systems are systems in which there is a real-time DEADLINE.
Typically a mechanical system is being controlled - e.g. an assembly
line, or anti-ballistic missile defense.

A real-time system must be able to:
  * Meet all deadlines (with 100% or 99+% probability).
  * Handle the aggregate load. If there are N events per second, it
    must be able to handle them, whether or not each has a deadline.

This implies:
  * A deadline scheduler.
  * Avoidance of page faults - generally, deadline-oriented code must
    be locked into memory.
  * Avoidance of I/O operations when a near deadline is pending.
    Usually the necessary info must be kept in electronic memory,
    and/or fetched in advance.
  * This does not imply no cache memory, no matter what people say....
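As an illustration of what a deadline scheduler does (not from the lecture), the classic earliest-deadline-first (EDF) policy always runs the ready task whose deadline is soonest; the task set below is made up, and time is modeled in unit-length ticks:

```python
import heapq

# Earliest-Deadline-First (EDF), a classic deadline-scheduling policy.
# Each task is (release_time, deadline, work_units); one work unit
# takes one time tick. The task set is hypothetical.
def edf_schedule(tasks):
    tasks = sorted(tasks)            # process releases in time order
    ready = []                       # min-heap keyed on deadline
    t, i, order, missed = 0, 0, [], []
    while i < len(tasks) or ready:
        # Move every task released by time t onto the ready heap.
        while i < len(tasks) and tasks[i][0] <= t:
            rel, dl, work = tasks[i]
            heapq.heappush(ready, (dl, rel, work))
            i += 1
        if not ready:                # CPU idle until the next release
            t = tasks[i][0]
            continue
        dl, rel, work = heapq.heappop(ready)
        t += 1                       # run earliest-deadline task 1 tick
        if work > 1:                 # preemptible: requeue leftover work
            heapq.heappush(ready, (dl, rel, work - 1))
        else:
            order.append(rel)        # task (named by release time) done
            if t > dl:
                missed.append(rel)
    return order, missed

tasks = [(0, 10, 4), (1, 4, 2), (2, 7, 2)]   # (release, deadline, work)
order, missed = edf_schedule(tasks)
print("completion order:", order, "missed:", missed)
```

Note how the long task released at time 0 is preempted by the later arrivals with tighter deadlines, yet all three tasks still finish on time; on a single CPU, EDF meets every deadline whenever any policy can.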
Better to have a system that somet...

=================
Summary of above:
=================

Current research in OSes:
  Manageability
  Performance of web proxy caching
  Thin clients
  Energy-aware adaptation for mobile environments
  Active networks
  Building reliable, high-performance communication systems
  File system usage in Windows NT
  The Elephant file system
  File system security
  Integrating segmentation and paging protection
  Resource management on shared-memory multiprocessors
  A fast capability-based operating system
  A naming system for dynamic networks and mobile units
  A distributed virtual machine
  ... distributed systems

OS research in summary:
  Networks and the web
  Performance issues - memory, scheduling, networks
    (killed and done with)
  Mobility
  Energy management
    (a battery has energy equivalent to nitroglycerin)
  File systems
    (need to put your data somewhere)
  Protection and security
    (no more stealing)
  Virtual machines and kernels
    (coming back!)

Professor Smith's personal view of important issues:
  The world is becoming one large distributed computer system, with
  file migration and process migration. This needs:
    High-performance file migration algorithms
    Load balancing
    Distributed systems
    Wireless and mobile systems
  In order to support mobility:
    Location and naming issues
    Energy management
    Security