--------------------------------------------------------------------------------
 - LECTURE NOTES - 4/30 - CS162 - 
--------------------------------------------------------------------------------

VIRTUAL MACHINES:
In order to make a virtual machine work, the underlying architecture must cause
a trap whenever a guest instruction would affect the state of the underlying
machine, so that the VM monitor can step in.

This is hard: Does not work for all processors.

X86:
	-A process on the x86 architecture can determine that it is not running
	in privileged state.
		-The CPL (current privilege level) is stored in the low two bits
		of the code segment selector (%cs).
	-Some supervisor-state instructions can also be executed in user state,
	where they do not trap; they silently behave differently instead.
		-EX: popf may change both ALU flags and system flags when run in
		privileged state.  In user state it changes only the ALU flags and
		does not trap.
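
As a small illustration of the CPL point above, the privilege level can be read
off a code segment selector by masking its low two bits.  This is only a sketch:
the selector values and function names are assumptions made for the example, not
taken from the lecture.

```python
def current_privilege_level(cs_selector: int) -> int:
    """Return the CPL encoded in the low two bits of a %cs selector value."""
    return cs_selector & 0b11


def looks_privileged(cs_selector: int) -> bool:
    """Ring 0 is the most privileged x86 level; rings 1-3 are less privileged."""
    return current_privilege_level(cs_selector) == 0


# Illustrative selector values (assumed for this sketch).
kernel_cs = 0x10   # low two bits are 00 -> ring 0
user_cs = 0x33     # low two bits are 11 -> ring 3

print(current_privilege_level(kernel_cs), looks_privileged(kernel_cs))
print(current_privilege_level(user_cs), looks_privileged(user_cs))
```

Because a guest can do this check without trapping, it can notice that it has
been moved out of the privileged state.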
	
Solution:
	-Translate the code at runtime.  If an instruction needs special handling
	to make it work in the VM, insert code to make the instruction work
	correctly.
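
The translation idea above can be sketched with a toy instruction stream.  This
is a simplified illustration, not how a real translator is built: instructions
are plain strings, the SENSITIVE set contains only the popf example from the
notes, and the vmm_emulate_* routine names are hypothetical.

```python
# Instructions that behave differently in user state without trapping.
# Only popf comes from the notes; a real list would be longer.
SENSITIVE = {"popf"}


def translate_block(instructions):
    """Rewrite a block of guest instructions before it runs.

    Safe instructions are copied unchanged; sensitive ones are replaced by a
    call into a (hypothetical) VMM routine that would emulate the privileged
    behavior against the virtual machine's state.
    """
    translated = []
    for instr in instructions:
        opcode = instr.split()[0]
        if opcode in SENSITIVE:
            translated.append(f"call vmm_emulate_{opcode}")
        else:
            translated.append(instr)
    return translated


guest_code = ["push %eax", "popf", "add %ebx, %eax"]
print(translate_block(guest_code))
```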


--------------------------------------------------------------------------------
 - TOPIC - PERFORMANCE EVALUATION -
--------------------------------------------------------------------------------

Suggested Reading: "Computer Performance Evaluation Methodology" by Heidelberger
and Lavenberg

Four Areas of Performance Evaluation:
-Measurement
-Analytical Modeling
-Simulation Modeling
-Tuning

In order to provide useful and meaningful evaluation:
-Must do performance modeling and measurement throughout the entire life cycle of
a system.  This includes design, debugging, installation, tuning, and data collection
for the next system.
-Must have a good substantive understanding of the system under study.  Can't just
apply outside theory.

Hardware Monitoring:
-Use a hardware monitor to collect / reduce / condense data.
	-Data is pulled from the system using logic probes.
	-Cannot pull data from where it is not available.  Probe points must be designed
	in; cannot just read data from the middle of a chip.
	-Signals can be used to count events and generate traces.
-Most hardware monitors combine hardware (e.g. counters/gates) and software.
-Hardware monitors are expensive and hard to use.  Only really used by vendors
and large computer centers.

Software Monitoring:
-Can build monitoring software into the system.
	-Implement counters/signallers in the OS.  These can be used to sample at timer
	interrupts.
		-Can't sample code that isn't interruptible.
-Some compilers provide built-in profiling code that can be compiled into the desired
program.
-Software monitoring can generate massive overhead: 20% or more, sometimes upwards of 50%.
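
Sampling at timer interrupts can be sketched without a real timer by replaying a
known execution timeline and checking which routine is running at each tick.
The routine names, durations, and the sample_timeline function are assumptions
made for this example.

```python
from collections import Counter


def sample_timeline(timeline, interval):
    """Estimate where time is spent by sampling at fixed 'timer' ticks.

    timeline: list of (routine_name, duration) pairs executed back to back.
    interval: time between simulated timer interrupts.
    Returns a Counter mapping routine name -> number of samples that hit it.
    """
    samples = Counter()
    tick = interval            # first interrupt fires after one interval
    start = 0
    for name, duration in timeline:
        end = start + duration
        # Every tick falling inside [start, end) is charged to this routine.
        while tick < end:
            samples[name] += 1
            tick += interval
        start = end
    return samples


timeline = [("parse", 30), ("compute", 60), ("io_wait", 10)]
print(sample_timeline(timeline, 10))
```

A routine shorter than the sampling interval may receive no samples at all,
which is one reason sampled profiles are only estimates.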

Hardware Counters:
-Modern CPUs have counters built into hardware.  These counters can be used to
monitor things like branches, misses, cycles, instructions, etc.
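
Raw counter values are usually turned into rates before they are interpreted.
The sketch below assumes two snapshots of four counters; the counter names (and
the reading of "misses" as cache misses) are assumptions for illustration, not
the interface of any particular CPU or tool.

```python
def derived_metrics(before, after):
    """Turn two snapshots of hardware counters into rates.

    before/after: dicts with the hypothetical counter names 'cycles',
    'instructions', 'branches', and 'cache_misses'.
    """
    delta = {name: after[name] - before[name] for name in before}
    instructions = delta["instructions"]
    return {
        "cpi": delta["cycles"] / instructions,
        "misses_per_1000_instr": 1000 * delta["cache_misses"] / instructions,
        "branch_fraction": delta["branches"] / instructions,
    }


before = {"cycles": 1000, "instructions": 500, "branches": 100, "cache_misses": 10}
after = {"cycles": 5000, "instructions": 2500, "branches": 500, "cache_misses": 30}
print(derived_metrics(before, after))
```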

EXAMPLES:
Multics Measurements:
-Measurements taken using timer interrupts.
-Built in hardware counter kept track of memory cycles.
-Used an external I/O channel to probe/read memory.
-Remote terminal emulator to drive system.  

Diamond:
-Internal DEC measurement tool
-Hybrid Monitor.
	-Hardware probe used to read the PC, CPU mode, channel / I/O device activity,
	and system task ID.
	-Software monitor reads user ID
-This data from the hybrid monitor is then funneled into a minicomputer which reads
the data and does realtime analysis.  Can be used to generate traces.

IBM GTF(General Trace Facility):
-This tracks and records system events:
	-I/O interrupts, SIOs, opens, traps, SVCs, dispatches, closes, etc.
-Useful for debugging, but because it tracks so much it also returns a lot of useless data.

IBM SMF(System Management Facility):
-This tracks system events.
	-Jobs, tasks, opens, closes, etc.
-Useful for accounting / management.  It also tracks many different events, so it too
returns a lot of useless/extra data.

-Other mainframes have console monitors.
	-Generate realtime load information/measurements
	-Queue lengths, channel utilization, etc.

Workload Characterization:
-If you want to study the workload you must know what the workload is.
-Three Types:
	-Live Workloads: Uncontrolled measurement.  Useful for measuring system, bad for
	experiments.
	-Executable Workloads: Real samples.  Measure the system using a predetermined
	executable workload.
	-Synthetic Executable Workloads: Parameterized version of real workloads.  This may
	be used as a prediction of a future workload and can be used to see how the system
	would react.
-Characterizing a workload:
	1. Decide what to look at. (must choose wisely, can't focus on unimportant aspects)
	2. Determine how to characterize the chosen workload.
	3. Determine how to get the data you need.
	4. Use this method to get your data.
	5. Analyze your data.
-Statistical Methods:
	-Mean, variances, distributions, linear regression, factor analysis
	-Use statistical analysis to look for patterns that relate to previously known 
	models.
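
A minimal sketch of the statistical step, assuming per-job measurements have
already been collected.  The data values and variable names are made up for
illustration; the regression is ordinary least squares written out by hand.

```python
from statistics import mean, pvariance


def least_squares(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical measurements: I/O requests per job and CPU seconds per job.
io_requests = [10, 20, 30, 40, 50]
cpu_seconds = [1.2, 2.1, 2.9, 4.2, 5.1]

print("mean CPU s:", mean(cpu_seconds))
print("variance  :", pvariance(cpu_seconds))
print("fit       :", least_squares(io_requests, cpu_seconds))
```

A fitted slope like this could serve as one parameter of a synthetic workload,
for example CPU time per I/O request.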

Analytic Performance Modeling:
1. Construct an analytical model of the system of interest.
2. Calculate factors of interest as f(parameters).
-Lots of research is done this way; it is not always useful.
-The models are usually queueing models or stochastic process models.
	-Progress in queueing theory over the last 30-40 years has been driven largely by
	computer system modeling.
	-Many queueing network models can be solved by efficient computational
	algorithms.  Networks that cannot be solved easily have good known approximation
	methods.
-Pros:
	-Good for capacity planning, I/O subsystem modeling, and as a preliminary design aid.
		-Widely used in capacity planning.  Measure/analyze the current system, then
		use this analysis and projections of future workload to plan the next
		generation of the system.
-Cons:
	-Analytic models do not capture fine level of detail needed for hardware design and
	analysis.
-Queueing Networks: A powerful analytic modeling technique.
	-Components of a Queueing Network: Servers, Customers, Routing.
	-Many types of queueing networks can be easily solved.
	-EXAMPLE: BCMP - Baskett, Chandy, Muntz, and Palacios
		-Customers have a type T.
		-System routing uses the form P(i,t1,j,t2): the probability that a customer of
		type t1 leaving server i goes next to server j as type t2.
			-i,j are servers
			-t1,t2 are types
		-Server Types:
			-FCFS exponential.  Service rate = f(queue length).
			-Processor Sharing
			-Infinite server
			-LCFSPR (last come, first served, preemptive resume)
		-Solution = product of terms for each service station.
	-If the network is not of BCMP type, it probably cannot be solved exactly.  Use an
	approximation.
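
To show what "product of terms for each service station" looks like, here is a
sketch for the simplest case only: an open network with a single customer type
and fixed-rate FCFS exponential servers.  This is a deliberately reduced special
case, not the general BCMP solution; the rates, routing matrix, and function
names are assumptions for the example.

```python
def solve_traffic(external, routing, iterations=200):
    """Solve lambda = external + lambda * routing by fixed-point iteration.

    external[i]   : outside arrival rate into server i
    routing[i][j] : probability a customer leaving i goes next to j
    """
    n = len(external)
    rates = list(external)
    for _ in range(iterations):
        rates = [
            external[j] + sum(rates[i] * routing[i][j] for i in range(n))
            for j in range(n)
        ]
    return rates


def product_form_probability(state, arrival_rates, service_rates):
    """P(n_1, ..., n_k) as a product of one M/M/1 term per server."""
    prob = 1.0
    for n_i, lam, mu in zip(state, arrival_rates, service_rates):
        rho = lam / mu                      # utilization; must be < 1
        prob *= (1 - rho) * rho ** n_i
    return prob


external = [1.0, 0.0]                # customers enter only at server 0
routing = [[0.0, 1.0], [0.5, 0.0]]   # 0 -> 1 always; 1 -> 0 half the time
service = [4.0, 5.0]

rates = solve_traffic(external, routing)
print("throughputs:", rates)
print("P(empty)   :", product_form_probability((0, 0), rates, service))
```

The fixed-point iteration is used here only to keep the sketch short; solving
the linear traffic equations directly would work equally well.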

SIMULATION:
Types: discrete event simulation, continuous simulation, Monte Carlo methods (sampling)
Components: 
	-Model of the system which contains the state of the system.
	-Set of events which lead to changes in the state of the system.
	-Method to generate event sequences.
	-Measurement component: records interesting statistics during the simulation.
Discrete Event General Simulation Model:
The simulator keeps an event list.  An event is pulled off the list, which changes the state of
the system.  In the new state the event list is updated and statistics are taken.  The next event
is then pulled from the list, and so on.
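
The loop described above can be sketched for a single first-come, first-served
server driven by a trace.  This is a minimal illustration under assumed inputs
(a list of arrival times and a list of service times); the function name and
event tuple layout are choices made for the example.

```python
import heapq


def simulate_fcfs(arrivals, service_times):
    """Trace-driven discrete event simulation of one FCFS server.

    arrivals[i] and service_times[i] describe customer i.
    Returns the mean response time (departure minus arrival).
    """
    # Event list ordered by time; ties are broken by the sequence number.
    events = [(t, i, "arrival", i) for i, t in enumerate(arrivals)]
    heapq.heapify(events)
    seq = len(events)

    waiting = []          # system state: queue of customer ids
    busy = False          # system state: is the server in use?
    total_response = 0.0  # statistic gathered during the run

    while events:
        now, _, kind, cust = heapq.heappop(events)   # pull the next event
        if kind == "arrival":
            waiting.append(cust)
        else:  # departure
            busy = False
            total_response += now - arrivals[cust]
        # The new state may allow a service to start: schedule its departure.
        if not busy and waiting:
            nxt = waiting.pop(0)
            busy = True
            departure = now + service_times[nxt]
            heapq.heappush(events, (departure, seq, "departure", nxt))
            seq += 1
    return total_response / len(arrivals)


print(simulate_fcfs([0, 1, 2], [2, 2, 2]))
```

Replacing the two input lists with values drawn from a random number generator
turns the same loop into a stochastic simulation.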

-Events can come from a trace, a random number generator (RNG), or both.
-Simulations use special languages: GPSS, SIMSCRIPT, GASP, SLAM, SIMULA
-Simulation Modeling Packages: RESQ (IBM), PAWS (UT Austin), QNAP (INRIA - Potier)

Analysis of simulation output:
-Regenerative Simulation - Find regeneration points.  The cycles between them are independent
and identically distributed (IID), so standard IID statistics can be computed from them.
-Time Series Analysis - AKA spectral method
-Repeat simulation - Run independent replications and compute statistics across the runs.
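
The "repeat simulation" approach can be sketched as follows, under several
assumptions: the model is a single-server queue with exponential interarrival
and service times, waits are computed with the Lindley recurrence rather than an
explicit event list, and the interval uses a normal approximation (1.96).  All
parameter values and function names are illustrative.

```python
import random
from statistics import mean, stdev


def one_replication(seed, n_customers=2000, arrival_rate=0.5, service_rate=1.0):
    """One independent run: mean wait in queue for a single-server queue."""
    rng = random.Random(seed)
    wait, total = 0.0, 0.0
    for _ in range(n_customers):
        total += wait
        service = rng.expovariate(service_rate)
        gap = rng.expovariate(arrival_rate)
        # Lindley recurrence: next wait = max(0, wait + service - gap).
        wait = max(0.0, wait + service - gap)
    return total / n_customers


def replicate(seeds):
    """Run once per seed; return (mean, approximate 95% half-width).

    With only a few replications a Student t quantile would be more
    appropriate than the normal value 1.96 used here.
    """
    results = [one_replication(seed) for seed in seeds]
    half_width = 1.96 * stdev(results) / len(results) ** 0.5
    return mean(results), half_width


estimate, half_width = replicate(range(10))
print(f"mean wait = {estimate:.3f} +/- {half_width:.3f}")
```

Each replication uses its own seed, so the per-run averages can be treated as
independent observations even though waits within one run are correlated.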

Back of the Envelope Methods:
Example: estimate the amount of water flowing down the Mississippi River.
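
One way to set up such an estimate is cross-sectional area times current speed.
Every number below is a rough guess chosen for illustration, not a measurement,
and the variable names are assumptions.

```python
# All inputs are rough guesses chosen for illustration, not measurements.
width_m = 1_000        # river roughly a kilometer wide
depth_m = 10           # average depth of about ten meters
speed_m_per_s = 1      # current of about walking pace


def flow_rate(width, depth, speed):
    """Volume per second through a cross-section: area times speed."""
    return width * depth * speed


flow = flow_rate(width_m, depth_m, speed_m_per_s)
print(f"about {flow:,} cubic meters per second")
```

The point of the method is the order of magnitude; changing any guess by a
factor of two changes the answer by the same factor.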