Virtual Machines and Managed Runtimes

Mario Wolczko (Oracle)
Patrick Li (EECS)
Professor Jonathan Bachrach (EECS)
EECS, UC Berkeley

CS294-113
Fall 2015
Fri 2-5p
Soda 320

4 Units: 3 hours lecture and 3+ hours lab time per week

Prerequisites

Students are required to have a strong background in systems programming in C and machine-level operation (assembly and machine code), and a working knowledge of Java (CS61A, CS61B, CS61C). Basic knowledge of compiler internals (CS164) is recommended but not required.

Description

The widespread adoption of FORTRAN in the 1950s and 1960s resulted in a plethora of high-level programming languages directly compiled to machine code, some of which still thrive (e.g. FORTRAN itself, as well as C and C++). However, in the 1970s and 1980s a different approach to execution gained popularity, in which a layer of software continually intermediates between the high-level program and the machine. This layer, most often called a Virtual Machine, first became popular with the Pascal P-machines and Smalltalk-80's bytecode machine, and received a huge boost with the emergence of Java and the JVM in the mid-1990s. Though Virtual Machines now dominate high-level language implementation, they have a reputation for being many orders of magnitude slower than direct compilation to machine code.

However, when coupled with dynamic compilation techniques, Virtual Machines can provide performance comparable to direct compilation while offering machine-independent binary distribution, advanced memory management, better security, interactive program development, and many other advantages. The objective of this seminar is to explore the design and construction of virtual machines by studying the history of the field, analyzing landmark systems, and building and modifying virtual machines hands-on.

The presentation will take a mostly chronological approach, starting with early techniques and progressing through to the state of the art. Each week we will learn about the preeminent problems of a given era and how those problems were overcome. In the labs we will reprise some of these accomplishments through a graded series of exercises in which we build components of a virtual machine for an invented language. The initial exercises will implement basic techniques in C and C++; later we will switch to a virtual machine framework known as Truffle/Graal which will provide sophisticated components that we can assemble and customize into a larger system.

Lectures

Topics covered will include:

Readings

Weekly readings will be assigned. Some papers will be key to following the lectures; for these there will be class discussions, and written or oral summaries may be required.

Labs

The labs will consist of a series of construction and modification exercises in which components of a VM are implemented or enhanced, reinforcing the corresponding lecture material. Many exercises will be in C, some in a constrained subset of C++, and some in Java. Labs will be done either solo or in pairs.