[Fa19] CS294-73

Software Engineering for Scientific Computing
UC Berkeley

Instructor

Prof. Phillip Colella
colella@eecs.berkeley.edu

Office Hour:
5-6 pm, Tuesday & Thursday, 347 Soda Hall

Teaching Assistant (GSI)

Colin Wahl
colin_wahl@berkeley.edu

Office Hour:
5-6 pm, Tuesday & Thursday, 347 Soda Hall

Course Description

This course will focus on software development skills and tools that cross most disciplines, and how they are applied towards larger software development goals. The course will be taught in the context of several dominant algorithmic patterns found in scientific computing: dense and sparse linear algebra, structured and unstructured grid methods, particle methods, and fast Fourier transforms (FFTs). The course will be taught using C++, although no prior experience with the language will be assumed. Specific topics will include: working in a typical Unix environment, Makefiles, compilers, revision control systems, debuggers, profilers. Fundamentals of C++: Functions, pointers, references; scoping; classes, encapsulation, inheritance; templates; memory management. Fundamentals of data structures and algorithms as they arise in scientific computing. Asymptotic analysis of algorithmic complexity. Correctness debugging and performance debugging. Engineering larger systems: factoring, reuse, dependencies, composition, use of external libraries.

Reading List and Resources

Code Complete: A Practical Handbook of Software Construction
http://www.amazon.com/Code-Complete-Practical-Handbook-Construction/dp/0735619670
O’Reilly GNU Make
Stroustrup C++ Language Reference
http://docstore.mik.ua/orelly/unix/lrnunix/
http://www.ee.surrey.ac.uk/Teaching/Unix/

Grading

There will be 5-6 homework (programming) assignments, adding up to 60% of the final grade. The final project is worth 40% of the grade. There will be no final exam.

Prerequisites

The students will have had to have taken an advanced undergraduate course in an area in which simulation is commonly used and at least one of the algorithmic patterns listed above arises; or the consent of the instructor. This course is not open to graduate students in computer science.

Course requirements

Each student to submit all homework assignments, discussion postings for grading, and complete a final project. A laptop, workstation, or access to a UNIX style system is required, as is installation of an Emacs editor.

Learning objectives for this course

A. Develop skills and tools that cross most disciplines, and how they are applied towards larger software development goals.
B. Understanding of dominant algorithmic patterns found in scientific computing: dense and sparse linear algebra, structured and unstructured grid methods, particle methods, fast Fourier transforms (FFT).
C. Students will learn fundamentals of data structures as they arise in scientific computing, asymptotic analysis of algorithmic complexity.
D. Students will be taught C++, a widely-used programming language in scientific computing; how to develop code in a typical Unix environment; procedures for correctness debugging and performance debugging, and the engineering larger systems.

Class Schedule

Week Content Lecture Slides
1 Course introduction. Algorithmic patterns for scientific computing.
Compiled vs interpreted programming environments.
C/C++, compilers, objects, libraries, executables.
Getting your tools installed: g++ / clang, gmake, git, gdb/lldb.
Lecture1
2 C++ key language features: functions, pointers, pass-by-reference, pass-by-value, return value.
Command-line options, argv, argc. Object-oriented programming;
classes, encapsulation, inheritance. Templates. Memory management.
The heap and the stack. Data basics; managing scope; globals;
class basics; encapsulation; rational default behavior
.
Lecture2
Lecture3
3 The structured grid pattern. data+methods.
Stencil operations. Build management using GMake. Revision control using git.
Debugging. Compiler errors, warnings;
The various uses of comments. Intro to Doxygen.
Lecture4
Lecture5
4 Coding standards (including the coding standard for this course).
Poisson's Equation and Point Jacobi
More on STL containers: vector. Intro to shared pointers.
Unstructured grid pattern: finite element class library.
More on GMake. STL vector.
C++ standard template library (STL): STL pointers, arrays.
Lecture6
Lecture7
5 More FEM and introduction to sparse matrices.
Dense linear algebra. Simple models of memory and introduction to tuning.
Lecture8
Lecture9
6 FFT and inheritance.
Introduction to recursion.
Using third-party packages (FFTW, LAPACK, BLAS).
More on STL containers: map, list, etc.
Lecture10
Lecture11
7 Particle pattern. N-body (global and local), particle-in-cell, particle-particle particle-mesh algorithms.
Fast discrete convolution using FFTs and Hockney’s method.
Lecture12
8 Introduction to profiling and simple optimization:
instrumenting profilers, sampling profilers. Call tree. hot-spots and bottlenecks.
The harder optimizations: vector instructions, data locality, auto-tuning, and why you don’t want to do these yourself if you can help it (“buy or build”).
RAII and the Rule of 0.
Lecture13
Lecture14
9 Managing project complexity and team development.
Revision control systems revisited. Headers, prototypes.
More on Doxygen. Unit testing, test targets. Making libraries.
triangular dependencies, insulation. Project scope and suggestions.
Final project process: submission to class git repo;
Lecture15
Lecture16
10 Structured-grid methods revisited. Multigrid for structured-grid discretizations of Poisson’s equation.
Stencil operations and SIMD
Lecture17
11 Multigrid performance measurements.
DFTs and optimization.
Lecture18
Lecture19
12 More on Hockney's method and discrete convolutions. A mathematicians view of PIC.
Revised final projects schedule
13 Embedded Boundary Methods
14 And all the world is parallel: introduction to programming models, memory models, execution models, and how they interact with the algorithmic patterns.
Where to go for more (CS267). Case study: HPGMG - a high-performance multigrid benchmark.

Homeworks

Assignment Due Date
Programming assignment 1(see Lecture 6):
Forward-Euler diffusion problem on a structured grid.
I/O, visualization for structured grids.
Sep 25, 2019
23:59
Programming assignment 2:
Solving Poisson’s equation on an unstructured grid using point Jacobi.
Oct 10, 2019
23:59
Programming assignment 3:
Comparing naive implementations of FFT, DGEMM to those written by the pros.
Oct 29, 2019
23:59
Programming assignment 4:
PIC and PPPM in two dimensions.
Nov 7, 2019
23:59
Programming assignment 5:
Improving the performance of Multigrid.
Nov 21, 2019
23:59

Final Project

Milestone Due Date
Milestone 1:
Identification of the teams, and of the topic for the project. A mathematical description of the problem being solved. This can be in the form of a research article, combined with a concise summary.
Nov 12, 2019
23:59
Milestone 2:
A complete mathematical specification of the discretization methods and / or numerical algorithms to be used to solve the problem. A specification of what will constitute a computational solution to this problem.
Nov 21, 2019
23:59
Milestone 3:
A top-level design of the software used to solve this problem, in the form of header files for the major classes, the mapping of those classes to the algorithm specification given above, and the specification of a testing process for each class.
Dec 3, 2019
23:59
Final Project Report Due Dec 19, 2019
23:59