Machine learning has emerged as a key approach to solving complex cognition and learning problems. Deep neural networks, in particular, have become pervasive due to their successes across a variety of applications, including computer vision, speech recognition, and natural language processing. While machine learning algorithms deliver impressive accuracy in many deployment scenarios, their computational complexity also poses unique challenges for state-of-the-art hardware design.
To this end, this course is designed to help students come up to speed on various aspects of hardware for machine learning, including the basics of deep learning, deep learning frameworks, hardware accelerators, co-optimization of algorithms and hardware, training and inference, and support for state-of-the-art deep learning networks. In particular, the course is structured around building hardware prototypes for machine learning systems using state-of-the-art platforms (e.g., FPGAs and ASICs). It is also a seminar-style course, so students are expected to present, discuss, and interact with research papers. At the end of the semester, students will present their work from a class research project.
| Component | Weight |
|---|---|
| Readings | 10% |
| Labs | 40% |
| Projects | 50% |
| Course participation (e.g., Piazza, Gemmini/Chipyard code contributions) | 5% (extra credit) |
| Week | Date | Lecture Topic | Readings | Paper Review (by Wednesday) | Labs/Projects |
|---|---|---|---|---|---|
| 1 | 1/20 | Class Organization & Introduction (slides, recording) | | | |
| 2 | 1/25 | Introduction to DNNs (slides, recording) | AlexNet, NeurIPS'2012 | [Submit your review here.] | Lab 1 (due 2/5, solution) |
| | 1/27 | Introduction to DNNs 2 (slides, recording) | | | |
| 3 | 2/1 | Quantization (slides, recording) | Integer-Arithmetic-Only Inference, CVPR'2018 | [Submit your review here.] | |
| | 2/3 | Kernel Computation (slides, recording) | | | |
| 4 | 2/8 | Dataflow (slides, recording) | cuDNN, arXiv'2014 | [Submit your review here.] | Lab 2 (due 2/19) |
| | 2/10 | Accelerator (slides, recording) | | | |
| 5 | 2/15 | Presidents' Day. No class! | TPU, ISCA'2017 | [Submit your review here.] | |
| | 2/17 | Guest Lecture: The Current State of Neural Network Quantization, Amir Gholami, UC Berkeley (slides, recording) | | | |
| 6 | 2/22 | Chipyard/FireSim Overview and Setup, Abraham Gonzalez, UC Berkeley (slides, recording) | FireSim, ISCA'2018 | [Submit your review here.] | Lab 3 (due 3/5) |
| | 2/24 | Guest Lecture: Systolic Array and Tensorization: Key Components of a Deep-Learning Accelerator, Ron Diamant & Randy Huang, AWS (slides) | | | |
| 7 | 3/1 | Mapping (slides, recording) | PHiPAC, ICS'1997 | Optional reading. No review required. | |
| | 3/3 | Data Orchestration (slides, recording) | | | |
| 8 | 3/8 | Sparsity (slides, recording) | SCNN, ISCA'2017 | [Submit your review here.] | |
| | 3/10 | Co-Design (slides, recording) | | | |
| 9 | 3/15 | Guest Lecture: Configurable Cloud-Scale Real-Time Deep Learning, Bita Rouhani, Microsoft (slides) | No reading this week. | | |
| | 3/17 | Other Operators & Near-Data (slides, recording) | | | |
| | 3/22 | Spring break! No class! | No reading this week. | | |
| | 3/24 | Spring break! No class! | | | |
| 10 | 3/29 | Guest Lecture: Accelerating Software 2.0, Yaqi Zhang, SambaNova (slides) | No reading this week. | | |
| | 3/31 | Training (slides, recording) | | | |
| 11 | 4/5 | Accelerator-Level Parallelism (slides, recording) | No reading this week. | | |
| | 4/7 | Guest Lecture: Science to Fuel Neural Nets and TPU Design, Cliff Young, Google (recording) | | | |
| 12 | 4/12 | Guest Lecture: The Future of ML is Tiny and Bright, Vijay Janapa Reddi, Harvard University (slides, recording) | No reading this week. | | |
| | 4/14 | Advanced Technology (slides, recording) | | | |
| 13 | 4/19 | Guest Lecture: Problems Facing Analog and In-Memory Computing, Brian Zimmer, NVIDIA (recording) | No reading this week. | | |
| | 4/21 | End-to-End Deployment (slides, recording) | | | |
| 14 | 4/26 | Conclusion (slides, recording) | No reading this week. | | |
| | 4/28 | Open office hour (no lecture). | | | |