Lecture | Topic | Notes | Scribe |
Th Aug 28 | Class outline. MDPs. | pdf tex | Pieter Abbeel |
Tu Sep 2 | Dynamic programming, value iteration, contractions. | pdf tex png | Anand Kulkarni |
Th Sep 4 | Contractions, asynchronous value iteration | pdf tex | Yan Zhang |
Tu Sep 9 | Policy iteration, function approximation | pdf tex figures | Fernando Garcia Bermudez |
Th Sep 11 | Function approximation | pdf tex | Nimbus Goehausen |
Tu Sep 16 | LQR | pdf tex | Ankur Mehta |
Th Sep 18 | DDP | pdf tex | Brandon Basso |
Tu Sep 23 | Quadruped locomotion | zip | J. Zico Kolter |
Th Sep 25 | POMDP | pdf tex figures | Martin Moler Sorensen |
Tu Sep 30 | POMDP | pdf tex png | David Nachum |
Th Oct 2 | Bandits | pdf tex png | David Nachum |
Tu Oct 7 | Separation Principle, Dynamics Modeling | pdf tex | Pål From |
Th Oct 9 | Dynamics Modeling, Kalman Filtering | pdf tex | Andrew Wan |
Tu Oct 14 | Kalman Filtering | pdf tex | Jared Wood |
Th Oct 16 | Policy Gradient | pdf tex zip | Fernando Garcia Bermudez |
Tu Oct 21 | Policy Gradient | pdf tex zip | Jan Biermayer |
Th Oct 23 | TD, Sarsa, Q-learning | pdf tex | Yan Zhang |
Tu Oct 28 | TD, Sarsa, Q-learning, TD-Gammon | pdf tex figure | Anand Kulkarni |
Th Oct 30 | Reward Shaping | pdf tex | Pål From |
Tu Nov 4 | Exploration/Exploitation | pdf tex | Brandon Basso |
Th Nov 6 | No lecture | | |
Tu Nov 11 | Academic and Administrative Holiday | | |
Th Nov 13 | LP approach | pdf tex | Nimbus Goehausen |
Tu Nov 18 | Inverse reinforcement learning | pdf tex | Ankur Mehta |
Th Nov 20 | MPC, SLAM, Linearly solvable MDPs | pdf tex | Andrew Wan |
Tu Nov 25 | Learning to walk | pdf tex figures | Jared Wood |
Th Nov 27 | Happy Thanksgiving! | | |
Tu Dec 2 | Project presentations | | Martin Moler Sorensen: logistics czar |
Th Dec 4 | Project presentations | Jan, Jared, David, Brandon | Jan Biermayer: logistics czar |