| Lecture | Topic | Notes | Scribe |
| Th Aug 28 | Class outline. MDPs. | pdf   tex | Pieter Abbeel |
| Tu Sep 2 | Dynamic programming, value iteration, contractions. | pdf   tex   png | Anand Kulkarni |
| Th Sep 4 | Contractions, asynchronous value iteration | pdf   tex   | Yan Zhang |
| Tu Sep 9 | Policy iteration, function approximation | pdf   tex   figures | Fernando Garcia Bermudez |
| Th Sep 11 | Function approximation | pdf   tex   | Nimbus Goehausen |
| Tu Sep 16 | LQR | pdf   tex   | Ankur Mehta |
| Th Sep 18 | DDP | pdf   tex   | Brandon Basso |
| Tu Sep 23 | Quadruped locomotion | zip | J. Zico Kolter |
| Th Sep 25 | POMDP | pdf   tex   figures | Martin Moler Sorensen |
| Tu Sep 30 | POMDP | pdf   tex   png | David Nachum |
| Th Oct 2 | Bandits | pdf   tex   png | David Nachum |
| Tu Oct 7 | Separation Principle, Dynamics Modeling | pdf   tex   | Pål From |
| Th Oct 9 | Dynamics Modeling, Kalman Filtering | pdf   tex   | Andrew Wan |
| Tu Oct 14 | Kalman Filtering | pdf   tex   | Jared Wood |
| Th Oct 16 | Policy Gradient | pdf   tex   zip | Fernando Garcia Bermudez |
| Tu Oct 21 | Policy Gradient | pdf   tex   zip | Jan Biermayer |
| Th Oct 23 | TD, Sarsa, Q-learning | pdf   tex   | Yan Zhang |
| Tu Oct 28 | TD, Sarsa, Q-learning, TD-Gammon | pdf   tex   figure | Anand Kulkarni |
| Th Oct 30 | Reward Shaping | pdf   tex   | Pål From |
| Tu Nov 4 | Exploration/Exploitation | pdf   tex   | Brandon Basso |
| Th Nov 6 | No lecture | | |
| Tu Nov 11 | Academic and Administrative Holiday | | |
| Th Nov 13 | LP approach | pdf   tex   | Nimbus Goehausen |
| Tu Nov 18 | Inverse reinforcement learning | pdf   tex   | Ankur Mehta |
| Th Nov 20 | MPC, SLAM, Linearly solvable MDPs | pdf   tex   | Andrew Wan |
| Tu Nov 25 | Learning to walk | pdf   tex   figures | Jared Wood |
| Th Nov 27 | Happy Thanksgiving! | | |
| Tu Dec 2 | Project presentations | | Martin Moler Sorensen: logistics czar |
| Th Dec 4 | Project presentations | Jan, Jared, David, Brandon | Jan Biermayer: logistics czar |