Lecture | Topic | Notes | Scribe |
Th Aug 28 | Class outline. MDPs. | pdf   tex | Pieter Abbeel |
Tu Sep 2 | Dynamic programming, value iteration, contractions. | pdf   tex   png | Anand Kulkarni |
Th Sep 4 | Contractions, asynchronous value iteration | pdf   tex   | Yan Zhang |
Tu Sep 9 | Policy iteration, function approximation | pdf   tex   figures | Fernando Garcia Bermudez |
Th Sep 11 | Function approximation | pdf   tex   | Nimbus Goehausen |
Tu Sep 16 | LQR | pdf   tex   | Ankur Mehta |
Th Sep 18 | DDP | pdf   tex   | Brandon Basso |
Tu Sep 23 | Quadruped locomotion | zip | J. Zico Kolter |
Th Sep 25 | POMDP | pdf   tex   figures | Martin Moler Sorensen |
Tu Sep 30 | POMDP | pdf   tex   png | David Nachum |
Th Oct 2 | Bandits | pdf   tex   png | David Nachum |
Tu Oct 7 | Separation Principle, Dynamics Modeling | pdf   tex   | Pål From |
Th Oct 9 | Dynamics Modeling, Kalman Filtering | pdf   tex   | Andrew Wan |
Tu Oct 14 | Kalman Filtering | pdf   tex   | Jared Wood |
Th Oct 16 | Policy Gradient | pdf   tex   zip | Fernando Garcia Bermudez |
Tu Oct 21 | Policy Gradient | pdf   tex   zip | Jan Biermayer |
Th Oct 23 | TD, Sarsa, Q-learning | pdf   tex   | Yan Zhang |
Tu Oct 28 | TD, Sarsa, Q-learning, TD-Gammon | pdf   tex   figure | Anand Kulkarni |
Th Oct 30 | Reward Shaping | pdf   tex   | Pål From |
Tu Nov 4 | Exploration/Exploitation | pdf   tex   | Brandon Basso |
Th Nov 6 | No lecture | | |
Tu Nov 11 | Academic and Administrative Holiday | | |
Th Nov 13 | LP approach | pdf   tex   | Nimbus Goehausen |
Tu Nov 18 | Inverse reinforcement learning | pdf   tex   | Ankur Mehta |
Th Nov 20 | MPC, SLAM, Linearly solvable MDPs | pdf   tex   | Andrew Wan |
Tu Nov 25 | Learning to walk | pdf   tex   figures | Jared Wood |
Th Nov 27 | Happy Thanksgiving! | | |
Tu Dec 2 | Project presentations | | Martin Moler Sorensen: logistics czar |
Th Dec 4 | Project presentations | Jan, Jared, David, Brandon | Jan Biermayer: logistics czar |