Prof. Dr. Sebastian Peitz
Chair of Safe Autonomous Systems, TU Dortmund
Goal: find a policy that maximizes the sum of future rewards!
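To make "sum of future rewards" concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption chosen for illustration and not taken from the lecture: the five-state corridor, the reward values, the horizon, and the names `step` and `rollout_return`. It compares two fixed policies by the total reward each one collects and picks the better one.

```python
# Hypothetical 1D grid world (an assumption): states 0..4, goal at state 4.
N_STATES = 5
GOAL = N_STATES - 1
STEP_REWARD = -1.0   # assumed cost for every move that does not reach the goal
GOAL_REWARD = 10.0   # assumed reward for reaching the goal


def step(state, action):
    """Apply an action (-1 = left, +1 = right); return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)  # walls clip the movement
    if next_state == GOAL:
        return next_state, GOAL_REWARD, True
    return next_state, STEP_REWARD, False


def rollout_return(policy, start_state=0, horizon=10):
    """Sum of rewards collected by following `policy` for at most `horizon` steps."""
    state, total = start_state, 0.0
    for _ in range(horizon):
        state, reward, done = step(state, policy(state))
        total += reward
        if done:
            break
    return total


# Two hand-written candidate policies, each mapping a state to an action.
policies = {
    "always_left": lambda state: -1,
    "always_right": lambda state: +1,
}

returns = {name: rollout_return(pi) for name, pi in policies.items()}
best = max(returns, key=returns.get)
print(returns, best)
```

In this toy setting, "always_right" reaches the goal after four steps and collects the larger sum of rewards, so it is the better of the two candidates. Real problems have far too many policies to enumerate like this, which is why learning methods are needed.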
Grid world
Chess board
Pendulum
Humanoid
Atari: Donkey Kong
In all chapters, we will distinguish between two core learning tasks in reinforcement learning:
By the end of the course, we will
Lectures
Exercises
Format:
Content: