IE8571: Advanced Reinforcement Learning and Dynamic Programming

4 Credits

Markov Decision Processes (MDPs) form a rich class of mathematical models for sequential decision problems under uncertainty and provide a rigorous foundation for Reinforcement Learning (RL). The first part of this course will combine techniques from optimization and stochastics to build a modeling, theoretical, and algorithmic foundation for MDPs. Topics such as finite- and infinite-horizon MDPs; Bellman's equations of dynamic programming; value iteration, policy iteration, and linear programming-based solution algorithms; partially observable MDPs; robust MDPs; stochastic games; continuous-time MDPs; semi-Markov decision processes; and continuous-time deterministic control will be covered. The second part of the course will build on this foundation to introduce fundamental ideas and solution techniques in RL. These will include Monte Carlo Policy Iteration, Q-learning, Temporal-Difference Learning, and Neuro-Dynamic Programming. Prereq: knowledge of optimization and stochastic models at the undergraduate level and familiarity with a computer programming language such as Python.
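To give a flavor of one of the listed topics, here is a minimal sketch of value iteration on a toy two-state, two-action MDP. The transition probabilities, rewards, and discount factor below are illustrative placeholders, not material from the course:

```python
import numpy as np

# Toy MDP: P[a][s][s'] = probability of moving from s to s' under action a.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # action 0
    [[0.5, 0.5], [0.7, 0.3]],   # action 1
])
# R[s][a] = immediate reward for taking action a in state s.
R = np.array([
    [1.0, 0.0],
    [0.0, 2.0],
])
gamma = 0.9  # discount factor

V = np.zeros(2)
for _ in range(1000):
    # Bellman optimality update: Q(s,a) = R(s,a) + gamma * sum_s' P(s'|s,a) V(s')
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:  # stop once the value function converges
        V = V_new
        break
    V = V_new

policy = Q.argmax(axis=1)  # greedy policy with respect to the converged values
```

Because the Bellman operator is a contraction under discounting, the iterates converge geometrically to the optimal value function; the greedy policy extracted at the end is then optimal for this toy problem.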


All Instructors

Average grade: A- (3.787)
Most common grade: A (80%)

This total also includes data from semesters with unknown instructors.

25 students
  • Recommend: 5.67 / 6
  • Effort: 5.00 / 6
  • Understanding: 6.00 / 6
  • Interesting: 5.67 / 6
  • Activities: 5.67 / 6



      Gopher Grades is maintained by Social Coding, with data from Summer 2017 to Fall 2025 provided by the University in response to a public records request.

      Not affiliated with the University of Minnesota
