Live course

Fundamentals of Reinforcement Learning

Bandits, Bellman equations, Monte Carlo, TD learning, and planning.

An applied, foundations-first introduction to reinforcement learning for engineers and technically curious learners.

The course uses a small set of concrete environments and works through each algorithm carefully, often with actual numbers rather than abstract notation alone.

It starts with bandits and Markov decision processes, then moves through Bellman equations, dynamic programming, Monte Carlo methods, temporal-difference learning, Q-learning, and tabular planning.

What the live course covers

The current Udemy syllabus follows the live MVP course structure in Notion, moving from the simplest bandit problems through Bellman equations, dynamic programming, Monte Carlo methods, TD learning, and planning.

Introduction and RL framing

Start with the reinforcement-learning setup itself before moving into algorithms.

  • Introduction // tour of the repo
  • Goals, rewards, returns and episodes
  • From bandits to MDPs
  • From bandits to MDPs // real world examples
  • From bandits to MDPs // Frozen Lake walkthrough
  • Policies and value functions

K-armed bandits

Build intuition for explore-exploit tradeoffs before adding state.

  • Section intro
  • Environment intro // KAB testbed
  • Environment walkthrough // KAB testbed
  • Hands on: K-armed bandits
  • Action-value methods // part 1 - greedy
  • Action-value methods // part 2 - epsilon-greedy
  • Action-value methods // part 3 - efficient implementation details
  • Optimistic initial values
  • Upper-confidence-bound action selection
  • Non-stationary bandits
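As a rough illustration of the explore-exploit idea in this section, here is a minimal epsilon-greedy sketch with an incremental sample-average update. It is not taken from the course repo: the function name run_bandit, the Gaussian reward model with unit variance, and the default parameters are all assumptions made for this example.

```python
import random


def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy on a k-armed bandit (hypothetical testbed).

    Rewards are assumed to be Gaussian with unit variance around each
    arm's true mean; that reward model is an assumption of this sketch.
    """
    rng = random.Random(seed)
    k = len(true_means)
    q = [0.0] * k   # action-value estimates
    n = [0] * k     # pull counts per arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(k)  # explore: uniform random arm
        else:
            best = max(q)
            # exploit: break ties between equally good arms at random
            a = rng.choice([i for i, v in enumerate(q) if v == best])
        r = rng.gauss(true_means[a], 1.0)
        n[a] += 1
        # incremental mean: avoids storing the full reward history
        q[a] += (r - q[a]) / n[a]
        total += r
    return q, n, total / steps


if __name__ == "__main__":
    estimates, counts, avg_reward = run_bandit([0.0, 0.5, 1.5])
    print(estimates, counts, avg_reward)
```

Replacing the 1 / n[a] step size with a constant is the usual adjustment for non-stationary problems, since it weights recent rewards more heavily.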

Markov decision processes and Bellman equations

Move from bandits to full sequential decision problems with value functions and Bellman reasoning.

  • Section intro
  • Bellman equations // P1: expectation equation for v(s)
  • Bellman equations // P2: expectation equation for q(s, a)
  • Bellman equations // P3: optimality equation
  • Walkthrough // Bellman expectation
  • Walkthrough // Bellman optimality
  • Frozen Lake - Bellman equation analytic solutions
  • Walkthrough // Matrix inversion and summary
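To give a flavour of the analytic solution, the Bellman expectation equation for a fixed policy can be written as v = r + gamma * P v, which rearranges to (I - gamma * P) v = r. The sketch below solves that system for a made-up two-state chain using the closed-form 2x2 inverse. The chain, the rewards, and the function name solve_two_state are assumptions for illustration, not the course's Frozen Lake solution.

```python
def solve_two_state(P, r, gamma):
    """Solve (I - gamma * P) v = r exactly for a two-state chain.

    P[s][t] is the transition probability from s to t under a fixed
    policy and r[s] is the expected immediate reward in state s.
    """
    # Entries of A = I - gamma * P
    a, b = 1 - gamma * P[0][0], -gamma * P[0][1]
    c, d = -gamma * P[1][0], 1 - gamma * P[1][1]
    det = a * d - b * c
    # v = A^{-1} r, with A^{-1} = (1 / det) * [[d, -b], [-c, a]]
    v0 = (d * r[0] - b * r[1]) / det
    v1 = (a * r[1] - c * r[0]) / det
    return [v0, v1]


if __name__ == "__main__":
    # Hypothetical chain: state 0 pays reward 1 and moves to the
    # absorbing, zero-reward state 1 half of the time.
    P = [[0.5, 0.5], [0.0, 1.0]]
    r = [1.0, 0.0]
    print(solve_two_state(P, r, gamma=0.9))
```

For this chain the result can be checked by hand: v(1) = 0 and v(0) = 1 / (1 - 0.5 * 0.9) = 1 / 0.55, roughly 1.818.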

Dynamic programming

Solve known-model environments by alternating evaluation and improvement.

  • Section intro
  • Environment intro // Jack's Car Rental
  • Environment walkthrough // Jack's Car Rental
  • Hands on: Jack's Car Rental
  • Intro to dynamic programming
  • Policy evaluation // intro
  • Policy evaluation // walkthrough
  • Policy improvement // intro & proof
  • Policy improvement // walkthrough
  • Policy iteration
  • Value iteration // intro
  • Value iteration // walkthrough

Monte Carlo methods

Learn model-free prediction and control from full returns and sample episodes.

  • Section intro
  • Monte Carlo example (pentagram)
  • Intro to Monte Carlo
  • Environment intro // Blackjack
  • Environment walkthrough // Blackjack
  • Hands on: Blackjack
  • Monte Carlo prediction
  • MC Control: Exploring Starts
  • MC Control: On Policy
  • MC Control: Off Policy // prelims
  • MC Control: Off Policy // prediction & control
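The core of Monte Carlo prediction is averaging full returns observed after visiting a state. The following sketch shows first-visit MC prediction on hand-written episodes. The episode format (a list of (state, reward) pairs, where the reward is the one received on leaving that state) and the function names are assumptions for this example, not the course's Blackjack code.

```python
from collections import defaultdict


def discounted_returns(rewards, gamma):
    """Return G_t for every step t, computed backwards in one pass."""
    g = 0.0
    out = [0.0] * len(rewards)
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g
        out[t] = g
    return out


def first_visit_mc(episodes, gamma):
    """Estimate v(s) as the mean return following first visits to s."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for episode in episodes:
        states = [s for s, _ in episode]
        returns = discounted_returns([r for _, r in episode], gamma)
        seen = set()
        for s, g in zip(states, returns):
            if s in seen:
                continue  # only the first visit in each episode counts
            seen.add(s)
            totals[s] += g
            counts[s] += 1
    return {s: totals[s] / counts[s] for s in totals}


if __name__ == "__main__":
    episodes = [
        [("A", 1.0), ("B", 0.0), ("A", 2.0)],
        [("B", 4.0)],
    ]
    print(first_visit_mc(episodes, gamma=0.5))  # {'A': 1.5, 'B': 2.5}
```

In the first episode the returns are 1.5, 1.0, and 2.0; the second visit to A is ignored, so v(A) = 1.5 and v(B) averages 1.0 and 4.0.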

TD learning and planning

Bridge the Monte Carlo and dynamic-programming views, then add model-based planning.

  • Section intro - screencast // part 1 - preamble
  • Section intro - screencast // part 2 - formal intro
  • Intro to TD
  • Environment walkthrough // Cliff Walking
  • Hands on: Cliff Walking
  • TD prediction
  • Q-learning
  • Intro to Planning
  • Tabular Dyna-Q // algo intro
  • Tabular Dyna-Q // walkthrough
  • Tabular Dyna-Q+
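For a sense of what the TD part of this section builds towards, here is a minimal tabular Q-learning sketch. It uses a hypothetical four-state corridor instead of Cliff Walking so that it stays short. The environment, the hyperparameters, and the function names are assumptions, and the planning step that Dyna-Q adds on top is not shown.

```python
import random

N_STATES = 4        # hypothetical corridor: states 0..3, state 3 is terminal
ACTIONS = (-1, 1)   # action index 0 = left, index 1 = right


def step(s, a_idx):
    """Deterministic move with a constant step cost of -1."""
    s2 = min(max(s + ACTIONS[a_idx], 0), N_STATES - 1)
    return s2, -1.0, s2 == N_STATES - 1


def q_update(Q, s, a, r, s2, done, alpha, gamma):
    """One Q-learning update: bootstrap from the best next action."""
    target = r if done else r + gamma * max(Q[s2])
    Q[s][a] += alpha * (target - Q[s][a])


def train(episodes=2000, alpha=0.5, gamma=1.0, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: Q[s][i])
            s2, r, done = step(s, a)
            q_update(Q, s, a, r, s2, done, alpha, gamma)
            s = s2
    return Q


def greedy_policy(Q):
    """Greedy action index for every non-terminal state."""
    return [max(range(len(ACTIONS)), key=lambda i: Q[s][i])
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The update is off-policy: the behaviour is epsilon-greedy, but the target always uses the maximum over next-state action values.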

Close

  • Conclusion & congratulations

Student feedback

Selected written 5-star Udemy reviews. Many other ratings do not include text feedback.

Truly amazed and appreciate the effort that went in to make the concepts clear. I have looked at many RL courses till date, but by far this is simply the best course to understand the foundations of RL. Looking forward to more courses from Tom.

Raja | 4 months ago | 5 / 5

The course makes it easier for me to understand RL. Better than my university's class.

Rany | 6 months ago | 5 / 5

In reinforcement learning terms, me enrolling to this course was a bit of an "exploratory action" in my own learning policy :), I'm halfway through and I'm very happy with it. I'm working towards expanding my skill set as a game developer / creative technologist and it's been a great introduction to RL concepts.

Sina | 7 months ago | 5 / 5

In this course you have the opportunity to see how the concepts comes to play while coding.

Luan Assis | 5 months ago | 5 / 5

Course is very good to understand the math and main concept of RL. I wish we had more practical coding part of it. Assignments are good but coding sessions are always good.

Isaac Matthew | 10 months ago | 5 / 5

Very clear explanations of the theory behind the techniques, and a great code repo with exercises to cement and build on what's taught in the lectures. An excellent introduction to the world of Reinforcement Learning!

Mark | 1 year ago | 5 / 5

Absolutely amazing course. I have learnt so much about reinforcement learning! I would recommend to anyone wanting to learn more about AI. I had relatively low starting level but found the explanations really clear and helpful.

Jack | 1 year ago | 5 / 5

Solid.

Andrey | 1 year ago | 5 / 5