Implementations of various RL Algorithms
Contents:
Implementations of selected algorithms from the Sutton and Barto book. I tried to keep the code snippets minimal and faithful to the book.
Part I: Tabular Solution Methods
- Chapter 2: Multi-armed Bandits
- Chapter 4: Dynamic Programming
- Chapter 5: Monte Carlo Methods
- Chapter 6: Temporal-Difference Learning

Part II: Approximate Solution Methods
- Chapter 9: On-Policy Prediction with Approximation
- Chapter 10: On-Policy Control with Approximation
A somewhat more in-depth explanation of selected concepts from David Silver's lectures and the Sutton and Barto book.