[Wiki] Lesson 7: Policy Gradient Methods



Materials

  1. Lecture Video
  2. Lecture Slides

Derivations for the policy gradient update:

Gradient Temporal Difference:
Paper: http://proceedings.mlr.press/v24/silver12a/silver12a.pdf

Compatible Function Approximation:
Paper: http://proceedings.mlr.press/v32/silver14.pdf
Slides: http://www.inf.ed.ac.uk/teaching/courses/rl/slides14/rl14.pdf
See slide 17 (slide number located via a Google search)
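
For orientation before opening the links above, here is a minimal sketch of the standard likelihood-ratio (score function) derivation of the policy gradient. The notation is an assumption and not taken from the lecture or the linked papers: \tau is a trajectory (s_0, a_0, ..., s_T), R(\tau) is its total return, p_\theta(\tau) is the trajectory distribution under the policy \pi_\theta, and the environment dynamics are assumed not to depend on \theta.

```latex
\begin{align}
J(\theta) &= \mathbb{E}_{\tau \sim p_\theta}\left[ R(\tau) \right]
           = \int p_\theta(\tau)\, R(\tau)\, d\tau \\
\nabla_\theta J(\theta)
          &= \int \nabla_\theta p_\theta(\tau)\, R(\tau)\, d\tau
           = \int p_\theta(\tau)\, \nabla_\theta \log p_\theta(\tau)\, R(\tau)\, d\tau \\
          &= \mathbb{E}_{\tau \sim p_\theta}\left[ R(\tau)
             \sum_{t=0}^{T-1} \nabla_\theta \log \pi_\theta(a_t \mid s_t) \right]
\end{align}
```

The last line follows because p_\theta(\tau) = p(s_0) \prod_t \pi_\theta(a_t \mid s_t)\, p(s_{t+1} \mid s_t, a_t), so only the policy terms survive when differentiating \log p_\theta(\tau) with respect to \theta.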

Additional Resources & Exercises

  1. Notes, reading materials & exercises from dennybritz
  2. Notes from dalmia