Talks and tutorials on RL
Introduction to Reinforcement Learning with Function Approximation: A tutorial given at NIPS 2015 by Richard Sutton.
Policy Search: Methods and Applications: A tutorial given at ICML 2015 by Jan Peters and Gerhard Neumann.
Representation and Learning Methods for Complex Outputs: A talk given at NIPS 2014 by Richard Sutton.
Value and Q-value recursion
The expected reward can be encoded in two forms, the v-function and the q-function.
The v-function is the expected cumulative reward from a given state, whilst the q-function is the expected cumulative reward from a given state and action. The recursive form of both functions can be derived from first principles, and it can be shown that the v-function can be written in terms of the q-function.
- See RVQ.pdf for the derivation of the recursion and the link between both functional forms.
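
As a rough numerical illustration of the recursion and of the link between the two functions, the sketch below evaluates a fixed policy on a tiny MDP. Everything in it is an assumption made for illustration and is not taken from RVQ.pdf: the two-state MDP, the discount factor `GAMMA`, the uniform policy `PI`, and the helper names `q_from_v`, `v_from_q` and `evaluate_policy`. It uses the standard relations v(s) = sum_a pi(a|s) q(s,a) and q(s,a) = sum_s' P(s'|s,a) [r + gamma v(s')], whose notation may differ from the one used in the derivation.

```python
# Hypothetical 2-state, 2-action MDP; all numbers are illustrative assumptions.
GAMMA = 0.9          # discount factor (assumed)
STATES = [0, 1]
ACTIONS = [0, 1]

# P[s][a] is a list of (next_state, probability, reward) triples.
# Action 0 always leads to state 0, action 1 always leads to state 1.
P = {
    0: {0: [(0, 1.0, 0.0)], 1: [(1, 1.0, 1.0)]},
    1: {0: [(0, 1.0, 0.0)], 1: [(1, 1.0, 2.0)]},
}

# Uniform random policy: PI[s][a] is the probability of action a in state s.
PI = {s: {a: 0.5 for a in ACTIONS} for s in STATES}


def q_from_v(v, s, a):
    """q(s, a) = sum over s' of P(s'|s,a) * (r + gamma * v(s'))."""
    return sum(p * (r + GAMMA * v[s2]) for s2, p, r in P[s][a])


def v_from_q(q, s):
    """v(s) = sum over a of pi(a|s) * q(s, a)."""
    return sum(PI[s][a] * q[s][a] for a in ACTIONS)


def evaluate_policy(tol=1e-10, max_iters=10000):
    """Apply the two relations repeatedly until v stops changing."""
    v = {s: 0.0 for s in STATES}
    for _ in range(max_iters):
        q = {s: {a: q_from_v(v, s, a) for a in ACTIONS} for s in STATES}
        new_v = {s: v_from_q(q, s) for s in STATES}
        delta = max(abs(new_v[s] - v[s]) for s in STATES)
        v = new_v
        if delta < tol:
            break
    # Recompute q from the final v so the returned pair is consistent.
    q = {s: {a: q_from_v(v, s, a) for a in ACTIONS} for s in STATES}
    return v, q


v, q = evaluate_policy()
print(v)
print(q)
```

For this particular toy MDP the recursion can also be solved by hand, giving v(0) = 7.25 and v(1) = 7.75, which the iteration should approach. Averaging the returned q-values over the policy with `v_from_q` recovers the same v, which is the sense in which the v-function is a function of the q-function.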