What is a value-gradient?
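[Figure: a section of the State-Space View. The grey-scale background shows the value-function; the cyan lines show the value gradients.]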
The above diagram is a section of the State-Space View, with the grey-scale background showing the values of the value-function. The value gradients (the cyan lines) point in the direction in which the grey scale most rapidly becomes whiter, that is, the direction of steepest increase of the value-function. In this diagram that direction is down and to the right. Mathematically, the value gradient is the “grad” operator applied to the value-function. Value gradients are important because they determine which actions the greedy policy chooses: the greedy policy aims to move the spacecraft in whichever direction is “best”, and the value-function represents the estimate of the “goodness” of any state. This is the simple reason why value gradients are the best way to learn the value-function.
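The link between the value gradient and the greedy choice can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the value-function from the diagram: the function `value` is a made-up linear value-function that grows towards the right and downwards, and the "spacecraft" is assumed to be able to take a small step in any direction. The names `value`, `value_gradient` and `greedy_direction` are hypothetical.

```python
import math

def value(x, y):
    # Hypothetical value-function (an assumption, not the one in the diagram).
    # y is measured upwards, so the value grows to the right and downwards.
    return 2.0 * x - 1.0 * y

def value_gradient(v, x, y, h=1e-5):
    # Central-difference estimate of "grad" applied to v at the state (x, y).
    dv_dx = (v(x + h, y) - v(x - h, y)) / (2.0 * h)
    dv_dy = (v(x, y + h) - v(x, y - h)) / (2.0 * h)
    return dv_dx, dv_dy

def greedy_direction(v, x, y, n_directions=360, step=0.01):
    # Greedy choice under the simplifying assumption that the agent can take
    # a small step in any direction: try many directions and keep the one
    # whose successor state has the highest value.
    angles = (2.0 * math.pi * k / n_directions for k in range(n_directions))
    best = max(
        angles,
        key=lambda a: v(x + step * math.cos(a), y + step * math.sin(a)),
    )
    return math.cos(best), math.sin(best)

# Compare the normalised value gradient with the greedy direction at the origin.
gx, gy = value_gradient(value, 0.0, 0.0)
norm = math.hypot(gx, gy)
unit_gradient = (gx / norm, gy / norm)
greedy = greedy_direction(value, 0.0, 0.0)

# Cosine of the angle between the two directions (close to 1 when they agree).
alignment = unit_gradient[0] * greedy[0] + unit_gradient[1] * greedy[1]
print("value gradient:", (gx, gy))
print("greedy direction:", greedy)
print("alignment:", alignment)
```

In this toy setting the greedy direction lines up with the value gradient to within the one-degree resolution of the search, which is the sense in which the gradient, rather than the absolute value, drives the greedy policy's choice. In a real control problem the available actions and the dynamics would constrain which directions can actually be reached.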