Struct renforce::trainer::LSPolicyIteration
pub struct LSPolicyIteration<F: Float> { /* fields omitted */ }
Least-squares Policy Iteration method
- Uses LSTD-Q for calculating the Q-function associated with a policy
- Only trains linear Q-functions (not currently enforced by library)
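The LSTD-Q step mentioned above can be sketched as follows. This is a minimal illustration, not the crate's implementation: the `Sample` struct and the `solve` and `lstdq` functions are hypothetical names, and the sketch assumes the features of the next state-action pair (chosen by the policy being evaluated) are already computed. For a linear Q-function Q(s, a) = w · φ(s, a), it accumulates A = Σ φ (φ − γ φ')ᵀ and b = Σ φ r over the samples, then solves A w = b.

```rust
/// One observed sample (hypothetical type, not part of renforce):
/// features of (s, a), the reward, and features of (s', pi(s')).
struct Sample {
    phi: Vec<f64>,
    reward: f64,
    phi_next: Vec<f64>,
}

/// Solves `a * x = b` by Gaussian elimination with partial pivoting.
/// Returns `None` if the matrix is numerically singular.
fn solve(mut a: Vec<Vec<f64>>, mut b: Vec<f64>) -> Option<Vec<f64>> {
    let n = b.len();
    for col in 0..n {
        // Choose the row with the largest absolute value in this column.
        let mut pivot = col;
        for row in (col + 1)..n {
            if a[row][col].abs() > a[pivot][col].abs() {
                pivot = row;
            }
        }
        if a[pivot][col].abs() < 1e-12 {
            return None;
        }
        a.swap(col, pivot);
        b.swap(col, pivot);
        // Eliminate the entries below the pivot.
        for row in (col + 1)..n {
            let factor = a[row][col] / a[col][col];
            for k in col..n {
                let upper = a[col][k];
                a[row][k] -= factor * upper;
            }
            let b_col = b[col];
            b[row] -= factor * b_col;
        }
    }
    // Back substitution on the upper-triangular system.
    let mut x = vec![0.0; n];
    for row in (0..n).rev() {
        let mut sum = b[row];
        for k in (row + 1)..n {
            sum -= a[row][k] * x[k];
        }
        x[row] = sum / a[row][row];
    }
    Some(x)
}

/// LSTD-Q: returns weights `w` such that Q(s, a) is approximated by `w . phi(s, a)`.
/// Returns `None` for an empty batch or a singular system.
fn lstdq(samples: &[Sample], gamma: f64) -> Option<Vec<f64>> {
    let k = samples.first()?.phi.len();
    let mut a = vec![vec![0.0; k]; k];
    let mut b = vec![0.0; k];
    for s in samples {
        for i in 0..k {
            for j in 0..k {
                // A += phi * (phi - gamma * phi_next)^T
                a[i][j] += s.phi[i] * (s.phi[j] - gamma * s.phi_next[j]);
            }
            // b += phi * reward
            b[i] += s.phi[i] * s.reward;
        }
    }
    solve(a, b)
}
```

Policy iteration would then alternate between calling a routine like `lstdq` to evaluate the current policy and making the policy greedy with respect to the resulting weights. A production version would typically use a linear algebra crate and some regularization of A rather than a hand-written solver.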
Methods
impl<F: Float> LSPolicyIteration<F>
fn new(gamma: F) -> LSPolicyIteration<F>
Constructs a new LSPolicyIteration with the given discount factor gamma
fn gamma(self, gamma: F) -> LSPolicyIteration<F>
Sets the gamma (discount factor) field, consuming self and returning the updated trainer
Trait Implementations
impl<F: Debug + Float> Debug for LSPolicyIteration<F>
impl<F: Float + 'static, S: Space, A: Space, T> BatchTrainer<S, A, T> for LSPolicyIteration<F> where T: Agent<S, A> + ParameterizedFunc<F> + FeatureExtractor<S, A, F>
fn train(&mut self, agent: &mut T, transitions: Vec<Transition<S, A>>)
Trains agent based on the observed transitions
impl<F: Float> Default for LSPolicyIteration<F>
fn default() -> LSPolicyIteration<F>
Creates a new LSPolicyIteration with gamma = 0.99