Module renforce::trainer

Trainer Module

Structs

CrossEntropy

Cross Entropy method for parameter selection

DynaQ

Represents an OnlineTrainer for Q-functions. Uses the Dyna-Q algorithm.

FittedQIteration

BatchTrainer for Q-functions. Uses Fitted Q Iteration.

LSPolicyIteration

Least-squares Policy Iteration method

PolicyGradient

A variation of the Vanilla Policy Gradient algorithm

QLearner

Represents an OnlineTrainer for Q-functions. Uses the Q-learning algorithm.

SARSALearner

Represents an OnlineTrainer for Q-functions. Uses the SARSA algorithm.
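
QLearner and SARSALearner are both described as training Q-functions, and the textbook algorithms they are named after differ only in the bootstrap target. The sketch below shows that difference on a plain table of values. It is not renforce's implementation: `QTable`, `max_q`, `q_learning_update`, and `sarsa_update` are hypothetical names chosen for illustration, and the crate's own Q-function types may look quite different.

```rust
/// Hypothetical tabular action-value function: q[state][action].
type QTable = Vec<Vec<f64>>;

/// Largest action value available in `state`.
fn max_q(q: &QTable, state: usize) -> f64 {
    q[state].iter().copied().fold(f64::NEG_INFINITY, f64::max)
}

/// Q-learning (off-policy): bootstrap from the best action in the next state.
fn q_learning_update(
    q: &mut QTable,
    s: usize,
    a: usize,
    r: f64,
    s_next: usize,
    alpha: f64, // learning rate
    gamma: f64, // discount factor
) {
    let target = r + gamma * max_q(q, s_next);
    q[s][a] += alpha * (target - q[s][a]);
}

/// SARSA (on-policy): bootstrap from the action actually chosen in the next state.
fn sarsa_update(
    q: &mut QTable,
    s: usize,
    a: usize,
    r: f64,
    s_next: usize,
    a_next: usize,
    alpha: f64, // learning rate
    gamma: f64, // discount factor
) {
    let target = r + gamma * q[s_next][a_next];
    q[s][a] += alpha * (target - q[s][a]);
}
```

With the same transition, the two updates diverge whenever the next action taken is not the greedy one, which is why SARSA needs the extra `a_next` argument.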

Traits

BatchTrainer

Represents a way to train an agent from a set of transitions

EpisodicTrainer

Trains agents one "episode" at a time

OnlineTrainer

Represents a way to train an agent online (by interacting with the environment)
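
The distinction between training from a set of transitions and training online can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Transition` layout, the `QTable` alias, and the `update`, `train_batch`, and `train_online` functions are hypothetical and are not the signatures of the BatchTrainer or OnlineTrainer traits.

```rust
/// Hypothetical record of one observed step.
#[derive(Clone, Copy)]
struct Transition {
    s: usize,
    a: usize,
    r: f64,
    s_next: usize,
}

/// Hypothetical tabular action-value function: q[state][action].
type QTable = Vec<Vec<f64>>;

/// One Q-learning style update from a single transition.
fn update(q: &mut QTable, t: Transition, alpha: f64, gamma: f64) {
    let best_next = q[t.s_next]
        .iter()
        .copied()
        .fold(f64::NEG_INFINITY, f64::max);
    q[t.s][t.a] += alpha * (t.r + gamma * best_next - q[t.s][t.a]);
}

/// Batch style: sweep a fixed, previously collected set of transitions.
fn train_batch(q: &mut QTable, data: &[Transition], sweeps: usize, alpha: f64, gamma: f64) {
    for _ in 0..sweeps {
        for &t in data {
            update(q, t, alpha, gamma);
        }
    }
}

/// Online style: obtain each transition by interacting (here, calling `step`),
/// then update immediately before the next interaction.
fn train_online<F>(q: &mut QTable, mut step: F, steps: usize, alpha: f64, gamma: f64)
where
    F: FnMut(&QTable) -> Transition,
{
    for _ in 0..steps {
        let t = step(&*q); // the "environment" may inspect the current values
        update(q, t, alpha, gamma);
    }
}
```

In this sketch the batch routine never touches an environment, while the online routine produces its data as it learns, so the current value estimates can influence which transitions are seen next.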