Module renforce::trainer

Trainer Module

Structs

CrossEntropy	Cross Entropy method for parameter selection
DynaQ	Represents an OnlineTrainer for Q-functions. Uses the Dyna-Q algorithm
FittedQIteration	BatchTrainer for Q-functions. Uses Fitted Q Iteration
LSPolicyIteration	Least-squares Policy Iteration method
PolicyGradient	A variation of the Vanilla Policy Gradient algorithm
QLearner	Represents an OnlineTrainer for Q-functions. Uses the Q-learning algorithm
SARSALearner	Represents an OnlineTrainer for Q-functions. Uses the SARSA algorithm
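
QLearner and SARSALearner both train Q-functions online, and the two algorithms differ only in the value they bootstrap from. The sketch below shows that difference on a plain table. It is not the renforce API: QTable, N_ACTIONS, max_q, q_learning_update and sarsa_update are hypothetical names introduced for illustration, and a fixed two-action table is assumed.

```rust
// Hypothetical tabular sketch; none of these names come from renforce.
const N_ACTIONS: usize = 2;

/// Q-table indexed as q[state][action].
type QTable = Vec<[f64; N_ACTIONS]>;

/// Largest action value available in `state`.
fn max_q(q: &QTable, state: usize) -> f64 {
    q[state].iter().cloned().fold(f64::NEG_INFINITY, f64::max)
}

/// Q-learning (off-policy): bootstrap from the best action in the next state.
fn q_learning_update(
    q: &mut QTable,
    s: usize,
    a: usize,
    r: f64,
    s_next: usize,
    alpha: f64, // learning rate
    gamma: f64, // discount factor
) {
    let target = r + gamma * max_q(q, s_next);
    q[s][a] += alpha * (target - q[s][a]);
}

/// SARSA (on-policy): bootstrap from the action actually taken in the next state.
fn sarsa_update(
    q: &mut QTable,
    s: usize,
    a: usize,
    r: f64,
    s_next: usize,
    a_next: usize,
    alpha: f64, // learning rate
    gamma: f64, // discount factor
) {
    let target = r + gamma * q[s_next][a_next];
    q[s][a] += alpha * (target - q[s][a]);
}
```

When the action taken next is not the greedy one, the two updates move the same table entry toward different targets, which is the practical distinction between the two learners.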

Traits

BatchTrainer	Represents a way to train an agent from a set of transitions
EpisodicTrainer	Trains agents one "episode" at a time
OnlineTrainer	Represents a way to train an agent online (by interacting with the environment)
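
The split between batch and online training can be illustrated with a small sketch. It makes no claim about renforce's actual trait signatures: Transition, BatchTrain, OnlineTrain and RewardMean are hypothetical names, and the "learner" here only averages rewards so the example stays short.

```rust
// Hypothetical sketch; these are not the renforce traits.

/// One observed step of interaction with an environment.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Transition {
    state: usize,
    action: usize,
    reward: f64,
    next_state: usize,
}

/// Batch-style interface: learn from a fixed set of transitions.
trait BatchTrain {
    fn train(&mut self, transitions: &[Transition]);
}

/// Online-style interface: learn from one transition as it arrives.
trait OnlineTrain {
    fn train_step(&mut self, transition: Transition);
}

/// Toy learner that tracks the mean reward it has seen.
struct RewardMean {
    total: f64,
    count: u32,
}

impl RewardMean {
    fn new() -> Self {
        RewardMean { total: 0.0, count: 0 }
    }

    /// Mean reward so far, or 0.0 before any transition is seen.
    fn mean(&self) -> f64 {
        if self.count == 0 {
            0.0
        } else {
            self.total / self.count as f64
        }
    }
}

impl OnlineTrain for RewardMean {
    fn train_step(&mut self, transition: Transition) {
        self.total += transition.reward;
        self.count += 1;
    }
}

impl BatchTrain for RewardMean {
    // A batch pass here simply replays the stored transitions one by one.
    fn train(&mut self, transitions: &[Transition]) {
        for &t in transitions {
            self.train_step(t);
        }
    }
}
```

In this sketch the batch implementation reuses the online step. A real batch method such as Fitted Q Iteration would instead fit a model to the whole set at once.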