Module renforce::trainer
Trainer Module
Structs
CrossEntropy
Cross Entropy method for parameter selection.
DynaQ
Represents an OnlineTrainer for Q-functions. Uses the Dyna-Q algorithm.
FittedQIteration
BatchTrainer for Q-functions. Uses Fitted Q Iteration.
LSPolicyIteration
Least-squares Policy Iteration method.
PolicyGradient
A variation of the Vanilla Policy Gradient algorithm.
QLearner
Represents an OnlineTrainer for Q-functions. Uses the Q-learning algorithm.
SARSALearner
Represents an OnlineTrainer for Q-functions. Uses the SARSA algorithm.
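
QLearner and SARSALearner both train Q-functions online; the underlying algorithms differ only in how the bootstrap target is formed. The sketch below illustrates that difference on a small tabular Q-function. It is not the crate's implementation: `QTable`, `q_learning_update`, `sarsa_update`, and the `alpha` (learning rate) and `gamma` (discount factor) parameters are hypothetical names chosen for illustration.

```rust
/// Hypothetical tabular Q-function; not a type from renforce.
struct QTable {
    values: Vec<Vec<f64>>, // values[state][action]
}

impl QTable {
    fn new(n_states: usize, n_actions: usize) -> Self {
        QTable { values: vec![vec![0.0; n_actions]; n_states] }
    }

    /// Largest action value available in `state`.
    fn max_value(&self, state: usize) -> f64 {
        self.values[state]
            .iter()
            .cloned()
            .fold(f64::NEG_INFINITY, f64::max)
    }
}

/// Q-learning (off-policy): the target uses the best action in the next state.
fn q_learning_update(
    q: &mut QTable,
    s: usize,
    a: usize,
    r: f64,
    s_next: usize,
    alpha: f64,
    gamma: f64,
) {
    let target = r + gamma * q.max_value(s_next);
    q.values[s][a] += alpha * (target - q.values[s][a]);
}

/// SARSA (on-policy): the target uses the action actually taken next.
fn sarsa_update(
    q: &mut QTable,
    s: usize,
    a: usize,
    r: f64,
    s_next: usize,
    a_next: usize,
    alpha: f64,
    gamma: f64,
) {
    let target = r + gamma * q.values[s_next][a_next];
    q.values[s][a] += alpha * (target - q.values[s][a]);
}
```

The only difference between the two functions is the extra `a_next` argument: SARSA needs to know which action the agent will take next, while Q-learning assumes the greedy one.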
Traits
BatchTrainer
Represents a way to train an agent from a set of transitions.
EpisodicTrainer
Trains agents one episode at a time.
OnlineTrainer
Represents a way to train an agent online (by interacting with the environment).
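
The three traits differ in how much experience the trainer sees per update: a single transition, a fixed set of transitions, or one whole episode. The following is a minimal sketch of that split, not the crate's actual API. The method names and signatures, the `Transition` layout, and the `MeanRewardAgent` and `IncrementalMean` types are all assumptions made for illustration; the real traits in renforce may have different signatures.

```rust
/// One observed step (hypothetical layout, not renforce's type).
#[derive(Clone, Debug, PartialEq)]
struct Transition {
    state: usize,
    action: usize,
    reward: f64,
    next_state: usize,
}

/// Online: update the agent after every single transition.
trait OnlineTrainer<A> {
    fn train_step(&mut self, agent: &mut A, transition: &Transition);
}

/// Batch: update the agent from a fixed set of transitions.
trait BatchTrainer<A> {
    fn train(&mut self, agent: &mut A, transitions: &[Transition]);
}

/// Episodic: update the agent once per complete episode.
trait EpisodicTrainer<A> {
    fn train_episode(&mut self, agent: &mut A, episode: &[Transition]);
}

/// Toy agent that only tracks a running mean of observed rewards.
#[derive(Default)]
struct MeanRewardAgent {
    mean: f64,
    count: u64,
}

/// Toy trainer implementing the online and batch styles.
struct IncrementalMean;

impl OnlineTrainer<MeanRewardAgent> for IncrementalMean {
    fn train_step(&mut self, agent: &mut MeanRewardAgent, t: &Transition) {
        // Incremental mean: new_mean = mean + (x - mean) / n.
        agent.count += 1;
        agent.mean += (t.reward - agent.mean) / agent.count as f64;
    }
}

impl BatchTrainer<MeanRewardAgent> for IncrementalMean {
    fn train(&mut self, agent: &mut MeanRewardAgent, transitions: &[Transition]) {
        // A batch trainer can be built by replaying the online update.
        for t in transitions {
            self.train_step(agent, t);
        }
    }
}
```

Separating the traits this way lets one algorithm implement several training styles, as `IncrementalMean` does here, while callers depend only on the style they need.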