Module renforce::trainer
Trainer Module
Structs
| CrossEntropy | Cross Entropy method for parameter selection. |
| DynaQ | Represents an OnlineTrainer for Q-functions. Uses the Dyna-Q algorithm. |
| FittedQIteration | BatchTrainer for Q-functions. Uses Fitted Q Iteration. |
| LSPolicyIteration | Least-squares Policy Iteration method. |
| PolicyGradient | A variation of the Vanilla Policy Gradient algorithm. |
| QLearner | Represents an OnlineTrainer for Q-functions. Uses the Q-learning algorithm. |
| SARSALearner | Represents an OnlineTrainer for Q-functions. Uses the SARSA algorithm. |
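QLearner and SARSALearner both train Q-functions online, and the two algorithms differ only in the bootstrap target. The following is a minimal tabular sketch of the textbook update rules, not the crate's implementation: `QTable` and all of its methods are hypothetical names introduced for illustration, states and actions are assumed to be plain `usize` indices, and `n_actions` is assumed to be at least 1.

```rust
use std::collections::HashMap;

/// Hypothetical tabular Q-function keyed by (state, action) indices.
/// Illustration only; this is not a type from `renforce`.
#[derive(Default)]
pub struct QTable {
    values: HashMap<(usize, usize), f64>,
}

impl QTable {
    /// Returns Q(s, a), treating unseen pairs as 0.0.
    pub fn get(&self, s: usize, a: usize) -> f64 {
        *self.values.get(&(s, a)).unwrap_or(&0.0)
    }

    /// Largest Q(s, a) over actions 0..n_actions (assumes n_actions >= 1).
    pub fn max_value(&self, s: usize, n_actions: usize) -> f64 {
        (0..n_actions)
            .map(|a| self.get(s, a))
            .fold(f64::NEG_INFINITY, f64::max)
    }

    /// Moves Q(s, a) toward `target` by step size `alpha`.
    fn step_toward(&mut self, s: usize, a: usize, target: f64, alpha: f64) {
        let q = self.get(s, a);
        self.values.insert((s, a), q + alpha * (target - q));
    }

    /// Q-learning (off-policy): bootstrap from the greedy action in `s_next`.
    pub fn q_learning_update(
        &mut self,
        s: usize,
        a: usize,
        r: f64,
        s_next: usize,
        n_actions: usize,
        alpha: f64,
        gamma: f64,
    ) {
        let target = r + gamma * self.max_value(s_next, n_actions);
        self.step_toward(s, a, target, alpha);
    }

    /// SARSA (on-policy): bootstrap from the action actually taken in `s_next`.
    pub fn sarsa_update(
        &mut self,
        s: usize,
        a: usize,
        r: f64,
        s_next: usize,
        a_next: usize,
        alpha: f64,
        gamma: f64,
    ) {
        let target = r + gamma * self.get(s_next, a_next);
        self.step_toward(s, a, target, alpha);
    }
}
```

The only difference between the two updates is the target: Q-learning takes the maximum over next actions, while SARSA uses the next action the agent actually selected.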
Traits
| BatchTrainer | Represents a way to train an agent from a set of transitions. |
| EpisodicTrainer | Trains agents one "episode" at a time. |
| OnlineTrainer | Represents a way to train an agent online (by interacting with the environment). |
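The distinction between online and batch training is mostly one of calling pattern: an online trainer sees one transition at a time as the agent interacts with the environment, while a batch trainer receives a whole set of transitions at once. The sketch below illustrates that contrast under stated assumptions. `Step`, `OnlineSketch`, `BatchSketch`, `MeanReward`, and `Averager` are all hypothetical names and do not reflect the crate's actual trait signatures, and the "agent" is a toy that only tracks the mean reward per action.

```rust
/// One observed transition. Hypothetical type, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Step {
    pub state: usize,
    pub action: usize,
    pub reward: f64,
    pub next_state: usize,
}

/// Hypothetical online trainer: updates the agent after every single step.
pub trait OnlineSketch<A> {
    fn train_step(&mut self, agent: &mut A, step: Step);
}

/// Hypothetical batch trainer: updates the agent from a fixed set of steps.
pub trait BatchSketch<A> {
    fn train(&mut self, agent: &mut A, steps: &[Step]);
}

/// Toy agent that tracks the mean reward observed for each action.
pub struct MeanReward {
    sums: Vec<f64>,
    counts: Vec<u32>,
}

impl MeanReward {
    pub fn new(n_actions: usize) -> Self {
        MeanReward {
            sums: vec![0.0; n_actions],
            counts: vec![0; n_actions],
        }
    }

    /// Mean reward seen for `action`, or 0.0 if it was never taken.
    pub fn mean(&self, action: usize) -> f64 {
        if self.counts[action] == 0 {
            0.0
        } else {
            self.sums[action] / f64::from(self.counts[action])
        }
    }

    /// Folds one step into the running totals.
    /// Assumes `step.action` is smaller than the `n_actions` given to `new`.
    fn record(&mut self, step: Step) {
        self.sums[step.action] += step.reward;
        self.counts[step.action] += 1;
    }
}

/// A trainer that supports both calling patterns.
pub struct Averager;

impl OnlineSketch<MeanReward> for Averager {
    fn train_step(&mut self, agent: &mut MeanReward, step: Step) {
        agent.record(step);
    }
}

impl BatchSketch<MeanReward> for Averager {
    fn train(&mut self, agent: &mut MeanReward, steps: &[Step]) {
        for &step in steps {
            agent.record(step);
        }
    }
}
```

In this toy case both patterns produce the same estimate. An episodic trainer would sit between the two, consuming the transitions of one complete episode per call.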