Product API - Master's Thesis (Treball Final de Màster) by Rafael Jesús Castaño Ribes, January 2021
Derived methods, unless their documentation has been overridden, inherit the documentation of their superclasses. In those cases, authorship of the documentation belongs to the developer of the superclass.
experiments.off_p_tsb_experiment (..\src\experiments\off_p_tsb_experiment.py)
Train and Eval class for off-policy TD learning agents fed with batches of experiences from a replay buffer.
Attribution:
This class is an adaptation of the module train_eval.py from the SAC Agent examples from the TF-Agents library:
https://github.com/tensorflow/agents/blob/v0.6.0/tf_agents/agents/sac/examples/v2/train_eval.py
Copyright 2020 The TF-Agents Authors, licensed under the Apache License, Version 2.0.
Changes added to the original code:
The code has been given a class format and has been generalized to work with any off-policy TD-based TF-Agent, which must be provided by
subclasses via the method _get_the_agent().
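The deferred `_get_the_agent()` hook is an instance of the template-method pattern. The sketch below illustrates the idea with plain Python; the class and subclass names, constructor signature, and the placeholder return value are illustrative assumptions, not the thesis code — in the real module the hook would build and return a TF-Agents agent object.

```python
from abc import ABC, abstractmethod


class OffPolicyTSBExperiment(ABC):
    """Hypothetical sketch of the train/eval class: the concrete agent
    is supplied by subclasses via the template method _get_the_agent()."""

    def __init__(self, env_name):
        self.env_name = env_name
        # The base class drives training/evaluation; the agent comes
        # from whatever subclass is instantiated.
        self.agent = self._get_the_agent()

    @abstractmethod
    def _get_the_agent(self):
        """Subclasses return any off-policy TD-based TF-Agent."""


class DqnExperiment(OffPolicyTSBExperiment):
    def _get_the_agent(self):
        # In real code this would construct e.g. a tf_agents DqnAgent;
        # here a placeholder string stands in for the agent object.
        return "dqn_agent"


exp = DqnExperiment("CartPole-v1")
print(exp.agent)
```

Because `_get_the_agent()` is abstract, instantiating the base class directly raises `TypeError`, which enforces that every experiment names its agent.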
Moreover, a method called create_video has been added, allowing the class to generate a video by playing the obtained exploitation policy on the environment. This method has been adapted from the example code provided by
Holt, S. (2020). Value and Policy based agents: https://pathtopioneer.com/blog/2020/07/rl-4, licensed under a Creative Commons CC BY-NC-SA 4.0 license.
Classes
Data
absolute_import = _Feature((2, 5, 0, 'alpha', 1), (3, 0, 0, 'alpha', 0), 16384)
division = _Feature((2, 2, 0, 'alpha', 2), (3, 0, 0, 'alpha', 0), 8192)
print_function = _Feature((2, 6, 0, 'alpha', 2), (3, 0, 0, 'alpha', 0), 65536)