Product API - Master's Final Project
by Rafael Jesús Castaño Ribes, January 2021

Derived methods, when their documentation has not been overridden, inherit it from their superclasses. In those cases, the authorship of the documentation belongs to the developer of the superclass.
experiments.naf_experiment
..\src\experiments\naf_experiment.py

Train and Eval NAF.

 
Modules
       
tf_agents.networks.encoding_network
tf_agents.agents.sac.tanh_normal_projection_network
tensorflow
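The tf_agents modules listed above hint at how the preprocessing_* constructor parameters of the class below are consumed. As a minimal sketch (the exact wiring inside NafExperiment is an assumption, and the observation spec is hypothetical, chosen to match BipedalWalker-v3's 24-dimensional observations), an equivalent tf_agents EncodingNetwork could be built like this:

    import tensorflow as tf
    from tf_agents.networks import encoding_network

    # Hypothetical observation spec for BipedalWalker-v3 (24 float features).
    observation_spec = tf.TensorSpec(shape=(24,), dtype=tf.float32)

    # Mirrors NafExperiment's default preprocessing_* arguments.
    preprocessing_net = encoding_network.EncodingNetwork(
        observation_spec,
        conv_layer_params=None,       # preprocessing_conv_layer_params
        conv_type='1d',               # preprocessing_conv_type
        fc_layer_params=(256,),       # preprocessing_fc_layer_params
        dropout_layer_params=None,    # preprocessing_dropout_layer_params
    )

    # The first call builds the variables; the output has shape (1, 256).
    encoded, _ = preprocessing_net(tf.zeros((1, 24)))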

 
Classes
       
experiments.off_p_tsb_experiment.OffPolicyTimeStepBasedExperiment(builtins.object)
    NafExperiment

 
class NafExperiment(experiments.off_p_tsb_experiment.OffPolicyTimeStepBasedExperiment)
    NafExperiment(root_dir='', num_eval_episodes=10, summary_interval=1000,
        env_name='BipedalWalker-v3', env_action_repeat_times=4,
        preprocessing_conv_layer_params=None, preprocessing_conv_type='1d',
        preprocessing_fc_layer_params=(256,), preprocessing_dropout_layer_params=None,
        v_network_fc_layer_params=(256,), v_network_dropout_layer_params=None,
        l_network_fc_layer_params=(256,), l_network_dropout_layer_params=None,
        policy_network_fc_layer_params=(256,), policy_network_dropout_layer_params=None,
        policy_network_uses_shared_preprocessing_network=True, learning_rate=0.0003,
        target_update_tau=0.005, target_update_period=1,
        td_errors_loss_fn=<function squared_difference>, gamma=0.99, noise_factor=0.1,
        debug_summaries=False, summarize_grads_and_vars=False,
        replay_buffer_capacity=1000000, initial_collect_steps=10000,
        collect_steps_per_iteration=1, use_tf_functions=True, sample_batch_size=256,
        num_iterations=3000000, train_steps_per_iteration=1, log_interval=1000,
        eval_interval=10000, train_checkpoint_interval=50000,
        policy_checkpoint_interval=50000, replay_buffer_checkpoint_interval=50000,
        name='default_naf_experiment')
 
A simple train and eval class for a NAF Agent.
 
 
Method resolution order:
NafExperiment
experiments.off_p_tsb_experiment.OffPolicyTimeStepBasedExperiment
builtins.object

Methods defined here:
__init__(self, root_dir='', num_eval_episodes=10, summary_interval=1000,
    env_name='BipedalWalker-v3', env_action_repeat_times=4,
    preprocessing_conv_layer_params=None, preprocessing_conv_type='1d',
    preprocessing_fc_layer_params=(256,), preprocessing_dropout_layer_params=None,
    v_network_fc_layer_params=(256,), v_network_dropout_layer_params=None,
    l_network_fc_layer_params=(256,), l_network_dropout_layer_params=None,
    policy_network_fc_layer_params=(256,), policy_network_dropout_layer_params=None,
    policy_network_uses_shared_preprocessing_network=True, learning_rate=0.0003,
    target_update_tau=0.005, target_update_period=1,
    td_errors_loss_fn=<function squared_difference>, gamma=0.99, noise_factor=0.1,
    debug_summaries=False, summarize_grads_and_vars=False,
    replay_buffer_capacity=1000000, initial_collect_steps=10000,
    collect_steps_per_iteration=1, use_tf_functions=True, sample_batch_size=256,
    num_iterations=3000000, train_steps_per_iteration=1, log_interval=1000,
    eval_interval=10000, train_checkpoint_interval=50000,
    policy_checkpoint_interval=50000, replay_buffer_checkpoint_interval=50000,
    name='default_naf_experiment')
Initialize self.  See help(type(self)) for accurate signature.
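As a minimal usage sketch (assuming the experiments package is on the import path; the output directory and experiment name below are hypothetical), only the parameters of interest need to be overridden, since every other argument keeps the default documented above:

    from experiments.naf_experiment import NafExperiment

    # All remaining hyperparameters keep their documented defaults, including
    # td_errors_loss_fn (the squared_difference function, presumably
    # tensorflow's tf.math.squared_difference).
    experiment = NafExperiment(
        root_dir='./runs/naf_bipedal',   # hypothetical output directory
        env_name='BipedalWalker-v3',
        num_iterations=1000000,
        learning_rate=0.0003,
        name='naf_bipedal_example',
    )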

Methods inherited from experiments.off_p_tsb_experiment.OffPolicyTimeStepBasedExperiment:
copy(self)
launch(self)

Data descriptors inherited from experiments.off_p_tsb_experiment.OffPolicyTimeStepBasedExperiment:
__dict__
dictionary for instance variables (if defined)
__weakref__
list of weak references to the object (if defined)
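A sketch of the intended workflow with the inherited methods, assuming launch() runs the train-and-eval loop announced in the module docstring and copy() returns an independent duplicate of the experiment configuration (the directory below is hypothetical):

    from experiments.naf_experiment import NafExperiment

    experiment = NafExperiment(root_dir='./runs/naf_bipedal')

    # Presumably runs collection, training and periodic evaluation,
    # writing summaries and checkpoints under root_dir.
    experiment.launch()

    # copy() presumably yields an independent clone of the configuration,
    # so the same setup can be relaunched without shared mutable state.
    experiment.copy().launch()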

 
Data
absolute_import = _Feature((2, 5, 0, 'alpha', 1), (3, 0, 0, 'alpha', 0), 16384)
division = _Feature((2, 2, 0, 'alpha', 2), (3, 0, 0, 'alpha', 0), 8192)
print_function = _Feature((2, 6, 0, 'alpha', 2), (3, 0, 0, 'alpha', 0), 65536)