deepbots.supervisor.wrappers
- class deepbots.supervisor.wrappers.keyboard_printer.KeyboardPrinter(*args: Any, **kwargs: Any)[source]
Bases: DeepbotsSupervisorEnv
- step(action)[source]
On each timestep, the agent chooses an action for the previous observation, state_t, and the environment returns the next observation, state_t+1, the reward and whether the episode is done or not.
Each of the values returned is produced by implementations of other abstract methods defined below.
observation: The next observation from the environment
reward: The amount of reward awarded on this step
is_done: Whether the episode is done
info: Diagnostic information, mostly useful for debugging
- Parameters
action – The agent’s action
- Returns
tuple, (observation, reward, is_done, info)
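The composition described above can be sketched without Webots. This is a minimal illustration, not the deepbots implementation: `SketchEnv`, `CounterEnv`, and `apply_action` are hypothetical names, and a real supervisor would advance the simulation where `apply_action` is called.

```python
from abc import ABC, abstractmethod


class SketchEnv(ABC):
    """Hypothetical stand-in that mirrors the step() contract described above."""

    def step(self, action):
        # Assumption: a real supervisor applies the action and advances
        # the simulation here before collecting the four return values.
        self.apply_action(action)
        return (
            self.get_observations(),
            self.get_reward(action),
            self.is_done(),
            self.get_info(),
        )

    @abstractmethod
    def apply_action(self, action):
        """Hypothetical hook that changes the environment state."""

    @abstractmethod
    def get_observations(self):
        """Return the next observation."""

    @abstractmethod
    def get_reward(self, action):
        """Return the reward for this step."""

    @abstractmethod
    def is_done(self):
        """Return True when the episode is over."""

    def get_info(self):
        # Optional diagnostics; empty by default in this sketch.
        return {}


class CounterEnv(SketchEnv):
    """Toy problem: the episode ends once the counter reaches 3."""

    def __init__(self):
        self.counter = 0

    def apply_action(self, action):
        self.counter += action

    def get_observations(self):
        return [self.counter]

    def get_reward(self, action):
        return 1.0 if action == 1 else 0.0

    def is_done(self):
        return self.counter >= 3


env = CounterEnv()
observation, reward, is_done, info = env.step(1)
print(observation, reward, is_done, info)
```

Because `step` only assembles the results of the other methods, a user subclass normally needs to implement those methods rather than `step` itself.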
- is_done()[source]
Used to inform the agent that the problem is solved.
This method is use-case specific and needs to be implemented by the user.
- Returns
bool, True if the episode is done
- get_observations()[source]
Return the observations of the robot. For example, metrics from sensors, a camera image, etc.
This method is use-case specific and needs to be implemented by the user.
- Returns
An object of observations
- get_reward(action)[source]
Calculates and returns the reward for this step.
This method is use-case specific and needs to be implemented by the user.
- Parameters
action – The agent’s action
- Returns
The amount of reward awarded on this step
- get_info()[source]
This method can be implemented to return any diagnostic information on each step, e.g. for debugging purposes.
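As an illustration of the use-case-specific methods above, the following sketch implements them for an invented one-dimensional task. The class `ReachTargetHooks`, its parameters, and the `apply_action` helper are hypothetical; in a real controller the observations would come from Webots sensors rather than a stored position.

```python
class ReachTargetHooks:
    """Hypothetical 1-D task: move along a line until a target is reached."""

    def __init__(self, target=5.0, tolerance=0.5):
        self.target = target
        self.tolerance = tolerance
        self.position = 0.0
        self.steps = 0

    def apply_action(self, action):
        # Stand-in for sending motor commands: the action is a displacement.
        self.position += float(action)
        self.steps += 1

    def get_observations(self):
        # Current position and signed distance left to the target.
        return [self.position, self.target - self.position]

    def get_reward(self, action):
        # Negative distance to the target, so getting closer scores higher.
        return -abs(self.target - self.position)

    def is_done(self):
        # The problem counts as solved once the robot is within tolerance.
        return abs(self.target - self.position) <= self.tolerance

    def get_info(self):
        # Diagnostic data only; the agent should not need it to learn.
        return {"steps": self.steps}


task = ReachTargetHooks()
task.apply_action(2.0)
print(task.get_observations(), task.get_reward(2.0), task.is_done(), task.get_info())
```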
- reset()[source]
Used to reset the world to an initial state.
Default, problem-agnostic implementation of the reset method, using Webots-provided methods.
Note that this works properly only with Webots versions newer than R2020b and must be overridden with a custom reset method when using earlier versions. It is backwards compatible because any reset method the user has already implemented overrides this one, so an old supervisor can be migrated easily to use this class.
- Returns
default observation provided by get_default_observation()
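A typical episode loop calls `reset()` once, uses the returned default observation as the first input to the agent, and then calls `step()` until the episode is done. The sketch below shows that pattern with a hypothetical `ResettableSketch` class and `run_episode` helper; the reset here only restores a Python attribute, whereas the default implementation described above relies on Webots.

```python
class ResettableSketch:
    """Hypothetical environment whose reset() returns a default observation."""

    def __init__(self):
        self.position = 0.0

    def get_default_observation(self):
        return [0.0]

    def reset(self):
        # Assumption: restoring this attribute stands in for resetting the world.
        self.position = 0.0
        return self.get_default_observation()

    def step(self, action):
        self.position += action
        is_done = self.position >= 3.0
        return [self.position], -1.0, is_done, {}


def run_episode(env, policy, max_steps=100):
    """Run one episode and return the total reward collected."""
    observation = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        observation, reward, is_done, info = env.step(policy(observation))
        total_reward += reward
        if is_done:
            break
    return total_reward


env = ResettableSketch()
print(run_episode(env, lambda observation: 1.0))
```

Running a second episode on the same object gives the same result only if `reset()` fully restores the initial state, which is a quick way to check a custom reset method.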
- class deepbots.supervisor.wrappers.tensorboard_wrapper.TensorboardLogger(*args: Any, **kwargs: Any)[source]
Bases: DeepbotsSupervisorEnv
- step(action)[source]
On each timestep, the agent chooses an action for the previous observation, state_t, and the environment returns the next observation, state_t+1, the reward and whether the episode is done or not.
Each of the values returned is produced by implementations of other abstract methods defined below.
observation: The next observation from the environment
reward: The amount of reward awarded on this step
is_done: Whether the episode is done
info: Diagnostic information, mostly useful for debugging
- Parameters
action – The agent’s action
- Returns
tuple, (observation, reward, is_done, info)
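One way a logging wrapper can expose this same `step` interface is to forward the call to a wrapped environment and record values on the way back. The sketch below is an assumption about the general pattern, not the `TensorboardLogger` implementation: `RewardRecorder` and `ConstantEnv` are hypothetical, and rewards are stored in a list where a real logger would presumably write them to TensorBoard.

```python
class ConstantEnv:
    """Hypothetical environment that always returns the same step result."""

    def reset(self):
        return [0.0]

    def step(self, action):
        return [0.0], 1.0, False, {}


class RewardRecorder:
    """Hypothetical wrapper: forwards calls and records each reward."""

    def __init__(self, env):
        self.env = env
        self.rewards = []

    def step(self, action):
        observation, reward, is_done, info = self.env.step(action)
        # Stand-in for logging: keep the reward in memory.
        self.rewards.append(reward)
        return observation, reward, is_done, info

    def __getattr__(self, name):
        # Only called for attributes not found on the wrapper itself,
        # so reset() and the other methods fall through to the wrapped env.
        return getattr(self.env, name)


wrapped = RewardRecorder(ConstantEnv())
wrapped.reset()
wrapped.step(0)
wrapped.step(0)
print(wrapped.rewards)
```

Since the wrapper returns the same tuple as the environment it wraps, agent code written against `step` does not need to change when logging is added.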
- is_done()[source]
Used to inform the agent that the problem is solved.
This method is use-case specific and needs to be implemented by the user.
- Returns
bool, True if the episode is done
- get_observations()[source]
Return the observations of the robot. For example, metrics from sensors, a camera image, etc.
This method is use-case specific and needs to be implemented by the user.
- Returns
An object of observations
- get_reward(action)[source]
Calculates and returns the reward for this step.
This method is use-case specific and needs to be implemented by the user.
- Parameters
action – The agent’s action
- Returns
The amount of reward awarded on this step
- get_info()[source]
This method can be implemented to return any diagnostic information on each step, e.g. for debugging purposes.
- reset()[source]
Used to reset the world to an initial state.
Default, problem-agnostic implementation of the reset method, using Webots-provided methods.
Note that this works properly only with Webots versions newer than R2020b and must be overridden with a custom reset method when using earlier versions. It is backwards compatible because any reset method the user has already implemented overrides this one, so an old supervisor can be migrated easily to use this class.
- Returns
default observation provided by get_default_observation()