Utils

This notebook gathers the helper functions used throughout the library.

Helpers for the environments


isBetween_c_Vec

 isBetween_c_Vec (a, b, c, r)

Checks whether any of the target positions in c is crossed, within radius r, by the segment joining points a and b.

|         | Type                    | Details                               |
|---------|-------------------------|---------------------------------------|
| a       | tensor, shape = (1,2)   | Previous position.                    |
| b       | tensor, shape = (1,2)   | Current position.                     |
| c       | tensor, shape = (Nt,2)  | Positions of all targets.             |
| r       | int/float               | Target radius.                        |
| Returns | array of boolean values | True at the indices of found targets. |
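
A minimal usage sketch is shown below. The import path is an assumption (the helpers are assumed to live in the package's utils module), and plain NumPy arrays stand in for the tensors listed above:

```python
import numpy as np

from rl_opts.utils import isBetween_c_Vec  # hypothetical import path

a = np.array([[0.0, 0.0]])  # previous position, shape (1, 2)
b = np.array([[1.0, 0.0]])  # current position, shape (1, 2)
c = np.array([[0.5, 0.05],  # target lying close to the path
              [3.0, 3.0]])  # target far from the path; shape (Nt, 2)

found = isBetween_c_Vec(a, b, c, r=0.1)
print(found)  # expected: [ True False ] -- True where a target was crossed
```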

coord_mod

 coord_mod (coord1, coord2, mod)

Computes the distance difference between two coordinates in a world of size mod with periodic boundary conditions.

|         | Type                                         | Details                                                      |
|---------|----------------------------------------------|--------------------------------------------------------------|
| coord1  | value, np.array, tensor (can be shape=(n,1)) | First coordinate.                                            |
| coord2  | np.array, tensor, shape=(1,1)                | Second coordinate, subtracted from coord1.                   |
| mod     | int                                          | World size.                                                  |
| Returns | float                                        | Distance difference (with correct sign, not absolute value). |
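
As a quick illustration of the periodic wrapping: in a world of size 100, the signed separation between coordinates 2 and 98 is +4 across the boundary, not -96. A minimal sketch (the import path is an assumption):

```python
import numpy as np

from rl_opts.utils import coord_mod  # hypothetical import path

L = 100                     # world size
coord1 = np.array([[2.0]])  # first coordinate
coord2 = np.array([[98.0]]) # second coordinate, subtracted from coord1

# Plain subtraction gives -96, but across the periodic boundary the
# signed separation is +4.
print(coord_mod(coord1, coord2, L))
```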

isBetween_c_Vec_nAgents

 isBetween_c_Vec_nAgents (a, b, c, r)

Vectorized version of isBetween_c_Vec: for each of the n agents, checks whether any of the target positions in c is crossed, within radius r, by the segment joining the agent's previous and current positions.

|         | Type                                     | Details                               |
|---------|------------------------------------------|---------------------------------------|
| a       | array, shape = (n,2)                     | Previous position of all n agents.    |
| b       | array, shape = (n,2)                     | Current position of all n agents.     |
| c       | array, shape = (Nt,2)                    | Positions of all targets.             |
| r       | float                                    | Target radius.                        |
| Returns | array of boolean values, shape = (Nt, n) | True at the indices of found targets. |
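
A sketch analogous to the single-agent case, again under the hypothetical import path:

```python
import numpy as np

from rl_opts.utils import isBetween_c_Vec_nAgents  # hypothetical import path

a = np.zeros((3, 2))          # previous positions of n = 3 agents
b = a + np.array([1.0, 0.0])  # each agent steps one unit along x
c = np.array([[0.5, 0.0],     # target on the agents' path
              [5.0, 5.0]])    # target away from it; shape (Nt, 2)

found = isBetween_c_Vec_nAgents(a, b, c, 0.2)
print(found.shape)  # (Nt, n) = (2, 3)
```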

get_encounters

 get_encounters (agent_previous_pos, agent_pos, target_positions, L, r)

Checks whether each agent finds a target while taking its current step. Vectorized to run several agents in parallel.

|                    | Type                                     | Details                                          |
|--------------------|------------------------------------------|--------------------------------------------------|
| agent_previous_pos | array, shape = (n,2)                     | Position of the n agents before taking the step. |
| agent_pos          | array, shape = (n,2)                     | Position of the n agents.                        |
| target_positions   | array, shape = (Nt,2)                    | Positions of the targets.                        |
| L                  | int                                      | World size.                                      |
| r                  | float                                    | Radius of the targets.                           |
| Returns            | array of boolean values, shape = (Nt, n) | True at the indices of found targets.            |
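
A minimal sketch of the expected call, under the same import-path assumption as above:

```python
import numpy as np

from rl_opts.utils import get_encounters  # hypothetical import path

agent_previous_pos = np.zeros((3, 2))                  # n = 3 agents
agent_pos = agent_previous_pos + np.array([1.0, 0.0])  # one step along x
target_positions = np.array([[0.5, 0.0], [5.0, 5.0]])  # Nt = 2 targets

hits = get_encounters(agent_previous_pos, agent_pos, target_positions,
                      L=10, r=0.2)
print(hits.shape)  # (Nt, n) = (2, 3)
```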

Loading data


get_config

 get_config (config, config_path='configurations/learning/')

Gets the configuration file for the given experiment and config name (e.g. exp_0).

|             | Type | Default                  | Details                                                     |
|-------------|------|--------------------------|-------------------------------------------------------------|
| config      | str  |                          | Config name (e.g. exp_0).                                   |
| config_path | str  | configurations/learning/ | Path to the configurations folder.                          |
| Returns     | dict |                          | Dictionary with the parameters of the loaded configuration. |
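
A minimal sketch of loading a configuration, assuming the hypothetical import path; the folder contents depend on your setup:

```python
from rl_opts.utils import get_config  # hypothetical import path

# Loads the configuration named 'exp_0' from the default
# 'configurations/learning/' folder and returns it as a dict.
params = get_config('exp_0')
print(params)
```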

get_policy

 get_policy (results_path, agent, episode)

Gets the policy of an agent at a given episode.

|              | Type | Details                                            |
|--------------|------|----------------------------------------------------|
| results_path | str  | Path of the folder from which to extract the data. |
| agent        | int  | Agent index.                                       |
| episode      | int  | Episode.                                           |
| Returns      | list | Policy.                                            |
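
A sketch of retrieving a stored policy; the results folder name is illustrative and the import path is an assumption:

```python
from rl_opts.utils import get_policy  # hypothetical import path

# Policy of agent 0 at episode 1000; 'results/exp_0/' is a placeholder.
policy = get_policy('results/exp_0/', agent=0, episode=1000)
```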

get_performance

 get_performance (results_path, agent_list, episode_list)

Extracts the efficiencies obtained in the post-learning analysis.

|              | Type                                                  | Details                                                                     |
|--------------|-------------------------------------------------------|-----------------------------------------------------------------------------|
| results_path | str                                                   | Path of the folder from which to extract the data.                          |
| agent_list   | list                                                  | List with the agent indices.                                                |
| episode_list | list                                                  | List with the episodes.                                                     |
| Returns      | np.array, shape=(len(agent_list), len(episode_list)) | Average performances obtained by the agents in the post-learning analysis.  |
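
A sketch of the expected call and output shape, under the same assumptions as above:

```python
from rl_opts.utils import get_performance  # hypothetical import path

agent_list = [0, 1, 2]
episode_list = [0, 500, 1000]

# One row per agent, one column per episode; the folder is a placeholder.
perf = get_performance('results/exp_0/', agent_list, episode_list)
print(perf.shape)  # (len(agent_list), len(episode_list)) = (3, 3)
```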

get_opt

 get_opt (path, df)

Gets the highest efficiency obtained by the benchmark models, together with the corresponding parameters.

|         | Type             | Details                                                     |
|---------|------------------|-------------------------------------------------------------|
| path    | str              | Path from which to get the data.                            |
| df      | pandas DataFrame | DataFrame with the results from the optimization with Tune. |
| Returns | list             | Efficiency of each walk.                                    |
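
A sketch of the expected call; the CSV file name is illustrative and the import path is an assumption:

```python
import pandas as pd

from rl_opts.utils import get_opt  # hypothetical import path

# 'tune_results.csv' is a placeholder; df should hold the Tune
# optimization results for the benchmark models.
df = pd.read_csv('results/benchmark/tune_results.csv')
efficiencies = get_opt('results/benchmark/', df)
```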