Reinforcement learning environments
This notebook gathers the functions that create different kinds of environments for foraging and target search in various scenarios, adapted for use in reinforcement learning.
Helpers
isBetween
isBetween_c_Vec_numba
isBetween_c_Vec_numba (a, b, c, r)
Checks which of the points in c are crossed by the line segment joining points a and b.
Name | Type | Details |
---|---|---|
a | tensor, shape = (1,2) | Previous position. |
b | tensor, shape = (1,2) | Current position. |
c | tensor, shape = (Nt,2) | Positions of all targets. |
r | int/float | Target radius. |
Returns | array of boolean values | True at the indices of found targets. |
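As an illustration of the kind of test described above, here is a minimal NumPy sketch that flags every target lying within a distance r of the segment from a to b. The name is_between_sketch and the projection-based distance test are assumptions made for this example; the library function may use a different geometric criterion.

```python
import numpy as np

def is_between_sketch(a, b, c, r):
    """Return a boolean array: True where a target in c lies within r of segment a-b."""
    a = np.asarray(a, dtype=float).reshape(2)
    b = np.asarray(b, dtype=float).reshape(2)
    c = np.asarray(c, dtype=float).reshape(-1, 2)
    ab = b - a
    seg_len2 = ab @ ab
    if seg_len2 == 0.0:
        # Degenerate segment: the agent did not move.
        closest = np.broadcast_to(a, c.shape)
    else:
        # Projection parameter of each target onto the segment, clipped to [0, 1].
        t = np.clip((c - a) @ ab / seg_len2, 0.0, 1.0)
        closest = a + t[:, None] * ab
    # A target counts as found if its closest point on the segment is within r.
    return np.linalg.norm(c - closest, axis=1) <= r

targets = np.array([[0.5, 0.0], [0.5, 0.2], [2.0, 0.0]])
found = is_between_sketch([0.0, 0.0], [1.0, 0.0], targets, r=0.1)
```

Only the first target is flagged here: the second is too far from the segment and the third lies beyond its end point.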
Timing of isBetween_c_Vec_numba:
4.65 μs ± 25.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
For comparison, the previous implementation:
from rl_opts.utils import isBetween_c_Vec as oldbetween
40.4 μs ± 177 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
Pareto sampling
pareto_sample
pareto_sample (alpha, xm, size=1)
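A Pareto sampler can be written with inverse transform sampling. The sketch below assumes that alpha is the shape (tail index) and xm the minimum value of a standard Pareto type I distribution; the function name and the optional rng argument are hypothetical, and the library may parametrize the distribution differently.

```python
import numpy as np

def pareto_sample_sketch(alpha, xm, size=1, rng=None):
    """Draw Pareto(alpha, xm) samples by inverting the CDF F(x) = 1 - (xm / x)**alpha."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.random(size)  # uniform in [0, 1)
    # 1 - u lies in (0, 1], so every sample is >= xm.
    return xm * (1.0 - u) ** (-1.0 / alpha)

samples = pareto_sample_sketch(alpha=2.0, xm=1.0, size=10_000,
                               rng=np.random.default_rng(0))
```

Under these assumptions the median of the samples should be close to xm * 2**(1/alpha).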
Random sampling from an array with given probabilities
rand_choice_nb
rand_choice_nb (arr, prob)
:param arr: A 1D numpy array of values to sample from.
:param prob: A 1D numpy array of probabilities for the given samples.
:return: A random sample from the given array with a given probability.
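One common way to draw a weighted sample without np.random.choice is to invert the cumulative distribution with a binary search. The following is a sketch of that technique in plain NumPy; the function name and the rng argument are assumptions, and it is not necessarily how rand_choice_nb is implemented.

```python
import numpy as np

def rand_choice_sketch(arr, prob, rng=None):
    """Return one element of arr, drawn with the probabilities given in prob."""
    rng = np.random.default_rng() if rng is None else rng
    cdf = np.cumsum(prob)
    # Scale by cdf[-1] so that slightly unnormalized probabilities still work.
    u = rng.random() * cdf[-1]
    idx = np.searchsorted(cdf, u, side="right")
    # Guard against a floating-point edge case where u rounds up to cdf[-1].
    return arr[min(idx, len(arr) - 1)]

values = np.array([10, 20, 30])
draws = [rand_choice_sketch(values, np.array([0.0, 1.0, 0.0]),
                            np.random.default_rng(i)) for i in range(5)]
```

With all the probability mass on the second entry, every draw returns 20.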
TargetEnv
TargetEnv
TargetEnv (*args, **kwargs)
Class defining a foraging environment with multiple targets and two actions: continue in the same direction and turn by a random angle.
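To make the two-action setting concrete, here is a toy environment written for this page. Everything in it is hypothetical (the class name ToyTargetEnv, the periodic boundaries, the default parameters and the reward definition); it only illustrates the idea and does not reproduce the TargetEnv class.

```python
import numpy as np

class ToyTargetEnv:
    """Hypothetical 2D foraging world with periodic boundaries (not the library class)."""

    def __init__(self, n_targets=20, world_size=10.0, radius=0.5, step=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        self.L, self.radius, self.step_len = world_size, radius, step
        self.targets = self.rng.random((n_targets, 2)) * world_size
        self.pos = np.full(2, world_size / 2)
        self.angle = 0.0

    def step(self, action):
        """action 0: continue in the same direction; action 1: turn by a random angle."""
        if action == 1:
            self.angle = self.rng.uniform(0.0, 2.0 * np.pi)
        move = self.step_len * np.array([np.cos(self.angle), np.sin(self.angle)])
        self.pos = (self.pos + move) % self.L
        # Reward 1.0 if any target is within `radius` of the new position.
        dists = np.linalg.norm(self.targets - self.pos, axis=1)
        return float(np.any(dists <= self.radius))

env = ToyTargetEnv()
rewards = [env.step(a) for a in (0, 0, 1, 0)]
```

For simplicity the toy version only checks the end point of each step, whereas a segment test such as isBetween_c_Vec_numba would also catch targets crossed along the way.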
ResetEnv
Search loop with a fixed policy in an arbitrary environment
reset_search_loop
reset_search_loop (T, reset_policy, env)
Loop that runs the reset environment with a given reset policy.
Name | Details |
---|---|
T | Number of steps |
reset_policy | Reset policy |
env | Environment |
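The loop can be pictured as follows. This sketch assumes that reset_policy is an array whose k-th entry is the probability of resetting k steps after the last restart, and that the environment exposes a step(reset) method returning 1 when the target is reached; both are assumptions, and ToyResetEnv1D is a hypothetical stand-in for a real environment.

```python
import numpy as np

class ToyResetEnv1D:
    """Hypothetical 1D diffusion with a target at distance L from the origin."""

    def __init__(self, L=5.0, D=0.5, seed=0):
        self.L, self.D = L, D
        self.rng = np.random.default_rng(seed)
        self.x = 0.0

    def step(self, reset):
        """Reset to the origin or take one diffusive step; return 1 on reaching the target."""
        if reset:
            self.x = 0.0
            return 0
        self.x += np.sqrt(2.0 * self.D) * self.rng.standard_normal()
        if self.x >= self.L:
            self.x = 0.0  # restart the search after a success
            return 1
        return 0

def reset_search_loop_sketch(T, reset_policy, env, seed=1):
    """Run T steps; reset_policy[k] is the assumed reset probability at counter k."""
    rng = np.random.default_rng(seed)
    rewards = np.zeros(T, dtype=np.int64)
    counter = 0
    for t in range(T):
        p_reset = reset_policy[min(counter, len(reset_policy) - 1)]
        reset = rng.random() < p_reset
        rewards[t] = env.step(reset)
        # The counter restarts after a reset or after finding the target.
        counter = 0 if (reset or rewards[t] == 1) else counter + 1
    return rewards

policy = np.zeros(50)
policy[-1] = 1.0  # reset with certainty once the counter reaches 49
rewards = reset_search_loop_sketch(2000, policy, ToyResetEnv1D())
```

The returned array marks the steps at which the target was reached, so its mean is the search efficiency of the chosen policy in this toy setting.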
1D
ResetEnv_1D
ResetEnv_1D (*args, **kwargs)
Parallel search loops for Reset 1D
parallel_Reset1D_exp
parallel_Reset1D_exp (T, rates, L, D)
Runs the Reset 1D loop in parallel for different exponential resetting rates.
parallel_Reset1D_sharp
parallel_Reset1D_sharp (T, resets, L, D)
Runs the Reset 1D loop in parallel for different sharp resetting times.
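The difference between exponential and sharp resetting can be shown with per-step reset probabilities. In this sketch an exponential rate is read as a constant probability of resetting at every step, and a sharp resetting time as a deterministic reset once that many steps have passed. The helper names and this discrete-time reading of the rates are assumptions, not the library's definitions.

```python
import numpy as np

def exp_policy(rate, max_t):
    """Constant per-step reset probability (memoryless waiting times)."""
    return np.full(max_t, rate)

def sharp_policy(reset_time, max_t):
    """Never reset before `reset_time` steps, then reset with certainty."""
    policy = np.zeros(max_t)
    policy[reset_time:] = 1.0
    return policy

def mean_time_between_resets(policy, n_runs=20000, seed=0):
    """Average number of steps taken before a reset occurs under the given policy."""
    rng = np.random.default_rng(seed)
    times = np.empty(n_runs)
    for i in range(n_runs):
        k = 0
        # Keep stepping while the reset draw fails at the current counter value.
        while rng.random() >= policy[min(k, len(policy) - 1)]:
            k += 1
        times[i] = k
    return times.mean()

sharp_mean = mean_time_between_resets(sharp_policy(9, 20))
exp_mean = mean_time_between_resets(exp_policy(0.1, 20))
```

Both policies give an average of about 9 steps between resets, but the sharp one always waits exactly 9 steps while the exponential one produces widely spread waiting times.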
2D
ResetEnv_2D
ResetEnv_2D (*args, **kwargs)
Parallel search loops for Reset 2D
parallel_Reset2D_policies
parallel_Reset2D_policies (T, reset_policies, dist_target, radius_target, D)
parallel_Reset2D_exp
parallel_Reset2D_exp (T, rates, dist_target, radius_target, D)
parallel_Reset2D_sharp
parallel_Reset2D_sharp (T, resets, dist_target, radius_target, D)
TurnResetEnv
Only the 2D case is considered.
TurnResetEnv_2D
TurnResetEnv_2D (*args, **kwargs)
Class defining a foraging environment with a single target and three possible actions:
- Continue in the same direction
- Turn by a random angle
- Reset to the origin
The agent makes steps of constant length given by agent_step.
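The three actions can be sketched as a single step function. The action encoding, the function name and the choices that a turn is followed by a step in the new direction and that a reset keeps the current heading are all assumptions made for this example; only agent_step and the three actions come from the description above.

```python
import numpy as np

CONTINUE, TURN, RESET = 0, 1, 2  # hypothetical action encoding

def turn_reset_step(pos, angle, action, agent_step, target, target_radius, rng):
    """Apply one action and return (new_pos, new_angle, reward)."""
    if action == RESET:
        # Back to the origin, no displacement and no reward on this step.
        return np.zeros(2), angle, 0
    if action == TURN:
        angle = rng.uniform(0.0, 2.0 * np.pi)
    # Steps always have the same length, agent_step.
    pos = pos + agent_step * np.array([np.cos(angle), np.sin(angle)])
    reward = int(np.linalg.norm(pos - target) <= target_radius)
    return pos, angle, reward

rng = np.random.default_rng(0)
pos, angle = np.zeros(2), 0.0
target = np.array([3.0, 0.0])
for _ in range(3):
    pos, angle, reward = turn_reset_step(pos, angle, CONTINUE, 1.0, target, 0.5, rng)
```

Three straight steps of length 1 along the x axis bring the agent onto the target placed at (3, 0).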
Search loop with fixed policy
search_loop_turn_reset_sharp
search_loop_turn_reset_sharp (T, reset, turn, env)
Runs a search loop of T steps. There is a single counter that works as follows:
- It starts at 0.
- It increases by 1 for each turn or continue action.
- It is set back to 0 when the agent resets or reaches the target.
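The counter rules above can be sketched in a few lines. The update function follows the three rules directly. The sharp policy is an assumption about how the reset and turn arguments are used (reset when the counter equals reset, turn when it equals turn, continue otherwise), and the action labels are hypothetical.

```python
CONTINUE, TURN, RESET = 0, 1, 2  # hypothetical action labels

def update_counter(counter, action, reached_target):
    """Counter rule: back to 0 on reset or success, otherwise +1."""
    if action == RESET or reached_target:
        return 0
    return counter + 1

def sharp_action(counter, reset, turn):
    """Assumed sharp policy: reset at counter == reset, turn at counter == turn."""
    if counter == reset:
        return RESET
    if counter == turn:
        return TURN
    return CONTINUE

counter, history = 0, []
for _ in range(8):
    action = sharp_action(counter, reset=5, turn=2)
    history.append((counter, action))
    counter = update_counter(counter, action, reached_target=False)
```

In this run the agent turns when the counter is 2, resets when it is 5, and the counter then starts again from 0.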