RL-OptS
Reinforcement Learning of Optimal Search strategies
This library provides the tools needed to study, replicate and develop reinforcement learning agents for target search problems, as well as benchmark baselines with which to compare them. The library is based on three different publications:
“Optimal foraging strategies can be learned” by G. Muñoz-Gil, A. López-Incera, L. J. Fiderer and H. J. Briegel (2024). Here we developed agents able to learn how to forage efficiently in environments with multiple targets.
“Learning how to find targets in the micro-world: the case of intermittent active Brownian particles” by M. Caraglio, H. Kaur, L. Fiderer, A. López-Incera, H. J. Briegel, T. Franosch, and G. Muñoz-Gil (2024). In this case, we study the ability of agents to learn how to switch from passive to active diffusion to enhance their target search efficiency.
“Learning to reset in target search problems” by G. Muñoz-Gil, H. J. Briegel and M. Caraglio (2025). Here we extended the agents to be able to reset to the origin, a feature that has revolutionized target search problems in recent years.
Installation
You can access all these tools by installing the Python package rl_opts via PyPI:
```
pip install rl-opts
```
You can also clone the source repository and execute the following in the parent folder where you cloned the repo:
```
pip install -e rl_opts
```
This will install both the library and the necessary packages.
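As a quick sanity check, you can try importing the package (the import name `rl_opts` is assumed from the package name above):

```python
# Verify the installation by importing the package.
# The import name rl_opts is assumed from the package name above.
import rl_opts
print("rl_opts imported from:", rl_opts.__file__)
```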
Tutorials
We have prepared a series of tutorials to guide you through the most important functionalities of the package. You can find them in the Tutorials folder of the Github repository or in the Tutorials tab of our webpage, with notebooks that will help you navigate the package as well as reproduce the results of our papers via minimal examples. In particular, we have four tutorials:
- Learning to forage with RL: shows how to train an RL agent based on Projective Simulation to search for targets in environments with randomly distributed targets, as the ones considered in our paper (see the minimal sketch after this list).
- Learning to reset in target search problems: shows how to train an RL agent similar to the previous one, but with the ability to reset to the origin, an action that is learned alongside its spatial dynamics.
- Imitation learning: shows how to train an RL agent to imitate the policy of an expert equipped with a pre-trained policy. The latter is based on the benchmark strategies common in the literature.
- Foraging benchmarks: beyond Lévy walks: shows how to launch various benchmark strategies with which to compare the trained RL agents.
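To give a flavour of the learning framework behind these tutorials, below is a minimal, self-contained sketch of a Projective Simulation agent in a toy foraging task. It deliberately does not use the rl_opts API: the percept encoding (steps walked since the last turn), the toy environment and all parameter names are illustrative assumptions, simplified with respect to the setups studied in the papers.

```python
import numpy as np

# Minimal Projective Simulation (PS) agent for a toy foraging task.
# This is NOT the rl_opts API; states, environment and parameters are
# illustrative assumptions only.

rng = np.random.default_rng(0)

NUM_STATES = 100        # percepts: steps walked since the last turn (capped)
NUM_ACTIONS = 2         # 0 = continue straight, 1 = turn (reorient)
GAMMA = 1e-4            # h-matrix damping (forgetting)
ETA = 0.1               # glow damping

h = np.ones((NUM_STATES, NUM_ACTIONS))   # h-matrix: unnormalized policy
g = np.zeros_like(h)                     # glow matrix: eligibility traces

def policy(state):
    """PS policy: action probabilities proportional to h-values."""
    p = h[state] / h[state].sum()
    return rng.choice(NUM_ACTIONS, p=p)

def update(reward):
    """Standard PS update: damp h towards 1, reinforce glowed pairs."""
    global h
    h = h - GAMMA * (h - 1.0) + reward * g

def toy_episode(num_steps=1000, target_prob=0.01):
    """Toy environment: at every step a target is found with a fixed
    probability; a found target gives reward 1 and resets the counter."""
    global g
    g[:] = 0.0
    state, total_reward = 0, 0.0
    for _ in range(num_steps):
        action = policy(state)
        g *= (1.0 - ETA)          # damp previous glow
        g[state, action] += 1.0   # mark the used percept-action pair
        reward = 1.0 if rng.random() < target_prob else 0.0
        update(reward)
        total_reward += reward
        # continuing increments the counter; turning or finding a target resets it
        state = 0 if (action == 1 or reward > 0) else min(state + 1, NUM_STATES - 1)
    return total_reward

if __name__ == "__main__":
    for ep in range(10):
        print(f"episode {ep}: reward = {toy_episode():.0f}")
```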
Cite
If any of the previous material was useful to you, we kindly ask you to cite us. You can either cite this library:
```
@software{rlopts,
  author    = {Mu\~noz-Gil, Gorka and L\'opez-Incera, Andrea and Caraglio, Michele and Fiderer, Lukas J. and Briegel, Hans J.},
  title     = {\uppercase{RL}-\uppercase{O}pt\uppercase{S}: Reinforcement Learning of Optimal Search Strategies},
  month     = jan,
  year      = 2024,
  publisher = {Zenodo},
  version   = {v1.0},
  doi       = {10.5281/zenodo.10450489},
  url       = {https://doi.org/10.5281/zenodo.7727873}
}
```
or the works it’s based on:
```
@article{munoz2024optimal,
  title     = {Optimal foraging strategies can be learned},
  author    = {Mu{\~n}oz-Gil, Gorka and L{\'o}pez-Incera, Andrea and Fiderer, Lukas J and Briegel, Hans J},
  journal   = {New Journal of Physics},
  volume    = {26},
  number    = {1},
  pages     = {013010},
  year      = {2024},
  publisher = {IOP Publishing}
}
```
```
@misc{munoz2025learning,
  title         = {Learning to reset in target search problems},
  author        = {Gorka Muñoz-Gil and Hans J. Briegel and Michele Caraglio},
  year          = {2025},
  eprint        = {2503.11330},
  archivePrefix = {arXiv},
  primaryClass  = {cond-mat.stat-mech},
  url           = {https://arxiv.org/abs/2503.11330}
}
```