doi.org/10.5281/zenodo.8123988

Preview meta tags from the doi.org website.

Linked Hostnames: 17

Search Engine Appearance

Google

https://doi.org/10.5281/zenodo.8123988

DLR-RM/stable-baselines3: v2.7.0: n-step returns for all off-policy algorithms via the `n_steps` argument

SB3 Contrib (more algorithms): https://github.com/Stable-Baselines-Team/stable-baselines3-contrib
RL Zoo3 (training framework): https://github.com/DLR-RM/rl-baselines3-zoo
Stable-Baselines Jax (SBX): https://github.com/araffin/sbx

To upgrade:

    pip install stable_baselines3 sb3_contrib rl_zoo3 --upgrade

New Features:
  • Added support for n-step returns for all off-policy algorithms via the n_steps parameter:

        from stable_baselines3 import SAC

        # SAC with n-step returns
        model = SAC("MlpPolicy", "Pendulum-v1", n_steps=3, verbose=1)
        model.learn(10_000)

  • Added NStepReplayBuffer, which computes n-step returns without additional memory requirements (and without for loops); a sketch of the computation follows these notes
  • Added Gymnasium v1.2 support

Bug Fixes:
  • Fixed the Docker GPU image (PyTorch GPU was not installed)
  • Fixed segmentation faults caused by non-portable schedules during model loading (@akanto)

SB3-Contrib:
  • Added support for n-step returns for off-policy algorithms via the n_steps parameter
  • Use the FloatSchedule and LinearSchedule classes instead of lambdas in the ARS, PPO, and QRDQN implementations to improve model portability across operating systems (see the pickling illustration after these notes)

RL Zoo:
  • linear_schedule now returns a SimpleLinearSchedule object for better portability
  • Renamed LunarLander-v2 to LunarLander-v3 in hyperparameters
  • Renamed CarRacing-v2 to CarRacing-v3 in hyperparameters
  • Docker GPU images are now working again
  • Use ConstantSchedule and SimpleLinearSchedule instead of constant_fn and linear_schedule
  • Fixed CarRacing-v3 hyperparameters for newer Gymnasium versions

SBX (SB3 + Jax):
  • Added support for n-step returns for off-policy algorithms via the n_steps parameter
  • Added a KL-adaptive learning rate for PPO and learning-rate schedules for SAC/TQC

Deprecations:
  • get_schedule_fn(), get_linear_fn(), and constant_fn() are deprecated; please use FloatSchedule(), LinearSchedule(), and ConstantSchedule() instead

Documentation:
  • Clarified the evaluate_policy documentation
  • Added documentation about training exceeding the total_timesteps parameter
  • Updated LunarLander and LunarLanderContinuous environment versions to v3 (@j0m0k0)
  • Added sb3-extra-buffers to the project page (@Trenza1ore)

New Contributors:
  • @akanto made their first contribution in https://github.com/DLR-RM/stable-baselines3/pull/2125
  • @omahs made their first contribution in https://github.com/DLR-RM/stable-baselines3/pull/2140
  • @j0m0k0 made their first contribution in https://github.com/DLR-RM/stable-baselines3/pull/2143
  • @leopardracer made their first contribution in https://github.com/DLR-RM/stable-baselines3/pull/2147
  • @Trenza1ore made their first contribution in https://github.com/DLR-RM/stable-baselines3/pull/2157

Full Changelog: https://github.com/DLR-RM/stable-baselines3/compare/v2.6.0...v2.7.0
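
For context on the headline feature: an n-step return replaces the one-step TD target r_0 + γ·V(s_1) with a sum of n discounted rewards plus a bootstrap value n steps ahead, truncated at episode boundaries. Below is a minimal sketch of that computation, written with an explicit loop for clarity; the actual NStepReplayBuffer is vectorized, and the function name and layout here are illustrative, not SB3 API.

    import numpy as np

    def n_step_target(rewards, dones, bootstrap_value, gamma=0.99, n_steps=3):
        """Illustrative n-step TD target for one transition:
        G = r_0 + gamma*r_1 + ... + gamma**(n-1)*r_(n-1) + gamma**n * V(s_n),
        with the sum (and the bootstrap) cut off when an episode ends."""
        target, discount = 0.0, 1.0
        for k in range(n_steps):
            target += discount * rewards[k]
            discount *= gamma
            if dones[k]:  # episode terminated: do not bootstrap past the end
                return target
        return target + discount * bootstrap_value

    # With n_steps=3: 1.0 + 0.99*0.5 + 0.99**2*(-0.2) + 0.99**3*2.0
    print(n_step_target(np.array([1.0, 0.5, -0.2]),
                        np.array([False, False, False]),
                        bootstrap_value=2.0))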
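
Why the schedule classes matter for portability: Python lambdas cannot be serialized by the standard pickle module, so a model saved with a lambda learning-rate schedule can fail to load in another process or on another operating system. The snippet below illustrates the idea behind classes like FloatSchedule and LinearSchedule; MyLinearSchedule is a hypothetical stand-in, not the SB3 implementation.

    import pickle

    # A lambda schedule works in-process but cannot be pickled,
    # which breaks save/load portability.
    lr_lambda = lambda progress_remaining: 3e-4 * progress_remaining
    try:
        pickle.dumps(lr_lambda)
    except (pickle.PicklingError, AttributeError) as err:
        print("lambda is not picklable:", err)

    # A callable class with the same behavior pickles cleanly.
    class MyLinearSchedule:
        def __init__(self, initial_value: float):
            self.initial_value = initial_value

        def __call__(self, progress_remaining: float) -> float:
            # SB3 schedules map remaining progress (1.0 -> 0.0) to a value.
            return self.initial_value * progress_remaining

    data = pickle.dumps(MyLinearSchedule(3e-4))  # works
    print(pickle.loads(data)(0.5))               # 0.00015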
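
On the SBX side, a KL-adaptive learning rate adjusts the step size based on the observed KL divergence between the old and new policy after each update. The release notes do not spell out the rule, so the following is a common heuristic shown for illustration only; the function name and thresholds are assumptions, not the SBX implementation.

    def kl_adaptive_lr(lr: float, observed_kl: float, target_kl: float,
                       factor: float = 1.5, min_lr: float = 1e-6,
                       max_lr: float = 1e-2) -> float:
        """Shrink the learning rate when the policy update overshoots the
        KL target; grow it when the update stays far below the target."""
        if observed_kl > 2.0 * target_kl:
            lr = max(lr / factor, min_lr)
        elif observed_kl < 0.5 * target_kl:
            lr = min(lr * factor, max_lr)
        return lr

    # After each PPO update, feed the measured KL back into the next LR.
    lr = 3e-4
    lr = kl_adaptive_lr(lr, observed_kl=0.05, target_kl=0.01)  # overshoot: lr shrinks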



Bing

Shows the same title, URL, and description as the Google appearance above.

DuckDuckGo

Shows the same title, URL, and description as the Google appearance above.

  • General Meta Tags (41)
    • title
      DLR-RM/stable-baselines3: v2.7.0: n-step returns for all off-policy algorithms via the `n_steps` argument
    • charset
      utf-8
    • X-UA-Compatible
      IE=edge
    • viewport
      width=device-width, initial-scale=1
    • google-site-verification
      5fPGCLllnWrvFxH9QWI0l1TadV7byeEvfPcyK2VkS_s
  • Open Graph Meta Tags (4)
    • og:title
      DLR-RM/stable-baselines3: v2.7.0: n-step returns for all off-policy algorithms via the `n_steps` argument
    • og:description
      Same release notes as the Google search description above.
    • og:url
      https://zenodo.org/records/16419043
    • og:site_name
      Zenodo
  • Twitter Meta Tags (4)
    • twitter:card
      summary
    • twitter:site
      @zenodo_org
    • twitter:title
      DLR-RM/stable-baselines3: v2.7.0: n-step returns for all off-policy algorithms via the `n_steps` argument
    • twitter:description
      Same release notes as the Google search description above.
  • Link Tags (9)
    • alternate
      https://zenodo.org/records/16419043/files/DLR-RM/stable-baselines3-v2.7.0.zip
    • apple-touch-icon
      /static/apple-touch-icon-120.png
    • apple-touch-icon
      /static/apple-touch-icon-152.png
    • apple-touch-icon
      /static/apple-touch-icon-167.png
    • apple-touch-icon
      /static/apple-touch-icon-180.png

Links: 74