doi.org/10.5281/zenodo.8123988

Preview meta tags from the doi.org website.

Linked Hostnames: 17

Search Engine Appearance

Google

https://doi.org/10.5281/zenodo.8123988

DLR-RM/stable-baselines3: v2.7.0: n-step returns for all off-policy algorithms via the `n_steps` argument

SB3 Contrib (more algorithms): https://github.com/Stable-Baselines-Team/stable-baselines3-contrib
RL Zoo3 (training framework): https://github.com/DLR-RM/rl-baselines3-zoo
Stable-Baselines Jax (SBX): https://github.com/araffin/sbx

To upgrade:

    pip install stable_baselines3 sb3_contrib rl_zoo3 --upgrade

New Features:
  • Added support for n-step returns for all off-policy algorithms via the n_steps parameter:

        from stable_baselines3 import SAC

        # SAC with n-step returns
        model = SAC("MlpPolicy", "Pendulum-v1", n_steps=3, verbose=1)
        model.learn(10_000)

  • Added NStepReplayBuffer, which computes n-step returns without additional memory requirements (and without for loops); a sketch of the computation follows these notes
  • Added Gymnasium v1.2 support

Bug Fixes:
  • Fixed the Docker GPU image (PyTorch GPU was not installed)
  • Fixed segmentation faults caused by non-portable schedules during model loading (@akanto)

SB3-Contrib:
  • Added support for n-step returns for off-policy algorithms via the n_steps parameter
  • Use the FloatSchedule and LinearSchedule classes instead of lambdas in the ARS, PPO, and QRDQN implementations to improve model portability across operating systems (see the pickling illustration after these notes)

RL Zoo:
  • linear_schedule now returns a SimpleLinearSchedule object for better portability
  • Renamed LunarLander-v2 to LunarLander-v3 in hyperparameters
  • Renamed CarRacing-v2 to CarRacing-v3 in hyperparameters
  • Docker GPU images are now working again
  • Use ConstantSchedule and SimpleLinearSchedule instead of constant_fn and linear_schedule
  • Fixed CarRacing-v3 hyperparameters for newer Gymnasium versions

SBX (SB3 + Jax):
  • Added support for n-step returns for off-policy algorithms via the n_steps parameter
  • Added a KL-adaptive learning rate for PPO and learning-rate schedules for SAC/TQC

Deprecations:
  • get_schedule_fn(), get_linear_fn(), and constant_fn() are deprecated; please use FloatSchedule(), LinearSchedule(), and ConstantSchedule() instead

Documentation:
  • Clarified the evaluate_policy documentation
  • Added documentation about training exceeding the total_timesteps parameter
  • Updated LunarLander and LunarLanderContinuous environment versions to v3 (@j0m0k0)
  • Added sb3-extra-buffers to the project page (@Trenza1ore)

New Contributors:
  • @akanto made their first contribution in https://github.com/DLR-RM/stable-baselines3/pull/2125
  • @omahs made their first contribution in https://github.com/DLR-RM/stable-baselines3/pull/2140
  • @j0m0k0 made their first contribution in https://github.com/DLR-RM/stable-baselines3/pull/2143
  • @leopardracer made their first contribution in https://github.com/DLR-RM/stable-baselines3/pull/2147
  • @Trenza1ore made their first contribution in https://github.com/DLR-RM/stable-baselines3/pull/2157

Full Changelog: https://github.com/DLR-RM/stable-baselines3/compare/v2.6.0...v2.7.0
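
For context on the headline feature: an n-step return replaces the one-step TD target r_0 + γ·V(s_1) with a sum of n discounted rewards plus a bootstrap value n steps ahead, truncated at episode boundaries. Below is a minimal sketch of that computation, written with an explicit loop for clarity; the actual NStepReplayBuffer is vectorized, and the function name and layout here are illustrative, not SB3 API.

    import numpy as np

    def n_step_target(rewards, dones, bootstrap_value, gamma=0.99, n_steps=3):
        """Illustrative n-step TD target for one transition:
        G = r_0 + gamma*r_1 + ... + gamma**(n-1)*r_(n-1) + gamma**n * V(s_n),
        with the sum (and the bootstrap) cut off when an episode ends."""
        target, discount = 0.0, 1.0
        for k in range(n_steps):
            target += discount * rewards[k]
            discount *= gamma
            if dones[k]:  # episode terminated: do not bootstrap past the end
                return target
        return target + discount * bootstrap_value

    # With n_steps=3: 1.0 + 0.99*0.5 + 0.99**2*(-0.2) + 0.99**3*2.0
    print(n_step_target(np.array([1.0, 0.5, -0.2]),
                        np.array([False, False, False]),
                        bootstrap_value=2.0))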
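
Why the schedule classes matter for portability: Python lambdas cannot be serialized by the standard pickle module, so a model saved with a lambda learning-rate schedule can fail to load in another process or on another operating system. The snippet below illustrates the idea behind classes like FloatSchedule and LinearSchedule; MyLinearSchedule is a hypothetical stand-in, not the SB3 implementation.

    import pickle

    # A lambda schedule works in-process but cannot be pickled,
    # which breaks save/load portability.
    lr_lambda = lambda progress_remaining: 3e-4 * progress_remaining
    try:
        pickle.dumps(lr_lambda)
    except (pickle.PicklingError, AttributeError) as err:
        print("lambda is not picklable:", err)

    # A callable class with the same behavior pickles cleanly.
    class MyLinearSchedule:
        def __init__(self, initial_value: float):
            self.initial_value = initial_value

        def __call__(self, progress_remaining: float) -> float:
            # SB3 schedules map remaining progress (1.0 -> 0.0) to a value.
            return self.initial_value * progress_remaining

    data = pickle.dumps(MyLinearSchedule(3e-4))  # works
    print(pickle.loads(data)(0.5))               # 0.00015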
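
On the SBX side, a KL-adaptive learning rate adjusts the step size based on the observed KL divergence between the old and new policy after each update. The release notes do not spell out the rule, so the following is a common heuristic shown for illustration only; the function name and thresholds are assumptions, not the SBX implementation.

    def kl_adaptive_lr(lr: float, observed_kl: float, target_kl: float,
                       factor: float = 1.5, min_lr: float = 1e-6,
                       max_lr: float = 1e-2) -> float:
        """Shrink the learning rate when the policy update overshoots the
        KL target; grow it when the update stays far below the target."""
        if observed_kl > 2.0 * target_kl:
            lr = max(lr / factor, min_lr)
        elif observed_kl < 0.5 * target_kl:
            lr = min(lr * factor, max_lr)
        return lr

    # After each PPO update, feed the measured KL back into the next LR.
    lr = 3e-4
    lr = kl_adaptive_lr(lr, observed_kl=0.05, target_kl=0.01)  # overshoot: lr shrinks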



Bing

Shows the same title, URL, and description as the Google appearance above.

DuckDuckGo

Shows the same title, URL, and description as the Google appearance above.

  • General Meta Tags (41)
    • title
      DLR-RM/stable-baselines3: v2.7.0: n-step returns for all off-policy algorithms via the `n_steps` argument
    • charset
      utf-8
    • X-UA-Compatible
      IE=edge
    • viewport
      width=device-width, initial-scale=1
    • google-site-verification
      5fPGCLllnWrvFxH9QWI0l1TadV7byeEvfPcyK2VkS_s
  • Open Graph Meta Tags (4)
    • og:title
      DLR-RM/stable-baselines3: v2.7.0: n-step returns for all off-policy algorithms via the `n_steps` argument
    • og:description
      Same release notes as the Google search description above.
    • og:url
      https://zenodo.org/records/16419043
    • og:site_name
      Zenodo
  • Twitter Meta Tags (4)
    • twitter:card
      summary
    • twitter:site
      @zenodo_org
    • twitter:title
      DLR-RM/stable-baselines3: v2.7.0: n-step returns for all off-policy algorithms via the `n_steps` argument
    • twitter:description
      Same release notes as the Google search description above.
  • Link Tags (9)
    • alternate
      https://zenodo.org/records/16419043/files/DLR-RM/stable-baselines3-v2.7.0.zip
    • apple-touch-icon
      /static/apple-touch-icon-120.png
    • apple-touch-icon
      /static/apple-touch-icon-152.png
    • apple-touch-icon
      /static/apple-touch-icon-167.png
    • apple-touch-icon
      /static/apple-touch-icon-180.png

Links: 74