web.archive.org/web/20191004033503/http://ai.stanford.edu/blog/acteach

A preview of the meta tags served by this web.archive.org capture.
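
The capture URL above follows the Wayback Machine's scheme: https://web.archive.org/web/<timestamp>[modifier]/<original URL>, where the 14-digit timestamp is YYYYMMDDhhmmss and an optional modifier such as im_ (which appears in the image URLs among the tags below) requests the raw payload. A minimal, stdlib-only sketch of pulling those pieces apart; the helper name parse_wayback_url is ours for illustration, not an archive.org API:

    import re
    from datetime import datetime

    # A Wayback capture URL embeds a 14-digit timestamp (YYYYMMDDhhmmss),
    # an optional modifier such as "im_" (raw image payload), and the original URL.
    WAYBACK_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})([a-z]+_)?/(.+)$")

    def parse_wayback_url(url):
        """Split a web.archive.org capture URL into (capture time, modifier, original URL)."""
        m = WAYBACK_RE.match(url)
        if m is None:
            raise ValueError(f"not a Wayback capture URL: {url}")
        timestamp, modifier, original = m.groups()
        return datetime.strptime(timestamp, "%Y%m%d%H%M%S"), modifier, original

    # Example with the capture URL shown above:
    when, mod, orig = parse_wayback_url(
        "https://web.archive.org/web/20191004033503/http://ai.stanford.edu/blog/acteach"
    )
    print(when, mod, orig)  # 2019-10-04 03:35:03 None http://ai.stanford.edu/blog/acteach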

Linked Hostnames: 1

Search Engine Appearance

Google

https://web.archive.org/web/20191004033503/http://ai.stanford.edu/blog/acteach

AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers

Reinforcement Learning (RL) algorithms have recently demonstrated impressive results in challenging problem domains such as robotic manipulation, Go, and Atari games. But RL algorithms typically require a large number of interactions with the environment to train policies that solve new tasks, since they begin with no knowledge whatsoever about the task and rely on random exploration of their possible actions in order to learn. This is particularly problematic for physical domains such as robotics, where gathering experience from interactions is slow and expensive. At the same time, people often have some intuition about the right kinds of things to do during RL tasks, such as approaching an object when attempting to grasp it. Might it be possible for us to somehow communicate these intuitions to the RL agent to speed up its training?



Bing

AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers

https://web.archive.org/web/20191004033503/http://ai.stanford.edu/blog/acteach

Reinforcement Learning (RL) algorithms have recently demonstrated impressive results in challenging problem domains such as robotic manipulation, Go, and Atari games. But RL algorithms typically require a large number of interactions with the environment to train policies that solve new tasks, since they begin with no knowledge whatsoever about the task and rely on random exploration of their possible actions in order to learn. This is particularly problematic for physical domains such as robotics, where gathering experience from interactions is slow and expensive. At the same time, people often have some intuition about the right kinds of things to do during RL tasks, such as approaching an object when attempting to grasp it. Might it be possible for us to somehow communicate these intuitions to the RL agent to speed up its training?



DuckDuckGo

https://web.archive.org/web/20191004033503/http://ai.stanford.edu/blog/acteach

AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers

Reinforcement Learning (RL) algorithms have recently demonstrated impressive results in challenging problem domains such as robotic manipulation, Go, and Atari games. But RL algorithms typically require a large number of interactions with the environment to train policies that solve new tasks, since they begin with no knowledge whatsoever about the task and rely on random exploration of their possible actions in order to learn. This is particularly problematic for physical domains such as robotics, where gathering experience from interactions is slow and expensive. At the same time, people often have some intuition about the right kinds of things to do during RL tasks, such as approaching an object when attempting to grasp it. Might it be possible for us to somehow communicate these intuitions to the RL agent to speed up its training?

  • General Meta Tags (10)
    • title
      AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers | SAIL Blog
    • title
      AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers | The Stanford AI Lab Blog
    • charset
      utf-8
    • viewport
      width=device-width, initial-scale=1, maximum-scale=1
    • generator
      Jekyll v3.8.5
  • Open Graph Meta Tags (6)
    • og:title
      AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers
    • og:locale
      en_US
    • og:description
      Reinforcement Learning (RL) algorithms have recently demonstrated impressive results in challenging problem domains such as robotic manipulation, Go, and Atari games. But RL algorithms typically require a large number of interactions with the environment to train policies that solve new tasks, since they begin with no knowledge whatsoever about the task and rely on random exploration of their possible actions in order to learn. This is particularly problematic for physical domains such as robotics, where gathering experience from interactions is slow and expensive. At the same time, people often have some intuition about the right kinds of things to do during RL tasks, such as approaching an object when attempting to grasp it. Might it be possible for us to somehow communicate these intuitions to the RL agent to speed up its training?
    • og:url
      https://web.archive.org/web/20191004041011/http://ai.stanford.edu/blog/acteach/
    • og:site_name
      SAIL Blog
  • Twitter Meta Tags (5)
    • twitter:title
      AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers
    • twitter:description
      Presenting AC-Teach, a unifying approach to leverage advice from an ensemble of sub-optimal teachers in order to accelerate the learning process of actor-critic reinforcement learning agents.
    • twitter:creator
      @StanfordAI
    • twitter:card
      summary_large_image
    • twitter:image
      https://web.archive.org/web/20191004041011im_/http://ai.stanford.edu/blog/assets/img/posts/2019-09-05-acteach/alg-mini.png
  • Link Tags (13)
    • alternate
      https://web.archive.org/web/20191004041011/http://ai.stanford.edu/blog/feed.xml
    • canonical
      https://web.archive.org/web/20191004041011/http://ai.stanford.edu/blog/acteach/
    • canonical
      https://web.archive.org/web/20191004041011/http://ai.stanford.edu/blog/acteach/
    • icon
      /web/20191004041011im_/http://ai.stanford.edu/blog/assets/img/favicon-32x32.png
    • icon
      /web/20191004041011im_/http://ai.stanford.edu/blog/assets/img/favicon-16x16.png
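
For reference, the grouping used above (general meta tags, Open Graph, Twitter, and link tags) can be reproduced from the page's HTML with the Python standard library alone. This is a rough sketch under our own assumptions, not the preview tool's actual implementation; the class name MetaTagCollector and the grouping rules are ours:

    from html.parser import HTMLParser
    import urllib.request

    class MetaTagCollector(HTMLParser):
        """Group <title>, <meta>, and <link> tags the way the listing above does:
        general meta tags, Open Graph (og:*), Twitter (twitter:*), and link tags."""

        def __init__(self):
            super().__init__()
            self.general, self.open_graph, self.twitter, self.links = [], [], [], []
            self._in_title = False

        def handle_starttag(self, tag, attrs):
            a = dict(attrs)
            if tag == "title":
                self._in_title = True
            elif tag == "meta":
                # Open Graph tags use property="og:*"; Twitter cards use name="twitter:*";
                # everything else (charset, viewport, generator, ...) is a general tag.
                key = a.get("property") or a.get("name") or ("charset" if "charset" in a else None)
                if key is None:
                    return
                value = a.get("content", a.get("charset"))
                if key.startswith("og:"):
                    self.open_graph.append((key, value))
                elif key.startswith("twitter:"):
                    self.twitter.append((key, value))
                else:
                    self.general.append((key, value))
            elif tag == "link":
                self.links.append((a.get("rel"), a.get("href")))

        def handle_data(self, data):
            if self._in_title:
                self.general.append(("title", data.strip()))

        def handle_endtag(self, tag):
            if tag == "title":
                self._in_title = False

    # Example usage against the capture listed above:
    page = urllib.request.urlopen(
        "https://web.archive.org/web/20191004041011/http://ai.stanford.edu/blog/acteach/"
    ).read().decode("utf-8", errors="replace")
    collector = MetaTagCollector()
    collector.feed(page)
    print(collector.open_graph)  # e.g. [("og:title", "AC-Teach: ..."), ("og:locale", "en_US"), ...]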

Links: 21