Reinforcement Learning • 24 methods
Policy gradient methods optimize the policy function directly in reinforcement learning. This contrasts with, for example, Q-learning, where the policy is implicit and arises from acting to maximize a learned value function. Below is a continuously updated catalog of policy gradient methods.
| Method | Year | Papers |
|---|---|---|
| | 2017 | 629 |
| | 2015 | 190 |
| | 1999 | 160 |
| | 2018 | 90 |
| | 2015 | 71 |
| | 2016 | 70 |
| | 2016 | 48 |
| | 2018 | 45 |
| | 2017 | 33 |
| | 2014 | 16 |
| | 2018 | 15 |
| | 2016 | 11 |
| | 2018 | 10 |
| | 2018 | 6 |
| | 2017 | 2 |
| | 2017 | 2 |
| | 2020 | 2 |
| | 2017 | 1 |
| | 2018 | 1 |
| | 2020 | 1 |
| | 2021 | 1 |
| | 2021 | 1 |
| | 2000 | 1 |
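
To make "optimizing the policy directly" concrete, here is a minimal sketch of a REINFORCE-style update on a toy two-armed bandit. It is an illustration under stated assumptions rather than an implementation of any method in the catalog: the function names (`softmax`, `train_reinforce`), the Gaussian reward model, the running-average baseline, and all hyperparameters are hypothetical choices made for this example.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_reinforce(arm_means, steps=2000, lr=0.1, noise_std=0.5, seed=0):
    """Learn a softmax policy over bandit arms with a score-function update.

    Assumption for this sketch: each arm pays a Gaussian reward with the
    given mean and a shared standard deviation `noise_std`.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(arm_means)  # policy parameters, one per arm
    baseline = 0.0                  # running average of observed rewards
    for t in range(1, steps + 1):
        probs = softmax(theta)
        # Sample an action from the current policy.
        action = rng.choices(range(len(theta)), weights=probs)[0]
        reward = rng.gauss(arm_means[action], noise_std)
        # Subtracting a baseline reduces the variance of the update.
        advantage = reward - baseline
        baseline += (reward - baseline) / t
        # For a softmax policy, d/d theta[a] of log pi(action) is
        # 1[a == action] - pi(a). Step the parameters along it.
        for a in range(len(theta)):
            grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
            theta[a] += lr * advantage * grad_log_pi
    return softmax(theta)


# Arm 1 has the higher mean reward, so the policy should come to prefer it.
probs = train_reinforce([0.0, 1.0])
print(probs)
```

Note that the parameters `theta` define the action probabilities themselves, and the update adjusts them along the gradient of expected reward. No value function is learned and then maximized, which is the contrast with Q-learning drawn above.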