Offline RL

226 papers with code • 2 benchmarks • 6 datasets

Benchmarks

These leaderboards are used to track progress in Offline RL.

Dataset	Best Model
D4RL	KFC
Walker2d	ParPI

Libraries

Use these libraries to find Offline RL models and implementations.

zzmtsvv/rl_task • 14 papers

yihaosun1124/OfflineRL-Kit • 8 papers • 228

corl-team/CORL • 7 papers • 389

opendilab/DI-engine • 4 papers • 2,548

See all 10 libraries.

Datasets

Subtasks

DQN Replay Dataset

Most implemented papers

Conservative Q-Learning for Offline Reinforcement Learning

aviralkumar2907/CQL • NeurIPS 2020

We theoretically show that CQL produces a lower bound on the value of the current policy and that it can be incorporated into a policy learning procedure with theoretical improvement guarantees.
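
As a rough illustration of how a conservative value estimate can arise, the sketch below computes a penalty of the kind used in discrete-action CQL variants: the log-sum-exp of the Q-values over all actions minus the Q-value of the action actually found in the dataset. The summary above does not spell this out, so the function name `cql_penalty` and the discrete-action setting are assumptions made for illustration, not the authors' implementation.

```python
import math


def cql_penalty(q_values, data_action):
    """Log-sum-exp of Q over all actions minus Q at the dataset action.

    q_values: list of Q(s, a) for every discrete action a (hypothetical input).
    data_action: index of the action recorded in the offline dataset.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(q_values)
    log_sum_exp = m + math.log(sum(math.exp(q - m) for q in q_values))
    return log_sum_exp - q_values[data_action]


# The penalty is small when the dataset action already has the highest Q-value
# and large when some other action is rated much higher.
print(cql_penalty([1.0, 5.0, 2.0], data_action=1))
print(cql_penalty([1.0, 5.0, 2.0], data_action=0))
```

Adding a term like this to the usual Bellman error pushes down Q-values of actions that are absent from the data, which is one way to read the lower-bound claim.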

Decision Transformer: Reinforcement Learning via Sequence Modeling

kzl/decision-transformer • NeurIPS 2021

In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling.
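
To make "conditional sequence modeling" a little more concrete, here is a minimal sketch of how a trajectory might be turned into a token sequence conditioned on the return still to be collected. The return-to-go conditioning and the (return, state, action) ordering follow the paper rather than the one-line summary above; the helper names `returns_to_go` and `interleave` are hypothetical.

```python
def returns_to_go(rewards):
    """Sum of rewards from each timestep to the end of the trajectory."""
    rtg = []
    running = 0.0
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    rtg.reverse()
    return rtg


def interleave(rtg, states, actions):
    """Flatten a trajectory into (kind, value) tokens, one triple per timestep."""
    tokens = []
    for g, s, a in zip(rtg, states, actions):
        tokens.extend([("return", g), ("state", s), ("action", a)])
    return tokens


rewards = [1.0, 0.0, 2.0]
rtg = returns_to_go(rewards)
print(rtg)  # [3.0, 2.0, 2.0]
print(interleave(rtg, ["s0", "s1", "s2"], ["a0", "a1", "a2"]))
```

A sequence model trained to predict the action tokens can then be prompted at test time with a desired return in the first position.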

Reformer: The Efficient Transformer

google/trax • ICLR 2020

Large Transformer models routinely achieve state-of-the-art results on a number of tasks but training these models can be prohibitively costly, especially on long sequences.

Offline Reinforcement Learning with Implicit Q-Learning

rail-berkeley/rlkit • 12 Oct 2021

The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state conditional upper expectile of this random variable to estimate the value of the best actions in that state.
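
The "upper expectile" mentioned above can be illustrated with a small sketch. An expectile is the minimizer of an asymmetric squared loss: with tau = 0.5 it is the ordinary mean, and as tau approaches 1 it moves toward the largest sample. The code below fits a scalar to a handful of made-up Q-value samples; the function names, the sample values, and the plain gradient-descent loop are all assumptions for illustration and are not taken from the authors' code.

```python
def expectile_loss(diff, tau=0.7):
    """Asymmetric squared loss: weight tau when diff >= 0, else 1 - tau."""
    weight = tau if diff >= 0 else 1.0 - tau
    return weight * diff * diff


def fit_expectile(samples, tau=0.7, lr=0.1, steps=2000):
    """Find the tau-expectile of `samples` by gradient descent on the mean loss."""
    v = sum(samples) / len(samples)  # start from the mean
    for _ in range(steps):
        grad = 0.0
        for q in samples:
            diff = q - v
            weight = tau if diff >= 0 else 1.0 - tau
            grad += -2.0 * weight * diff  # derivative of the loss w.r.t. v
        v -= lr * grad / len(samples)
    return v


# Hypothetical Q-values of several dataset actions in a single state.
q_samples = [0.0, 1.0, 2.0, 3.0, 10.0]
print(fit_expectile(q_samples, tau=0.5))  # close to the mean, 3.2
print(fit_expectile(q_samples, tau=0.9))  # pulled toward the best action's value
```

Because the fit only ever sees values of actions present in the data, a high tau approximates the value of the best in-sample action without querying unseen actions.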

Rethinking Attention with Performers

google-research/google-research • ICLR 2021

We introduce Performers, Transformer architectures which can estimate regular (softmax) full-rank-attention Transformers with provable accuracy, but using only linear (as opposed to quadratic) space and time complexity, without relying on any priors such as sparsity or low-rankness.

A Minimalist Approach to Offline Reinforcement Learning

sfujim/TD3_BC • NeurIPS 2021

Offline reinforcement learning (RL) defines the task of learning from a fixed batch of data.

D4RL: Datasets for Deep Data-Driven Reinforcement Learning

rail-berkeley/offline_rl • 15 Apr 2020

In this work, we introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL.

MOPO: Model-based Offline Policy Optimization

tianheyu927/mopo • NeurIPS 2020

We also characterize the trade-off between the gain and risk of leaving the support of the batch data.

Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention

idiap/fast-transformers • ICML 2020

Transformers achieve remarkable performance in several tasks but due to their quadratic complexity, with respect to the input's length, they are prohibitively slow for very long sequences.

Acme: A Research Framework for Distributed Reinforcement Learning

google-deepmind/acme • 1 Jun 2020

These implementations serve both as a validation of our design decisions as well as an important contribution to reproducibility in RL research.
