Procgen Benchmark includes 16 simple-to-use procedurally-generated environments which provide a direct measure of how quickly a reinforcement learning agent learns generalizable skills.
142 PAPERS • 1 BENCHMARK
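Procgen environments expose the standard Gym API, and generalization is typically probed by training on a finite set of levels and evaluating on the full level distribution. A minimal sketch, assuming the `procgen` pip package and the classic 4-tuple Gym step signature:

```python
# Minimal sketch, assuming the `procgen` package and the classic Gym API.
# Training uses a fixed set of 200 levels; num_levels=0 at evaluation time
# samples from the full (unseen) level distribution.
import gym

train_env = gym.make("procgen:procgen-coinrun-v0",
                     num_levels=200, start_level=0, distribution_mode="easy")
eval_env = gym.make("procgen:procgen-coinrun-v0",
                    num_levels=0, distribution_mode="easy")

obs = train_env.reset()
for _ in range(1000):
    obs, reward, done, info = train_env.step(train_env.action_space.sample())
    if done:
        obs = train_env.reset()
```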
ManiSkill2 is the next generation of the SAPIEN ManiSkill benchmark, designed to address critical pain points that researchers often encounter when using benchmarks for generalizable manipulation skills. It includes 20 manipulation task families with 2000+ object models and 4M+ demonstration frames, covering stationary/mobile-base, single/dual-arm, and rigid/soft-body manipulation tasks with 2D/3D input data simulated by fully dynamic engines.
22 PAPERS • NO BENCHMARKS YET
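ManiSkill2 tasks are exposed through a Gym-style interface. A minimal sketch, assuming the `mani_skill2` package and one of its documented task ids; the observation and control modes shown are examples of supported options, and the step return format depends on the installed gym/gymnasium version:

```python
# Minimal sketch, assuming the `mani_skill2` package. "PickCube-v0" is one of
# the task ids documented for the benchmark; obs_mode and control_mode are
# examples of supported options.
import gym
import mani_skill2.envs  # noqa: F401  -- registers the ManiSkill2 environments

env = gym.make("PickCube-v0", obs_mode="rgbd", control_mode="pd_ee_delta_pose")
obs = env.reset()
step_result = env.step(env.action_space.sample())  # (obs, reward, done, info) in the gym-based releases
env.close()
```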
PRM800K is a process supervision dataset containing 800,000 step-level correctness labels for model-generated solutions to problems from the MATH dataset.
15 PAPERS • NO BENCHMARKS YET
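The labels are distributed as JSON lines of model-generated solutions with per-step ratings. A rough sketch of iterating over them; the file name and field names below are assumptions about the released schema, not a guaranteed interface:

```python
# Rough sketch of reading PRM800K step-level labels from a JSONL dump.
# The path and field names ("label", "steps", "completions", "rating") are
# assumptions about the release format; check the actual schema before use.
import json

with open("phase2_train.jsonl") as f:            # hypothetical file name
    for line in f:
        sample = json.loads(line)
        for step in sample["label"]["steps"]:
            for completion in step["completions"]:
                text = completion["text"]        # one candidate solution step
                rating = completion["rating"]    # step-level correctness label
```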
SMACv2 (StarCraft Multi-Agent Challenge v2) is a new version of the benchmark where scenarios are procedurally generated and require agents to generalise to previously unseen settings (from the same distribution) during evaluation.
14 PAPERS • NO BENCHMARKS YET
V-D4RL provides pixel-based analogues of the popular D4RL benchmarking tasks, derived from the dm_control suite, along with natural extensions of two state-of-the-art online pixel-based continuous control algorithms, DrQ-v2 and DreamerV2, to the offline setting.
9 PAPERS • NO BENCHMARKS YET
QDax is a benchmark suite for Deep Neuroevolution in Reinforcement Learning domains for robot control. The suite includes the definition of tasks, environments, behavioral descriptors, and fitness. It specifies different benchmarks based on the complexity of both the task and the agent, which is controlled by a deep neural network. The benchmark uses standard Quality-Diversity metrics, including coverage, QD-score, maximum fitness, and an archive profile metric that quantifies the relation between coverage and fitness.
5 PAPERS • NO BENCHMARKS YET
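The Quality-Diversity metrics listed above have simple definitions over a discretized archive of solutions. A small, library-agnostic sketch of those definitions (an illustration only, not QDax's own API):

```python
# Coverage, QD-score, and maximum fitness computed from a grid archive stored
# as a NumPy array; cells never filled hold -inf. Illustrative only.
import numpy as np

def qd_metrics(archive: np.ndarray) -> dict:
    filled = archive != -np.inf
    coverage = filled.mean()                              # fraction of cells reached
    qd_score = float(archive[filled].sum()) if filled.any() else 0.0
    max_fitness = float(archive[filled].max()) if filled.any() else None
    return {"coverage": coverage, "qd_score": qd_score, "max_fitness": max_fitness}

archive = np.full((10, 10), -np.inf)                      # 10x10 behavior-space grid
archive[2, 3], archive[5, 5], archive[7, 1] = 0.8, 1.2, 0.5
print(qd_metrics(archive))
```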
Avalon is a benchmark for generalization in Reinforcement Learning (RL). The benchmark consists of a set of tasks in which embodied agents in highly diverse procedural 3D worlds must survive by navigating terrain, hunting or gathering food, and avoiding hazards. Avalon is unique among existing RL benchmarks in that the reward function, world dynamics, and action space are the same for every task, with tasks differentiated solely by altering the environment; its 20 tasks, ranging in complexity from eat and throw to hunt and navigate, each create worlds in which the agent must perform specific skills in order to survive. This benchmark setup enables investigations of generalization within tasks, between tasks, and to compositional tasks that require combining skills learned from previous tasks.
3 PAPERS • NO BENCHMARKS YET
FinRL-Meta is a universe of market environments for data-driven financial reinforcement learning. It follows the de facto standard of OpenAI Gym and the lean principle of software development. Its unique features include a layered structure with extensibility, a training-testing-trading pipeline, and a plug-and-play mode.
MIDGARD is an open-source simulator for autonomous robot navigation in outdoor unstructured environments. It is designed to enable the training of autonomous agents (e.g., unmanned ground vehicles) in photorealistic 3D environments, and to support the generalization capabilities of learning-based agents through the variability of its training scenarios.
2 PAPERS • NO BENCHMARKS YET
POPGym is designed to benchmark memory in deep reinforcement learning. It contains a set of environments and a collection of memory model baselines. The environments are all Partially Observable Markov Decision Processes (POMDPs) that follow the OpenAI Gym interface and are built around a few basic design tenets.
2 PAPERS • 1 BENCHMARK
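Because the environments follow the OpenAI Gym interface, interaction is a standard reset/step loop; the environment id below is only an illustrative placeholder (consult the POPGym registry for the actual ids):

```python
# Sketch of a Gym-style loop over a POPGym environment. The env id is a
# placeholder; importing popgym is assumed to register the environments.
# Since the tasks are POMDPs, a memory-based agent would condition on the
# observation history rather than the latest observation alone.
import gym
import popgym  # noqa: F401

env = gym.make("popgym-RepeatPreviousEasy-v0")  # hypothetical id for illustration
obs = env.reset()
history = [obs]                                 # what a memory model would consume
done = False
while not done:
    obs, reward, done, info = env.step(env.action_space.sample())
    history.append(obs)
```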
PushWorld is an environment with simplistic physics that requires manipulation planning with both movable obstacles and tools. It contains more than 200 PushWorld puzzles, provided both in PDDL and as an OpenAI Gym environment.
1 PAPER • NO BENCHMARKS YET
The Room environment - v1 (RoomEnv-v1). Documentation for the earlier RoomEnv-v0 is provided separately.
lilGym is a benchmark for language-conditioned reinforcement learning in visual environments, based on 2,661 highly compositional, human-written natural language statements grounded in an interactive visual environment. Each statement is paired with multiple start states and reward functions to form thousands of distinct Markov Decision Processes of varying difficulty.
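The pairing described above (one statement, several start states, a reward function) can be pictured with a small schematic data structure; the names below are illustrative assumptions, not lilGym's actual API:

```python
# Schematic of how one natural-language statement yields several distinct MDPs.
# All names and values here are illustrative, not lilGym's real interface.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class LanguageConditionedTask:
    statement: str                       # human-written natural language statement
    start_states: List[dict]             # several initial environment configurations
    reward_fn: Callable[[dict], float]   # checks whether the statement is satisfied

    def mdp_instances(self) -> List[tuple]:
        # one (start_state, reward_fn, statement) triple per distinct MDP
        return [(s, self.reward_fn, self.statement) for s in self.start_states]

task = LanguageConditionedTask(
    statement="example statement about the scene",   # placeholder text
    start_states=[{"scene": "A"}, {"scene": "B"}],
    reward_fn=lambda state: 0.0,                      # placeholder reward
)
print(len(task.mdp_instances()))  # 2 distinct MDPs from one statement
```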