TAO is a federated dataset for Tracking Any Object, containing 2,907 high-resolution videos captured in diverse environments, each half a minute long on average. A bottom-up approach was used to discover a large vocabulary of 833 categories, an order of magnitude more than prior tracking benchmarks.
38 PAPERS • 1 BENCHMARK
Gibson is an open-source perceptual and physics simulator to explore active and real-world perception. The Gibson Environment is used for Real-World Perception Learning.
21 PAPERS • NO BENCHMARKS YET
A platform for research in embodied artificial intelligence (AI).
14 PAPERS • NO BENCHMARKS YET
QDax is a benchmark suite designed for Deep Neuroevolution in Reinforcement Learning domains for robot control. The suite includes the definition of tasks, environments, behavioral descriptors, and fitness. It specifies different benchmarks based on the complexity of both the task and the agent controlled by a deep neural network. The benchmark uses standard Quality-Diversity metrics, including coverage, QD-score, maximum fitness, and an archive profile metric that quantifies the relation between coverage and fitness (a sketch of these metrics follows below).
5 PAPERS • NO BENCHMARKS YET
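As a rough illustration of the Quality-Diversity metrics named above, here is a minimal sketch computing coverage, QD-score, and maximum fitness from a hypothetical grid archive; the dict-based archive representation and function name are assumptions for illustration, not QDax's actual API.

```python
import numpy as np

def qd_metrics(archive_fitness, num_cells):
    """Compute standard Quality-Diversity metrics from a grid archive.

    archive_fitness: dict mapping occupied cell index -> best fitness found
                     in that cell (hypothetical representation, not QDax's).
    num_cells:       total number of cells in the behavioral-descriptor grid.
    """
    fitnesses = np.array(list(archive_fitness.values()))
    coverage = len(archive_fitness) / num_cells  # fraction of cells occupied
    qd_score = fitnesses.sum()                   # total fitness over occupied cells
    max_fitness = fitnesses.max()                # best single solution found
    return {"coverage": coverage, "qd_score": qd_score, "max_fitness": max_fitness}

# Example: a 100-cell grid in which three cells are occupied.
print(qd_metrics({4: 0.9, 17: 0.5, 42: 0.7}, num_cells=100))
```

Note that QD-score as a plain sum assumes non-negative fitness values; negative fitnesses are usually offset before summing.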
MIDGARD is an open-source simulator for autonomous robot navigation in outdoor unstructured environments. It is designed to enable the training of autonomous agents (e.g., unmanned ground vehicles) in photorealistic 3D environments, and to support the generalization skills of learning-based agents through variability in training scenarios.
2 PAPERS • NO BENCHMARKS YET
A benchmark for detecting fallen people lying on the floor. It consists of 6,982 images, with a total of 5,023 falls and 2,275 non-falls corresponding to people in conventional situations (standing up, sitting, lying on the sofa or bed, walking, etc.). Almost all the images were captured in indoor environments under very different conditions: variations in pose and size, occlusions, lighting changes, etc.
1 PAPER • NO BENCHMARKS YET
ISOD contains 2,000 manually labelled RGB-D images from 20 diverse sites, each featuring over 30 types of small objects randomly placed amidst the items already present in the scenes. These objects, typically ≤3cm in height, include LEGO blocks, rags, slippers, gloves, shoes, cables, crayons, chalk, glasses, smartphones (and their cases), fake banana peels, fake pet waste, and piles of toilet paper, among others. These items were chosen because they either threaten the safe operation of indoor mobile robots or create messes if run over.
MlGesture is a dataset for hand gesture recognition tasks, recorded in a car with 5 different sensor types at two different viewpoints. The dataset contains over 1300 hand gesture videos from 24 participants and features 9 different hand gesture symbols. One sensor cluster with five different cameras is mounted in front of the driver in the center of the dashboard. A second sensor cluster is mounted on the ceiling looking straight down.
A dataset with high-resolution (4K) images and manually annotated dense labels every 50 frames.
The SEmantic Salient Instance Video (SESIV) dataset is obtained by augmenting the DAVIS-2017 benchmark with semantic ground truth for salient instance labels. It consists of 84 high-quality video sequences with pixel-wise, per-frame ground-truth labels.
A dataset collected in a set of experiments involving human participants and a robot.
Sparrow-V0: A Reinforcement Learning Friendly Simulator for Mobile Robot
This dataset involves a 2D or 3D agent moving from a start to a goal pose while interacting with nearby objects. These objects can influence the agent's position via attraction or repulsion forces, and its orientation via attraction to an object's orientation. The dataset can be used to pre-train general policy behavior, which can later be fine-tuned quickly for a person's specific preferences. Example use cases include (see the sketch after this list):
- self-driving cars maintaining distance from other cars
- robot pick-and-place tasks with intermediate subtasks (e.g., scanning factory items before dropping them off)
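A minimal sketch of the attraction/repulsion dynamics described above; the inverse-square force law, time step, and all names here are assumptions for illustration, not the dataset's actual generator.

```python
import numpy as np

def interaction_step(agent_pos, objects, dt=0.1):
    """Advance the agent one step under object attraction/repulsion.

    agent_pos: (2,) or (3,) float array, current agent position.
    objects:   list of (position, weight) pairs; weight > 0 attracts the
               agent, weight < 0 repels it (illustrative schema).
    """
    force = np.zeros_like(agent_pos)
    for obj_pos, weight in objects:
        diff = obj_pos - agent_pos
        dist = np.linalg.norm(diff) + 1e-8   # avoid division by zero
        force += weight * diff / dist**2     # assumed inverse-square falloff
    return agent_pos + dt * force

pos = np.array([0.0, 0.0])
objs = [(np.array([1.0, 0.0]), 1.0),    # attractor, e.g. the goal
        (np.array([0.5, 0.5]), -0.5)]   # repulsor, e.g. another car
print(interaction_step(pos, objs))
```

Orientation can be handled analogously by blending the agent's heading toward the orientation of nearby attracting objects.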
Overview: the goal is to use simulation data to train neural networks to estimate the pose of a rover's camera with respect to a known target object (a minimal sketch follows below).
0 PAPERS • NO BENCHMARKS YET
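A minimal sketch of the kind of pose-regression network the stated goal implies: a small CNN mapping an RGB image to a 3-vector translation and a unit quaternion. The architecture and shapes are assumptions for illustration, not the dataset authors' model.

```python
import torch
import torch.nn as nn

class PoseRegressor(nn.Module):
    """Regress the camera pose relative to a known target from one image."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, 7)  # 3 for translation + 4 for rotation

    def forward(self, img):
        out = self.head(self.features(img))
        t, q = out[:, :3], out[:, 3:]
        q = q / q.norm(dim=1, keepdim=True)  # normalize to a unit quaternion
        return t, q

model = PoseRegressor()
t, q = model(torch.randn(2, 3, 128, 128))  # batch of 2 simulated RGB frames
print(t.shape, q.shape)  # torch.Size([2, 3]) torch.Size([2, 4])
```

Training would minimize a weighted sum of translation and rotation errors against the simulator's ground-truth poses.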