The ADE20K semantic segmentation dataset contains more than 20K scene-centric images exhaustively annotated with pixel-level objects and object parts labels. There are totally 150 semantic categories, which include stuffs like sky, road, grass, and discrete objects like person, car, bed.
995 PAPERS • 25 BENCHMARKS
Manga109 has been compiled by the Aizawa Yamasaki Matsui Laboratory, Department of Information and Communication Engineering, the Graduate School of Information Science and Technology, the University of Tokyo. The compilation is intended for use in academic research on the media processing of Japanese manga. Manga109 is composed of 109 manga volumes drawn by professional manga artists in Japan. These manga were commercially made available to the public between the 1970s and 2010s, and encompass a wide range of target readerships and genres (see the table in Explore for further details.) Most of the manga in the compilation are available at the manga library “Manga Library Z” (formerly the “Zeppan Manga Toshokan” library of out-of-print manga).
249 PAPERS • 12 BENCHMARKS
The Face Detection Dataset and Benchmark (FDDB) dataset is a collection of labeled faces from Faces in the Wild dataset. It contains a total of 5171 face annotations, where images are also of various resolution, e.g. 363x450 and 229x410. The dataset incorporates a range of challenges, including difficult pose angles, out-of-focus faces and low resolution. Both greyscale and color images are included.
163 PAPERS • 1 BENCHMARK
AFW (Annotated Faces in the Wild) is a face detection dataset that contains 205 images with 468 faces. Each face image is labeled with at most 6 landmarks with visibility labels, as well as a bounding box.
154 PAPERS • 1 BENCHMARK
UMDFaces is a face dataset divided into two parts:
30 PAPERS • NO BENCHMARKS YET
The PASCAL FACE dataset is a dataset for face detection and face recognition. It has a total of 851 images which are a subset of the PASCAL VOC and has a total of 1,341 annotations. These datasets contain only a few hundreds of images and have limited variations in face appearance.
23 PAPERS • 1 BENCHMARK
COCO-WholeBody is an extension of COCO dataset with whole-body annotations. There are 4 types of bounding boxes (person box, face box, left-hand box, and right-hand box) and 133 keypoints (17 for body, 6 for feet, 68 for face and 42 for hands) annotations for each person in the image.
22 PAPERS • 6 BENCHMARKS
WIDER is a dataset for complex event recognition from static images. As of v0.1, it contains 61 event categories and around 50574 images annotated with event class labels.
20 PAPERS • 1 BENCHMARK
The MALF dataset is a large dataset with 5,250 images annotated with multiple facial attributes and it is specifically constructed for fine grained evaluation.
15 PAPERS • NO BENCHMARKS YET
Proposes three types of masked face detection dataset; namely, the Correctly Masked Face Dataset (CMFD), the Incorrectly Masked Face Dataset (IMFD) and their combination for the global masked face detection (MaskedFace-Net).
14 PAPERS • NO BENCHMARKS YET
Unconstrained Face Detection Dataset (UFDD) aims to fuel further research in unconstrained face detection.
13 PAPERS • NO BENCHMARKS YET
Real-World Masked Face Dataset (RMFD) is a large dataset for masked face detection.
10 PAPERS • NO BENCHMARKS YET
The iCartoonFace dataset is a large-scale dataset that can be used for two different tasks: cartoon face detection and cartoon face recognition.
7 PAPERS • 1 BENCHMARK
A large-scale, hierarchical annotated dataset of animal faces, featuring 21.9K faces from 334 diverse species and 21 animal orders across biological taxonomy. These faces are captured `in-the-wild' conditions and are consistently annotated with 9 landmarks on key facial features. The proposed dataset is structured and scalable by design; its development underwent four systematic stages involving rigorous, manual annotation effort of over 6K man-hours.
6 PAPERS • NO BENCHMARKS YET
Contains video clips shot with modern high-resolution mobile cameras, with strong projective distortions and with low lighting conditions.
5 PAPERS • NO BENCHMARKS YET
The DCM dataset is composed of 772 annotated images from 27 golden age comic books. We freely collected them from the free public domain collection of digitized comic books Digital Comics Museum. One album per available publisher was selected to get as many different styles as possible. We made ground-truth bounding boxes of all panels, all characters (body + faces), small or big, human-like or animal-like.
4 PAPERS • 3 BENCHMARKS
Multimodal Dyadic Behavior (MMDB) dataset is a unique collection of multimodal (video, audio, and physiological) recordings of the social and communicative behavior of toddlers. The MMDB contains 160 sessions of 3-5 minute semi-structured play interaction between a trained adult examiner and a child between the age of 15 and 30 months. The MMDB dataset supports a novel problem domain for activity recognition, which consists of the decoding of dyadic social interactions between adults and children in a developmental context.
3 PAPERS • NO BENCHMARKS YET
BAFMD contains images posted on Twitter during the pandemic from around the world with more images from underrepresented race and age groups to mitigate the problem for the face mask detection task.
2 PAPERS • NO BENCHMARKS YET
CASIA-Face-Africa is a face image database which contains 38,546 images of 1,183 African subjects. Multi-spectral cameras are utilized to capture the face images under various illumination settings. Demographic attributes and facial expressions of the subjects are also carefully recorded. For landmark detection, each face image in the database is manually labeled with 68 facial keypoints. A group of evaluation protocols are constructed according to different applications, tasks, partitions and scenarios. The proposed database along with its face landmark annotations, evaluation protocols and preliminary results form a good benchmark to study the essential aspects of face biometrics for African subjects, especially face image preprocessing, face feature analysis and matching, facial expression recognition, sex/age estimation, ethnic classification, face image generation, etc.
A 360-degree fisheye-like version of the popular FDDB face detection dataset.
The Hochschule Darmstadt (HDA) facial tattoo and paintings database contains 500 pairs of facial images of individuals with and without facial tattoos or paintings. The database was collected from multiple online sources.
The Human-Parts dataset is a dataset for human body, face and hand detection with ~15k images. It contains ~106k different annotations, with multiple annotations per image.
FAD is a dataset that have roughly 200,000 attribute labels for the above traits, for over 10,000 facial images.
1 PAPER • NO BENCHMARKS YET
Consists of a large number of unconstrained multi-view and partially occluded faces.
Dataset originally conceived for multi-face tracking/detection for highly crowded scenarios. In these scenarios, the face is the only part that can be used to track the individuals.
MobiFace is the first dataset for single face tracking in mobile situations. It consists of 80 unedited live-streaming mobile videos captured by 70 different smartphone users in fully unconstrained environments. Over 95K bounding boxes are manually labelled. The videos are carefully selected to cover typical smartphone usage. The videos are also annotated with 14 attributes, including 6 newly proposed attributes and 8 commonly seen in object tracking.
In the last two years, millions of lives have been lost due to COVID-19. Despite the vaccination programmes for a year, hospitalization rates and deaths are still high due to the new variants of COVID-19. Stringent guidelines and COVID-19 screening measures such as temperature check and mask check at all public places are helping reduce the spread of COVID-19. Visual inspections to ensure these screening measures can be taxing and erroneous. Automated inspection ensures an effective and accurate screening.
Procedural Human Action Videos contains a total of 39,982 videos, with more than 1,000 examples for each action of 35 categories.
In this paper, we introduce a victim dataset for the RoboCup Rescue competitions. The RoboCup Rescue robots have to collect points within several disciplines, e.g. a search task within an area to survey simulated baby doll (victim).
Unconstrained Face Detection and Open-Set Face Recognition Challenge
Description: 1,995 People Face Images Data (Asian race). For each subject, more than 20 images per person with frontal face were collected. This data can be used for face recognition and other tasks.
0 PAPER • NO BENCHMARKS YET
Description: 23 Pairs of Identical Twins Face Image Data. The collecting scenes includes indoor and outdoor scenes. The subjects are Chinese males and females. The data diversity inlcudes multiple face angles, multiple face postures, close-up of eyes, multiple light conditions and multiple age groups. This dataset can be used for tasks such as twins' face recognition.
Description: 5,011 Images – Human Frontal face Data (Male). The data diversity includes multiple scenes, multiple ages and multiple races. This dataset includes 2,004 Caucasians , 3,007 Asians. This dataset can be used for tasks such as face detection, race detection, age detection, beard category classification.
The free Face dataset made for students and teachers. It contains 10,000 photos with equal distribution of race and gender parameters, along with metadata and facial landmarks. Free to use for research with citation Photos by Generated.Photos.
Facial landmark detection is a cornerstone in many facial analysis tasks such as face recognition, drowsiness detection, and facial expression recognition. Numerous methodologies were introduced to achieve accurate and efficient facial landmark localization in visual images. However, there are only several works that address facial landmark detection in thermal images. The main challenge is the limited number of annotated datasets. In this work, we present a thermal face dataset with annotated face bounding boxes and facial landmarks. The dataset contains 2,556 thermal images of 142 individuals, where each thermal image is paired with the corresponding visual image. To the best of our knowledge, our dataset is the largest in terms of the number of individuals. In addition, our dataset can be employed for tasks such as thermal-to-visual image translation, thermal-visual face recognition, and others. We trained two models for the facial landmark detection task to show the efficacy of our
Face detection and subsequent localization of facial landmarks are the primary steps in many face applications. Numerous algorithms and benchmark datasets have been introduced to develop robust models for the visible domain. However, varying conditions of illumination still pose challenging problems. In this regard, thermal cameras are employed to address this problem, because they operate on longer wavelengths. However, thermal face and facial landmark detection in the wild is an open research problem because most of the existing thermal datasets were collected in controlled environments. In addition, many of them were not annotated with face bounding boxes and facial landmarks. In this work, we present a thermal face dataset with manually labeled bounding boxes and facial landmarks to address these problems. The dataset contains 9,982 images of 147 subjects collected under controlled and uncontrolled conditions. As a baseline, we trained the YOLOv5 object detection model and its adap