The 300-W is a face dataset that consists of 300 Indoor and 300 Outdoor in-the-wild images. It covers a large variation of identity, expression, illumination conditions, pose, occlusion and face size. The images were downloaded from google.com by making queries such as “party”, “conference”, “protests”, “football” and “celebrities”. Compared to other in-the-wild datasets, the 300-W database contains a larger percentage of partially-occluded images and covers more expressions than the common “neutral” or “smile”, such as “surprise” or “scream”. Images were annotated with the 68-point mark-up using a semi-automatic methodology. The images of the database were carefully selected so that they represent a characteristic sample of challenging but natural face instances under totally unconstrained conditions. Thus, methods that achieve accurate performance on the 300-W database can be expected to be similarly accurate in most realistic cases. Many images of the database contain more than one annotated face.
198 PAPERS • 9 BENCHMARKS
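The 68-point annotations for 300-W are commonly distributed as IBUG-style .pts files (a version line, an n_points line, and the coordinate pairs between curly braces). A minimal Python sketch for reading one such file, assuming that layout and a hypothetical file name, could look like this:

```python
import numpy as np

def read_pts(path):
    """Parse an IBUG-style .pts landmark file into an (N, 2) array.

    Assumes the common layout: a 'version' line, an 'n_points' line,
    and the x y pairs enclosed in curly braces.
    """
    with open(path) as f:
        lines = [ln.strip() for ln in f if ln.strip()]
    n_points = int(lines[1].split(":")[1])       # e.g. "n_points: 68"
    start = lines.index("{") + 1                 # points follow the opening brace
    coords = [list(map(float, ln.split())) for ln in lines[start:start + n_points]]
    return np.asarray(coords, dtype=np.float64)  # shape (68, 2) for 300-W

# landmarks = read_pts("indoor_001.pts")  # hypothetical file name
```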
The HELEN dataset is composed of 2330 face images of 400×400 pixels with labeled facial components generated through manually-annotated contours along eyes, eyebrows, nose, lips and jawline.
197 PAPERS • 1 BENCHMARK
AFW (Annotated Faces in the Wild) is a face detection dataset that contains 205 images with 468 faces. Each face image is labeled with at most 6 landmarks with visibility labels, as well as a bounding box.
154 PAPERS • 1 BENCHMARK
The Annotated Facial Landmarks in the Wild (AFLW) is a large-scale collection of annotated face images gathered from Flickr, exhibiting a large variety in appearance (e.g., pose, expression, ethnicity, age, gender) as well as general imaging and environmental conditions. In total about 25K faces are annotated with up to 21 landmarks per image.
151 PAPERS • 11 BENCHMARKS
The Labeled Face Parts in the Wild (LFPW) dataset consists of 1,432 faces from images downloaded from the web using simple text queries on sites such as google.com, flickr.com, and yahoo.com. Each image was labeled by three MTurk workers, and 29 fiducial points are included in the dataset.
127 PAPERS • NO BENCHMARKS YET
AFLW2000-3D is a dataset of 2,000 images annotated with image-level 68-point 3D facial landmarks. It is used for evaluating 3D facial landmark detection models. The head poses are very diverse, and the faces are often hard to detect with a CNN-based face detector.
112 PAPERS • 8 BENCHMARKS
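The AFLW2000-3D annotations are commonly shipped as MATLAB .mat files alongside each image. The sketch below assumes a pt3d_68 field of shape (3, 68) holding the x, y, z coordinates; both the field name and the example file name are assumptions, not guaranteed by this listing:

```python
import numpy as np
from scipy.io import loadmat

def load_aflw2000_landmarks(mat_path):
    """Load 68-point 3D landmarks from an AFLW2000-3D annotation file.

    Assumes the .mat file exposes a 'pt3d_68' array of shape (3, 68)
    with x, y, z coordinates; the key name is an assumption here.
    """
    mat = loadmat(mat_path)
    pts = np.asarray(mat["pt3d_68"], dtype=np.float64)  # (3, 68)
    return pts.T                                         # (68, 3): one row per landmark

# landmarks_3d = load_aflw2000_landmarks("image00002.mat")  # hypothetical file name
```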
The Caltech Occluded Faces in the Wild (COFW) dataset is designed to present faces in real-world conditions. Faces show large variations in shape and occlusion due to differences in pose, expression, use of accessories such as sunglasses and hats, and interactions with objects (e.g. food, hands, microphones, etc.). All images were hand-annotated using the same 29 landmarks as in LFPW. Both the landmark positions and their occluded/unoccluded states were annotated. The faces are occluded to different degrees, with large variation in the type of occlusion encountered. COFW has an average occlusion rate of over 23%.
110 PAPERS • 5 BENCHMARKS
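Because COFW annotates an occluded/unoccluded flag for each of the 29 landmarks, the 23% figure corresponds to the fraction of occluded landmarks across the dataset. A small sketch of that reduction, assuming the flags have been loaded into a (num_faces, 29) binary array (the layout is an assumption):

```python
import numpy as np

def average_occlusion(occlusion_flags):
    """Compute the dataset-level occlusion rate from per-landmark flags.

    occlusion_flags: array of shape (num_faces, 29) with 1 = occluded,
    0 = visible. Returns the overall fraction of occluded landmarks.
    """
    flags = np.asarray(occlusion_flags, dtype=np.float64)
    return flags.mean()

# Toy example: 3 faces, 29 landmarks each, one face with 7 occluded landmarks.
toy = np.zeros((3, 29))
toy[0, :7] = 1
print(f"average occlusion: {average_occlusion(toy):.1%}")
```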
The Wider Facial Landmarks in the Wild (WFLW) database contains 10,000 faces (7,500 for training and 2,500 for testing) with 98 annotated landmarks. This database also features rich attribute annotations in terms of occlusion, head pose, make-up, illumination, blur and expression.
100 PAPERS • 4 BENCHMARKS
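WFLW's landmarks and attribute flags are typically distributed as one whitespace-separated line per face. The parser below assumes the commonly used layout of 196 coordinate values, 4 bounding-box values, 6 binary attribute flags (pose, expression, illumination, make-up, occlusion, blur) and finally the image path; this layout should be verified against the release you download:

```python
import numpy as np

def parse_wflw_line(line):
    """Split one WFLW annotation line into landmarks, box, attributes, path.

    Assumed layout per face: 98*2 landmark coordinates, 4 box values,
    6 attribute flags, and the relative image path at the end.
    """
    parts = line.strip().split()
    landmarks = np.array(parts[:196], dtype=np.float64).reshape(98, 2)
    box = np.array(parts[196:200], dtype=np.float64)       # x_min, y_min, x_max, y_max
    attributes = np.array(parts[200:206], dtype=np.int64)  # six binary flags
    image_path = parts[206]
    return landmarks, box, attributes, image_path
```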
COCO-WholeBody is an extension of COCO dataset with whole-body annotations. There are 4 types of bounding boxes (person box, face box, left-hand box, and right-hand box) and 133 keypoints (17 for body, 6 for feet, 68 for face and 42 for hands) annotations for each person in the image.
22 PAPERS • 6 BENCHMARKS
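Given the documented split of the 133 keypoints (17 body, 6 feet, 68 face, 42 hand), a concatenated (133, 3) array of COCO-style [x, y, visibility] rows can be sliced back into named parts. The body, feet, face, hands concatenation order below is an assumption about how the keypoints were stacked, not something fixed by this description:

```python
import numpy as np

# Documented part sizes: 17 body, 6 feet, 68 face, 42 hand keypoints (133 total).
PART_SIZES = {"body": 17, "feet": 6, "face": 68, "hands": 42}

def split_wholebody_keypoints(kpts):
    """Slice a (133, 3) array of [x, y, visibility] rows into named parts.

    Assumes the rows were concatenated in body -> feet -> face -> hands order.
    """
    kpts = np.asarray(kpts)
    assert kpts.shape == (133, 3), "expected 133 keypoints with (x, y, v) each"
    parts, offset = {}, 0
    for name, size in PART_SIZES.items():
        parts[name] = kpts[offset:offset + size]
        offset += size
    return parts

# face_kpts = split_wholebody_keypoints(np.zeros((133, 3)))["face"]  # shape (68, 3)
```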
300 Videos in the Wild (300-VW) is a dataset for evaluating facial landmark tracking algorithms in the wild. The dataset authors collected a large number of long facial videos recorded in the wild. The dataset includes 114 videos, each with a duration of roughly one minute (at 25-30 fps). All frames have been annotated with the same mark-up (i.e. set of facial landmarks) used in the 300-W competition (68 landmarks in total).
5 PAPERS • 2 BENCHMARKS
CASIA-Face-Africa is a face image database that contains 38,546 images of 1,183 African subjects. Multi-spectral cameras are utilized to capture the face images under various illumination settings. Demographic attributes and facial expressions of the subjects are also carefully recorded. For landmark detection, each face image in the database is manually labeled with 68 facial keypoints. A set of evaluation protocols is constructed according to different applications, tasks, partitions and scenarios. The database, along with its face landmark annotations, evaluation protocols and preliminary results, forms a good benchmark for studying the essential aspects of face biometrics for African subjects, especially face image preprocessing, face feature analysis and matching, facial expression recognition, sex/age estimation, ethnic classification, face image generation, etc.
2 PAPERS • NO BENCHMARKS YET
A high-resolution thermal infrared face database with extensive manual annotations, introduced by Kopaczka et al. (2018). It is useful for training algorithms for image processing tasks as well as facial expression recognition. The full database, all annotations and the complete source code are freely available from the authors for research purposes at https://github.com/marcinkopaczka/thermalfaceproject.
1 PAPER • NO BENCHMARKS YET
Facial landmark detection is a cornerstone of many facial analysis tasks such as face recognition, drowsiness detection, and facial expression recognition. Numerous methodologies have been introduced to achieve accurate and efficient facial landmark localization in visual images. However, only a few works address facial landmark detection in thermal images, the main challenge being the limited number of annotated datasets. In this work, we present a thermal face dataset with annotated face bounding boxes and facial landmarks. The dataset contains 2,556 thermal images of 142 individuals, where each thermal image is paired with the corresponding visual image. To the best of our knowledge, our dataset is the largest in terms of the number of individuals. In addition, our dataset can be employed for tasks such as thermal-to-visual image translation, thermal-visual face recognition, and others. We trained two models for the facial landmark detection task to show the efficacy of our dataset.
0 PAPERS • NO BENCHMARKS YET
Face detection and subsequent localization of facial landmarks are the primary steps in many face applications. Numerous algorithms and benchmark datasets have been introduced to develop robust models for the visible domain. However, varying illumination conditions still pose challenging problems. Thermal cameras can address this problem because they operate at longer wavelengths. Nevertheless, thermal face and facial landmark detection in the wild remains an open research problem, because most existing thermal datasets were collected in controlled environments, and many of them were not annotated with face bounding boxes and facial landmarks. In this work, we present a thermal face dataset with manually labeled bounding boxes and facial landmarks to address these problems. The dataset contains 9,982 images of 147 subjects collected under controlled and uncontrolled conditions. As a baseline, we trained the YOLOv5 object detection model and its adaptation for face detection on the dataset.
Toronto NeuroFace Dataset: A New Dataset for Facial Motion Analysis in Individuals with Neurological Disorders