Our dataset was made of videos from MSU Video Upscalers Benchmark Dataset, MSU Video Super-Resolution Benchmark Dataset and MSU Super-Resolution for Video Compression Benchmark Dataset. Dataset consists of real videos (were filmed with 2 cameras), video games footages, movies, cartoons, dynamic ads.
26 PAPERS • 1 BENCHMARK
TextZoom is a super-resolution dataset that consists of paired Low Resolution – High Resolution scene text images. The images are captured by cameras with different focal length in the wild.
25 PAPERS • 1 BENCHMARK
20 real low-resolution images selected from existing datasets or downloaded from internet
16 PAPERS • NO BENCHMARKS YET
DRealSR establishes a Super Resolution (SR) benchmark with diverse real-world degradation processes, mitigating the limitations of conventional simulated image degradation.
14 PAPERS • NO BENCHMARKS YET
QMUL-SurvFace is a surveillance face recognition benchmark that contains 463,507 face images of 15,573 distinct identities captured in real-world uncooperative surveillance scenes over wide space and time.
10 PAPERS • 1 BENCHMARK
An in-the-wild stereo image dataset, comprising 49,368 image pairs contributed by users of the Holopix mobile social platform.
9 PAPERS • NO BENCHMARKS YET
A dataset consisting of stereo thermal, stereo color, and cross-modality image pairs with high accuracy ground truth (< 2mm) generated from a LiDAR. The authors scanned 100 cluttered indoor and 80 outdoor scenes featuring challenging environments and conditions. CATS contains approximately 1400 images of pedestrians, vehicles, electronics, and other thermally interesting objects in different environmental conditions, including nighttime, daytime, and foggy scenes.
8 PAPERS • 2 BENCHMARKS
The PROBA-V Super-Resolution dataset is the official dataset of ESA's Kelvins competition for "PROBA-V Super Resolution". It contains satellite data from 74 hand-selected regions around the globe at different points in time. The data is composed of radiometrically and geometrically corrected Top-Of-Atmosphere (TOA) reflectances for the RED and NIR spectral bands at 300m and 100m resolution in Plate Carrée projection. The 300m resolution data is delivered as 128x128 grey-scale pixel images, the 100m resolution data as 384x384 grey-scale pixel images. Additionally, a quality map is provided for each pixel, indicating whether the pixels are concealed (i.e. by clouads, ice, water, missing information, etc.).
7 PAPERS • 1 BENCHMARK
The Stanford Light Field Archive is a collection of several light fields for research in computer graphics and vision.
7 PAPERS • NO BENCHMARKS YET
OST300 is an outdoor scene dataset with 300 test images of outdoor scenes, and a training set of 7 categories of images with rich textures.
6 PAPERS • NO BENCHMARKS YET
Alsat-2B is a remote sensing dataset of low and high spatial resolution images (10m and 2.5m respectively) for the single-image super-resolution task. The high-resolution images are obtained through pan-sharpening. The dataset has been created from 13 images captured by the Alsat-2B Earth observation satellite, where the image cover 13 different cities.
2 PAPERS • NO BENCHMARKS YET
A large-scale multi-scene dataset for stereo deblurring, containing 20,637 blurry-sharp stereo image pairs from 135 diverse sequences and their corresponding bidirectional disparities.
StereoMSI comprises of 350 registered colour-spectral image pairs. The dataset has been used for the two tracks of the PIRM2018 challenge.
Digitally Generated Numerals (DIGITal) Description The Digitally Generated Numerals (DIGITal) dataset consists of 100,000 image pairs representing digits from 0 to 9. These image pairs include both low and high-quality versions, with a resolution of 128x128 pixels.
1 PAPER • NO BENCHMARKS YET
The INRIA Dense Light Field Dataset (DLFD) is a dataset for testing depth estimation methods in a light field. DLFD contains 39 scenes with disparity range [-4,4] pixels. The light fields are of spatial resolution 512 x 512 and angular resolution 9 x 9.
Advanced pixel shift technology is employed to perform a full color sampling of the image. Pixel shift technology takes four samples of the same image at nearly the same time, and physically controls the camera sensor to move one pixel horizontally or vertically at each sampling to capture all color information at each pixel. The pixel shift technology ensures that the sampled images follow the distribution of natural images sampled by the camera, and the full information of the color (R, Gr, Gb, B channel) is completely obtained without any need of interpolation. In this way, the collected RGB images are artifacts-free, which leads to better training results for demosaicing related tasks.
The availability of well-curated datasets has driven the success of Machine Learning (ML) models. Despite greater access to earth observation data in agriculture, there is a scarcity of curated and labelled datasets, which limits the potential of its use in training ML models for remote sensing (RS) in agriculture. To this end, we introduce a first-of-its-kind dataset called SICKLE, which constitutes a time-series of multi-resolution imagery from 3 distinct satellites: Landsat-8, Sentinel-1 and Sentinel-2. Our dataset constitutes multi-spectral, thermal and microwave sensors during January 2018 - March 2021 period. We construct each temporal sequence by considering the cropping practices followed by farmers primarily engaged in paddy cultivation in the Cauvery Delta region of Tamil Nadu, India; and annotate the corresponding imagery with key cropping parameters at multiple resolutions (i.e. 3m, 10m and 30m). Our dataset comprises 2, 370 season-wise samples from 388 unique plots, having
1 PAPER • 5 BENCHMARKS
A collection of photographic and synthetic images intended for analysis of image processing techniques and quality assessment of displays.
Description K-pop Idol Dataset - Female (KID-F) is the first dataset of K-pop idol high quality face images. It consists of about 6,000 high quality face images at 512x512 resolution and identity labels for each image.
0 PAPER • NO BENCHMARKS YET
We propose a new light field image database called “PINet” inheriting the hierarchical structure from WordNet. It consists of 7549 LIs captured by Lytro Illum, which is much larger than the existing databases. The images are manually annotated to 178 categories according to WordNet, such as cat, camel, bottle, fans, etc. The registered depth maps are also provided. Each image is generated by processing the raw LI from the camera by Light Field Toolbox v0.4 for demosaicing and devignetting.