DeepFashion is a dataset of around 800K diverse fashion images with rich annotations (46 categories, 1,000 descriptive attributes, bounding boxes, and landmark information), ranging from well-posed shop images to unconstrained consumer photos.
362 PAPERS • 6 BENCHMARKS
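A minimal sketch for reading DeepFashion's bounding-box annotations. It assumes the `list_bbox.txt` layout of the Category and Attribute Prediction benchmark (a count line, a header line, then one record per image); the path and layout should be checked against the release you actually downloaded.

```python
# Hedged sketch: parse DeepFashion-style bounding-box annotations.
from pathlib import Path

def load_bboxes(anno_file: str) -> dict[str, tuple[int, int, int, int]]:
    lines = Path(anno_file).read_text().splitlines()
    boxes = {}
    for line in lines[2:]:  # skip the count and header lines
        name, x1, y1, x2, y2 = line.split()
        boxes[name] = (int(x1), int(y1), int(x2), int(y2))
    return boxes

boxes = load_bboxes("Anno/list_bbox.txt")  # hypothetical local path
```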
VITON was a dataset for virtual try-on of clothing items, consisting of 16,253 image pairs, each pairing a person with a clothing item.
75 PAPERS • 1 BENCHMARK
Fashion IQ supports and advances research on interactive fashion image retrieval. It is the first fashion dataset to provide human-generated captions that distinguish similar pairs of garment images, together with side information consisting of real-world product descriptions and derived visual attribute labels for those images.
62 PAPERS • 5 BENCHMARKS
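A hedged sketch of reading Fashion IQ's relative captions. The public release stores them as JSON lists of `{"candidate", "target", "captions"}` records (e.g. `cap.dress.train.json`); treat the exact file names and keys as assumptions about your copy of the data.

```python
# Sketch: read Fashion IQ caption triplets (candidate image, target
# image, human-written captions describing how they differ).
import json

with open("captions/cap.dress.train.json") as f:  # hypothetical path
    records = json.load(f)

for rec in records[:3]:
    print(rec["candidate"], "->", rec["target"], "|", " / ".join(rec["captions"]))
```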
VITON-HD is a dataset for high-resolution (i.e., 1024×768) virtual try-on of clothing items. Specifically, it consists of 13,679 frontal-view woman and top-clothing image pairs.
42 PAPERS • 1 BENCHMARK
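A minimal PyTorch `Dataset` over VITON-HD-style person/cloth pairs. It assumes the commonly distributed layout (`train/image`, `train/cloth`, and a `train_pairs.txt` whose lines read `<person>.jpg <cloth>.jpg`); verify these names against your download before relying on them.

```python
# Sketch: iterate VITON-HD person/cloth pairs, assuming the usual
# release layout described in the lead-in above.
import os
from PIL import Image
from torch.utils.data import Dataset

class VitonHDPairs(Dataset):
    def __init__(self, root: str, pairs_file: str = "train_pairs.txt",
                 split: str = "train"):
        self.root = os.path.join(root, split)
        with open(os.path.join(root, pairs_file)) as f:
            self.pairs = [line.split() for line in f if line.strip()]

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        person_name, cloth_name = self.pairs[idx]
        person = Image.open(os.path.join(self.root, "image", person_name))
        cloth = Image.open(os.path.join(self.root, "cloth", cloth_name))
        return person, cloth  # both 1024x768 RGB in the HD release
```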
Enables detailed reconstruction of a clothed human body model from a single monocular RGB video, without requiring a pre-scanned template or manually clicked points.
33 PAPERS • NO BENCHMARKS YET
A novel benchmark and dataset for the evaluation of image-based garment reconstruction systems. Deep Fashion3D contains 2,078 models reconstructed from real garments, covering 10 categories and 563 garment instances. It provides rich annotations including 3D feature lines, 3D body pose, and the corresponding multi-view real images. In addition, each garment is randomly posed to enhance the variety of real clothing deformations.
24 PAPERS • NO BENCHMARKS YET
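An illustrative way to inspect one reconstructed garment with `trimesh`. The file name below is hypothetical; Deep Fashion3D ships its geometry per garment instance, so point the loader at whatever mesh or point-cloud files your copy of the release contains.

```python
# Sketch: load a single garment model and query basic geometry.
import trimesh

garment = trimesh.load("deep_fashion3d/garment_0001.obj")  # hypothetical path
print(garment.vertices.shape)        # (V, 3) vertex positions
print(garment.bounding_box.extents)  # rough garment dimensions
```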
Dress Code is a dataset for image-based virtual try-on composed of image pairs drawn from different catalogs of YOOX NET-A-PORTER. The dataset contains more than 50K high-resolution model/clothing image pairs divided into three categories (i.e., dresses, upper-body clothes, and lower-body clothes).
12 PAPERS • NO BENCHMARKS YET
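A sketch of indexing Dress Code across its three category folders. The directory names follow the paper's three splits; the per-category pairs file name (`train_pairs.txt`) is an assumption about the release layout rather than a documented path.

```python
# Sketch: build one combined (category, model, garment) index over
# the three Dress Code category folders.
from pathlib import Path

CATEGORIES = ("dresses", "upper_body", "lower_body")

def index_pairs(root: str, pairs_file: str = "train_pairs.txt"):
    index = []
    for cat in CATEGORIES:
        for line in (Path(root) / cat / pairs_file).read_text().splitlines():
            model_img, garment_img = line.split()
            index.append((cat, model_img, garment_img))
    return index
```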
Contains 60 female and 30 male actors performing a collection of 20 predefined everyday actions and sports movements, and one self-chosen movement.
10 PAPERS • 1 BENCHMARK
Consists of 37,723 person images and 14,360 clothing images at a resolution of 256×192, with each person appearing in multiple poses. These are split into 52,236 training and 10,544 test three-tuples, respectively. The dataset can be downloaded at MPV (Google Drive).
4 PAPERS • 1 BENCHMARK
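A minimal sketch for the MPV three-tuples, assuming a whitespace-separated list file whose lines name a person image in one pose, the same person in another pose, and a clothing image; the file name below is hypothetical, so match it to your download.

```python
# Sketch: parse MPV-style (person pose A, person pose B, clothes)
# triplets from a plain-text list file.
def load_triplets(list_file: str):
    with open(list_file) as f:
        return [tuple(line.split()[:3]) for line in f if line.strip()]

triplets = load_triplets("mpv_train_triplets.txt")  # hypothetical name
print(len(triplets), "training tuples; images are 256x192")
```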
The Spatial TRAnsformation for virtual Try-on (STRAT) dataset contains three subdatasets: STRAT-glasses, STRAT-hat, and STRAT-tie, corresponding to "glasses try-on", "hat try-on", and "tie try-on" respectively. In each subdataset, the training set has 2,000 pairs of foregrounds (accessories) and backgrounds (human faces or portrait images), while the test set has 1,000 pairs. For each pair, both the vertex coordinates and the warping parameters of the foreground are provided for supervised learning and evaluation of spatial transformation.
1 PAPER • NO BENCHMARKS YET
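To illustrate what this kind of supervision enables, here is a sketch that warps an accessory onto a portrait from four target vertex coordinates using OpenCV. This is not the dataset's own tooling; the file names and corner values are made up for the example.

```python
# Sketch: homography warp of a foreground accessory onto a background
# portrait, driven by four annotated target vertices.
import cv2
import numpy as np

fg = cv2.imread("glasses.png")  # hypothetical foreground
bg = cv2.imread("face.jpg")     # hypothetical background
h, w = fg.shape[:2]

src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
dst = np.float32([[120, 150], [260, 155], [255, 200], [118, 195]])  # target vertices

H = cv2.getPerspectiveTransform(src, dst)
warped = cv2.warpPerspective(fg, H, (bg.shape[1], bg.shape[0]))

mask = warped.sum(axis=2, keepdims=True) > 0
composite = np.where(mask, warped, bg)  # naive paste of the warped accessory
cv2.imwrite("tryon.png", composite)
```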
StreetTryOn, a new in-the-wild virtual try-on dataset, consists of 12,364 street person images for training and 2,089 for validation. It is derived from the large fashion retrieval dataset DeepFashion2 by filtering out the more than 90% of DeepFashion2 images that are infeasible for try-on tasks (e.g., non-frontal views, large occlusions, dark environments). Combined with the garment and person images in VITON-HD, it yields a comprehensive suite of in-domain and cross-domain try-on tasks whose garment and person inputs come from various sources: Shop2Model, Model2Model, Shop2Street, and Street2Street.
1 PAPER • 4 BENCHMARKS
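A sketch of the kind of feasibility filter StreetTryOn describes, applied to DeepFashion2's per-image annotation JSON (whose garment items carry `viewpoint` and `occlusion` codes). The thresholds below are illustrative assumptions, not the paper's actual filtering rules.

```python
# Sketch: keep only DeepFashion2 images with at least one frontal,
# lightly occluded garment (viewpoint == 2, occlusion == 1 in the
# DeepFashion2 annotation scheme).
import json
from pathlib import Path

def is_tryon_feasible(anno_path: Path) -> bool:
    anno = json.loads(anno_path.read_text())
    items = [v for k, v in anno.items() if k.startswith("item")]
    return any(it.get("viewpoint") == 2 and it.get("occlusion") == 1
               for it in items)

kept = [p for p in Path("deepfashion2/train/annos").glob("*.json")
        if is_tryon_feasible(p)]
```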