Fine-Grained Visual Recognition
35 papers with code • 0 benchmarks • 5 datasets
Most implemented papers
Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset
In this work we explore the task of instance segmentation with attribute localization, which unifies instance segmentation (detect and segment each object instance) and fine-grained visual attribute categorization (recognize one or multiple attributes).
Bilinear CNNs for Fine-grained Visual Recognition
We then present a systematic analysis of these networks and show that the bilinear features (1) are highly redundant and can be reduced by an order of magnitude in size without significant loss in accuracy, (2) are also effective for other image classification tasks such as texture and scene recognition, and (3) can be trained from scratch on the ImageNet dataset, offering consistent improvements over the baseline architecture.
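To make the notion of a bilinear feature concrete, here is a minimal sketch in plain Python. It assumes the symmetric case (the same feature vector is used for both streams), and the function name `bilinear_pool` and the toy inputs are hypothetical. The signed square root and L2 normalization steps are a commonly used post-processing choice for such descriptors, not something stated in the excerpt above.

```python
import math

def bilinear_pool(features):
    """Pool per-location feature vectors into one bilinear descriptor.

    features: list of N vectors of length C, one per spatial location.
    Returns a flattened vector of length C*C.
    """
    n = len(features)
    c = len(features[0])
    # Average the outer product x x^T over all spatial locations.
    pooled = [[0.0] * c for _ in range(c)]
    for x in features:
        for i in range(c):
            for j in range(c):
                pooled[i][j] += x[i] * x[j] / n
    flat = [v for row in pooled for v in row]
    # Signed square root (assumed post-processing step).
    flat = [math.copysign(math.sqrt(abs(v)), v) for v in flat]
    # L2 normalization (assumed post-processing step).
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

# Toy input: two locations, two channels.
descriptor = bilinear_pool([[1.0, 0.0], [0.0, 1.0]])
print(len(descriptor))
```

The descriptor grows quadratically with the channel count C, which gives some intuition for why reducing the size of such features is of interest.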
Retrieving Similar E-Commerce Images Using Deep Learning
In this paper, we propose a deep convolutional neural network for learning the embeddings of images in order to capture the notion of visual similarity.
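Once images are mapped to embeddings, visual similarity reduces to a nearest-neighbour lookup. The sketch below illustrates that step only. The embeddings are made-up three-dimensional vectors, the names `cosine_similarity` and `most_similar` are hypothetical, and cosine similarity is an assumed choice of metric; the excerpt above does not say which one the paper uses.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def most_similar(query, catalog, k=2):
    """Return the names of the k catalog items closest to the query embedding."""
    ranked = sorted(
        catalog.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]

# Hypothetical embeddings; a real system would obtain these from the network.
catalog = {
    "red_dress": [0.9, 0.1, 0.0],
    "red_skirt": [0.8, 0.3, 0.1],
    "blue_jeans": [0.0, 0.2, 0.9],
}
print(most_similar([1.0, 0.2, 0.0], catalog))
```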
Deep CNNs Meet Global Covariance Pooling: Better Representation and Generalization
The proposed methods are highly modular and can be readily plugged into existing deep CNNs.
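As a rough illustration of what a covariance pooling layer computes, the following sketch replaces the usual global average with the covariance of the per-location features. It is a simplified assumption-laden example: the function name `covariance_pool` is hypothetical, and any normalization the paper applies to the covariance matrix is omitted here.

```python
def covariance_pool(features):
    """Compute the C x C covariance of per-location feature vectors.

    features: list of N vectors of length C, one per spatial location.
    Uses the biased estimate (division by N).
    """
    n = len(features)
    c = len(features[0])
    # Per-channel mean over all locations.
    mean = [sum(x[i] for x in features) / n for i in range(c)]
    cov = [[0.0] * c for _ in range(c)]
    for x in features:
        centered = [x[i] - mean[i] for i in range(c)]
        for i in range(c):
            for j in range(c):
                cov[i][j] += centered[i] * centered[j] / n
    return cov

# Toy input: two locations, two channels.
print(covariance_pool([[1.0, 2.0], [3.0, 6.0]]))
```

Because the layer only consumes the final convolutional feature map, it can stand in for a pooling layer without changes to the backbone, which is one way to read the modularity claim.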
Metric Learning with Adaptive Density Discrimination
Beyond classification, we further validate the saliency of the learnt representations via their attribute concentration and hierarchy recovery properties, achieving 10-25% relative gains on the softmax classifier and 25-50% on triplet loss in these tasks.
Hierarchical Bilinear Pooling for Fine-Grained Visual Recognition
Fine-grained visual recognition is challenging because it relies heavily on modeling various semantic parts and on fine-grained feature learning.
MILDNet: A Lightweight Single Scaled Deep Ranking Architecture
Inspired by the fact that successive CNN layers represent the image with increasing levels of abstraction, we compressed our deep ranking model to a single CNN by coupling activations from multiple intermediate layers along with the last layer.
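The idea of coupling intermediate activations with the last layer can be sketched as follows. This is an assumed simplification: each layer's feature map is reduced with global average pooling and the results are concatenated into one embedding. The function names and toy feature maps are hypothetical, and the actual architecture may combine the layers differently.

```python
def global_average_pool(feature_map):
    """Reduce a feature map (list of channels, each a 2D grid) to one value per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

def couple_layers(layer_maps):
    """Concatenate pooled activations from several layers into a single embedding."""
    embedding = []
    for feature_map in layer_maps:
        embedding.extend(global_average_pool(feature_map))
    return embedding

# Toy activations: an early layer (1 channel, 2x2) and the last layer (2 channels, 1x1).
early = [[[1.0, 3.0], [5.0, 7.0]]]
last = [[[2.0]], [[6.0]]]
print(couple_layers([early, last]))
```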
X-Linear Attention Networks for Image Captioning
Recent progress on fine-grained visual recognition and visual question answering has featured Bilinear Pooling, which effectively models the 2nd-order interactions across multi-modal inputs.
Feathers dataset for Fine-Grained Visual Categorization
This paper introduces a novel dataset, FeatherV1, containing 28,272 images of feathers categorized by 595 bird species.
CLIP-Art: Contrastive Pre-training for Fine-Grained Art Classification
Existing computer vision research on artwork struggles with recognizing fine-grained attributes of artworks and with the lack of curated annotated datasets, which are costly to create.