Image Retrieval

666 papers with code • 54 benchmarks • 75 datasets

Image Retrieval is a fundamental and long-standing computer vision task that involves finding images similar to a provided query from a large database. It's often considered as a form of fine-grained, instance-level classification. Not just integral to image recognition alongside classification and detection, it also holds substantial business value by helping users discover images aligning with their interests or requirements, guided by visual similarity or other parameters.

( Image credit: DELF )

Libraries

Use these libraries to find Image Retrieval models and implementations
2 papers
9,377
2 papers
8,724
See all 10 libraries.

Most implemented papers

Emerging Properties in Self-Supervised Vision Transformers

facebookresearch/dino ICCV 2021

In this paper, we question if self-supervised learning provides new properties to Vision Transformer (ViT) that stand out compared to convolutional networks (convnets).

VGGFace2: A dataset for recognising faces across pose and age

deepinsight/insightface 23 Oct 2017

The dataset was collected with three goals in mind: (i) to have both a large number of identities and also a large number of images for each identity; (ii) to cover a large range of pose, age and ethnicity; and (iii) to minimize the label noise.

NetVLAD: CNN architecture for weakly supervised place recognition

Relja/netvlad CVPR 2016

We tackle the problem of large scale visual place recognition, where the task is to quickly and accurately recognize the location of a given query photograph.

Fine-tuning CNN Image Retrieval with No Human Annotation

filipradenovic/cnnimageretrieval-pytorch 3 Nov 2017

We show that both hard-positive and hard-negative examples, selected by exploiting the geometry and the camera positions available from the 3D models, enhance the performance of particular-object retrieval.

Large-Scale Image Retrieval with Attentive Deep Local Features

tensorflow/models ICCV 2017

We propose an attentive local feature descriptor suitable for large-scale image retrieval, referred to as DELF (DEep Local Feature).

Circle Loss: A Unified Perspective of Pair Similarity Optimization

layumi/Person_reID_baseline_pytorch CVPR 2020

This paper provides a pair similarity optimization viewpoint on deep feature learning, aiming to maximize the within-class similarity $s_p$ and minimize the between-class similarity $s_n$.

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

salesforce/lavis 30 Jan 2023

The cost of vision-and-language pre-training has become increasingly prohibitive due to end-to-end training of large-scale models.

ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks

facebookresearch/vilbert-multi-task NeurIPS 2019

We present ViLBERT (short for Vision-and-Language BERT), a model for learning task-agnostic joint representations of image content and natural language.

DINOv2: Learning Robust Visual Features without Supervision

facebookresearch/dinov2 14 Apr 2023

The recent breakthroughs in natural language processing for model pretraining on large quantities of data have opened the way for similar foundation models in computer vision.

VSE++: Improving Visual-Semantic Embeddings with Hard Negatives

fartashf/vsepp 18 Jul 2017

We present a new technique for learning visual-semantic embeddings for cross-modal retrieval.