Image Retrieval
Image retrieval is a fundamental, long-standing computer vision task: given a query image, find similar images in a large database. It is often regarded as a form of fine-grained, instance-level classification. By ranking database images on visual similarity and other criteria, a retrieval system surfaces relevant images efficiently, which makes it central to applications such as search and recommendation.
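As a rough illustration of ranking by visual similarity, the sketch below assumes each image has already been reduced to a fixed-length descriptor vector (for example by a neural network, which is not shown) and orders database images by cosine similarity to the query. The function names, the toy three-dimensional descriptors, and the image identifiers are all hypothetical; production systems typically use learned embeddings with hundreds of dimensions and an approximate nearest-neighbor index instead of this exhaustive scan.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def retrieve(query, database, top_k=3):
    """Return the top_k (image_id, score) pairs, most similar first.

    `database` maps a hypothetical image id to its descriptor vector.
    """
    scored = [
        (image_id, cosine_similarity(query, descriptor))
        for image_id, descriptor in database.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy descriptors (assumed, for illustration only).
database = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.9, 0.1, 0.0],
    "img_c": [0.0, 1.0, 0.0],
    "img_d": [0.0, 0.0, 1.0],
}
query = [1.0, 0.05, 0.0]

results = retrieve(query, database, top_k=2)
for image_id, score in results:
    print(f"{image_id}: {score:.4f}")
```

Here the two database images whose descriptors point in nearly the same direction as the query (`img_a`, then `img_b`) are ranked first. Descriptor quality and the choice of similarity measure drive retrieval accuracy far more than the ranking loop itself.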
ROxford (Hard)
SuperGlobal
ROxford (Medium)
RParis (Hard)
Hypergraph propagation
RParis (Medium)
CREPE (Compositional REPresentation Evaluation)
ViT-L-14 (LAION400M)
Fashion IQ
Flickr30K 1K test
X-VLM (base)
CIRR
SPN4CIR
SOP
Unicom+ViT-L@336px
Flickr30k-CN
Oxf5k
iNaturalist
Smooth-AP
COCO-CN
Flickr30k
BLIP-2 ViT-L (zero-shot, 1K test set)
MUGE Retrieval
Oxf105k
CARS196
CGD (MG/SG)
CUB-200-2011
CGD (MG/SG)
In-Shop
CGD (SG/GS)
Par106k
Par6k
Offline Diffusion
MS COCO
Oscar
AmsterTime
AP-GeM (ResNet-101)
ConQA Conceptual
CLIP
ConQA Descriptive
PhotoChat
DeepFashion - Consumer-to-shop
CTL Model (ResNet50-IBN-A, 320x320)
DeepPatent
SwinV2
Google Landmarks Dataset v2 (retrieval, testing)
AMES
24/7 Tokyo
HED-N-GAN
Exact Street2Shop
RST Model (ResNet50-IBN-A, 320x320)
Google Landmarks Dataset v2 (retrieval, validation)
UNICOM-ViT-L-14-512px
LaSCo
CASE
MSCOCO
HADA
AIC-ICC
ERNIE-ViL2.0
CBVS
UniCLP
INRIA Holidays
MultiGrain R50 @ 800
Oxford5k
GNN-Reranking
Paris6k
IME layer
street2shop - topwear
Ranknet
WIT
WIT-ALL
CIFAR-10
Custom: 3 conv + 2 fcn
COFAR
KRAMT
DeepFashion
RCCapsNet
FETA Car-Manuals
FooDI-ML (Global)
ADAPT-I2T
FooDI-ML (Spain)
ICFG-PEDES
SSAN
ImageCoDe
ContextualCLIP
INSTRE
Localized Narratives
OPT
NUS-WIDE
DTQ
PKU-Reid
IHDA
PKU SketchRe-ID Dataset
IHDA
ROxford Medium without fine-tuning
HesAff–rSIFT–VLAD
RUC-CAS-WenLan
CMCL