Object Discovery
74 papers with code • 0 benchmarks • 2 datasets
Object Discovery is the task of identifying previously unseen objects.
Source: Unsupervised Object Discovery and Segmentation of RGBD-images
Benchmarks
These leaderboards are used to track progress in Object Discovery
Most implemented papers
Object-Centric Learning with Slot Attention
Learning object-centric representations of complex scenes is a promising step towards enabling efficient abstract reasoning from low-level perceptual features.
MONet: Unsupervised Scene Decomposition and Representation
The ability to decompose scenes in terms of abstract building blocks is crucial for general intelligence.
GuessWhat?! Visual object discovery through multi-modal dialogue
Our key contribution is the collection of a large-scale dataset consisting of 150K human-played games with a total of 800K visual question-answer pairs on 66K images.
Learn To Pay Attention
We propose an end-to-end-trainable attention module for convolutional neural network (CNN) architectures built for image classification.
Learning Open-World Object Proposals without Learning to Classify
In this paper, we identify that the problem is that the binary classifiers in existing proposal methods tend to overfit to the training categories.
Vision Transformers Need Registers
Transformers have recently emerged as a powerful tool for learning visual representations.
Efficient Dialog Policy Learning via Positive Memory Retention
This paper is concerned with the training of recurrent neural networks as goal-oriented dialog agents using reinforcement learning.
GENESIS: Generative Scene Inference and Sampling with Object-Centric Latent Representations
Generative latent-variable models are emerging as promising tools in robotics and reinforcement learning.
Localizing Objects with Self-Supervised Transformers and no Labels
We also show that training a class-agnostic detector on the discovered objects boosts results by another 7 points.
Unsupervised Image Decomposition with Phase-Correlation Networks
The ability to decompose scenes into their object components is a desired property for autonomous agents, allowing them to reason and act in their surroundings.