Real-Time Object Detection
107 papers with code • 7 benchmarks • 8 datasets
Real-Time Object Detection is a computer vision task that involves identifying and locating objects of interest in real-time video sequences with fast inference while maintaining a base level of accuracy.
The task is typically addressed with algorithms that combine object detection and tracking, so that objects can be detected and followed across frames in real time. These algorithms rely on feature extraction, object proposal generation, and classification to detect and localize objects of interest.
Most implemented papers
YOLOv3: An Incremental Improvement
At 320x320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster.
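To relate the quoted per-image latency to a frame rate, the conversion can be sketched as below. The helper name `latency_ms_to_fps` is hypothetical, and the calculation assumes one image is processed at a time with no batching or pipelining.

```python
def latency_ms_to_fps(latency_ms):
    """Convert per-image latency in milliseconds to frames per second.

    Assumes images are processed one after another (no batching).
    """
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# 22 ms per image is the figure quoted for YOLOv3 at 320x320.
yolov3_fps = latency_ms_to_fps(22)
print(f"{yolov3_fps:.1f} frames per second")
```

Under that assumption, 22 ms per image corresponds to roughly 45 frames per second.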
YOLO9000: Better, Faster, Stronger
On the 156 classes not in COCO, YOLO9000 gets 16.0 mAP.
Focal Loss for Dense Object Detection
Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.
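As a rough illustration of how the loss down-weights easy examples, here is a minimal sketch of the binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The function name and the per-example scalar form are assumptions made for illustration; gamma = 2 and alpha = 0.25 are the defaults reported in the paper.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for a single binary prediction (illustrative sketch).

    p: predicted probability of the positive class, in (0, 1)
    y: ground-truth label, 1 for positive and 0 for negative
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)          # avoid log(0)
    p_t = p if y == 1 else 1.0 - p           # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy negative (confidently rejected) versus a hard negative.
easy_negative = binary_focal_loss(0.01, 0)
hard_negative = binary_focal_loss(0.9, 0)
print(easy_negative, hard_negative)
```

With gamma = 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, so gamma controls how strongly easy examples are suppressed.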
YOLOv4: Optimal Speed and Accuracy of Object Detection
There are a huge number of features which are said to improve Convolutional Neural Network (CNN) accuracy.
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals.
Mask R-CNN
Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance.
You Only Look Once: Unified, Real-Time Object Detection
A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation.
CSPNet: A New Backbone that can Enhance Learning Capability of CNN
Neural networks have enabled state-of-the-art approaches to achieve incredible results on computer vision tasks such as object detection.
Objects as Points
We model an object as a single point: the center point of its bounding box.
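The center-point representation can be sketched as a pair of conversions between a corner-format box and a center plus size. The function names and the (x1, y1, x2, y2) box layout are assumptions for illustration; in the paper, the network predicts the center as a heatmap peak and regresses the box size, which this sketch does not cover.

```python
def box_to_center(box):
    """Convert an (x1, y1, x2, y2) box to its center point and size."""
    x1, y1, x2, y2 = box
    center = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
    size = (x2 - x1, y2 - y1)
    return center, size


def center_to_box(center, size):
    """Recover the (x1, y1, x2, y2) box from a center point and size."""
    cx, cy = center
    w, h = size
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)


center, size = box_to_center((10, 20, 50, 100))
restored = center_to_box(center, size)
print(center, size, restored)
```

The round trip shows that a center point together with a width and height carries the same information as the original box.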
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
This paper presents a new vision Transformer, called Swin Transformer, that capably serves as a general-purpose backbone for computer vision.