Oleg Zabluda's blog
Tuesday, September 27, 2016
 
You Only Look Once: Unified, Real-Time Object Detection (2016) Joseph Redmon, [...] Ross Girshick et al.
"""
Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. [...] can be optimized end-to-end directly on detection performance. [...] extremely fast. [...] 45 frames per second. A smaller version of the network, Fast YOLO, processes an astounding 155 frames per second while still achieving double the mAP of other real-time detectors. Compared to state-of-the-art detection systems, YOLO makes more localization errors but is far less likely to predict false detections where nothing exists. Finally, YOLO learns very general representations of objects. It outperforms all other detection methods, including DPM and R-CNN, by a wide margin when generalizing from natural images to artwork on both the Picasso Dataset and the People-Art Dataset.
[...]
Table 1: Real-Time Systems on PASCAL VOC 2007. Comparing the performance and speed of fast detectors.

R-CNN minus R replaces Selective Search with static bounding box proposals [20]. While it is much faster than R-CNN, it still falls short of real-time and takes a significant accuracy hit from not having good proposals. Fast R-CNN speeds up the classification stage of R-CNN but it still relies on selective search, which can take around 2 seconds per image to generate bounding box proposals. Thus it has high mAP but at 0.5 fps it is still far from real-time. The recent Faster R-CNN replaces selective search with a neural network to propose bounding boxes, similar to Szegedy et al. [8]. In our tests, their most accurate model achieves 7 fps while a smaller, less accurate one runs at 18 fps. The VGG-16 version of Faster R-CNN is 10 mAP higher but is also 6 times slower than YOLO. The Zeiler-Fergus Faster R-CNN is only 2.5 times slower than YOLO but is also less accurate.
"""
https://arxiv.org/abs/1506.02640
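
A quick sanity check on the speed figures quoted above, as a rough sketch rather than anything from the paper itself. It assumes YOLO's 45 fps from the abstract as the baseline, and it assumes the 7 fps model is the VGG-16 Faster R-CNN and the 18 fps model is the Zeiler-Fergus one (the excerpt does not pair them explicitly, but the quoted "6 times" and "2.5 times" ratios suggest it). The names `slowdown_vs_yolo` and `ms_per_frame` are made up for illustration.

```python
# Frame rates as quoted in the excerpt (frames per second).
YOLO_FPS = 45.0
DETECTOR_FPS = {
    "Fast R-CNN": 0.5,
    "Faster R-CNN (VGG-16)": 7.0,   # assumed pairing: "most accurate model"
    "Faster R-CNN (ZF)": 18.0,      # assumed pairing: "smaller, less accurate one"
}


def ms_per_frame(fps):
    """Per-image latency in milliseconds for a given frame rate."""
    return 1000.0 / fps


def slowdown_vs_yolo(fps, yolo_fps=YOLO_FPS):
    """How many times slower a detector is than YOLO at yolo_fps."""
    return yolo_fps / fps


for name, fps in DETECTOR_FPS.items():
    print(f"{name}: {ms_per_frame(fps):.0f} ms/frame, "
          f"{slowdown_vs_yolo(fps):.1f}x slower than YOLO")
```

Under these assumptions, 45/18 gives exactly the quoted 2.5x, 45/7 is about 6.4x (rounded to "6 times" in the text), and 0.5 fps corresponds to the quoted 2 seconds per image.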
