A New Baseline for Pedestrian Detection

Posted by: 吕梅

Paper

Liliang Zhang, Liang Lin*, Xiaodan Liang, Kaiming He, “Is Faster R-CNN Doing Well for Pedestrian Detection?”, ECCV 2016. PDF Code

 

Abstract

Detecting pedestrians has arguably been addressed as a special topic beyond general object detection. Although recent deep learning object detectors such as Fast/Faster R-CNN [1,2] have shown excellent performance for general object detection, they have had limited success in detecting pedestrians, and previous leading pedestrian detectors were in general hybrid methods combining hand-crafted and deep convolutional features. In this paper, we investigate issues involving Faster R-CNN [2] for pedestrian detection. We discover that the Region Proposal Network (RPN) in Faster R-CNN indeed performs well as a stand-alone pedestrian detector, but surprisingly, the downstream classifier degrades the results. We argue that two reasons account for the unsatisfactory accuracy: (i) insufficient resolution of feature maps for handling small instances, and (ii) lack of any bootstrapping strategy for mining hard negative examples. Driven by these observations, we propose a very simple but effective baseline for pedestrian detection, using an RPN followed by boosted forests on shared, high-resolution convolutional feature maps. We comprehensively evaluate this method on several benchmarks (Caltech, INRIA, ETH, and KITTI), presenting competitive accuracy and good speed. Code will be made publicly available.
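
To make point (ii) concrete, the following is a minimal sketch of bootstrapping for hard negative mining. It is not the paper's implementation: the boosted forest is replaced by a toy one-dimensional threshold classifier, each sample is a single made-up scalar feature, and the names `train_threshold`, `bootstrap`, and `false_positives` are hypothetical. Only the loop structure is the point: train, score the remaining negatives, add the highest-scoring ones to the training set, retrain.

```python
def train_threshold(pos, neg):
    """Toy stand-in for the boosted forest (assumption, not the paper's model).

    Places the decision threshold midway between the lowest positive and
    the highest negative seen during training.
    """
    threshold = (min(pos) + max(neg)) / 2.0
    return lambda x: x - threshold  # score > 0 means "pedestrian"


def bootstrap(pos, neg_pool, init_size, add_per_round, rounds):
    """Retrain several times, each time adding the hardest remaining negatives."""
    train_neg = list(neg_pool[:init_size])
    remaining = list(neg_pool[init_size:])
    score = train_threshold(pos, train_neg)
    for _ in range(rounds):
        if not remaining:
            break
        # Hard negatives are the background samples the current model scores highest.
        remaining.sort(key=score, reverse=True)
        train_neg += remaining[:add_per_round]
        remaining = remaining[add_per_round:]
        score = train_threshold(pos, train_neg)
    return score


def false_positives(score, negatives):
    """Count negatives that the classifier wrongly accepts."""
    return sum(1 for x in negatives if score(x) > 0)


# Made-up data: 20 easy negatives (0.0) and 10 hard negatives (4.0).
positives = [5.0] * 10
negatives = [0.0] * 20 + [4.0] * 10

before = train_threshold(positives, negatives[:20])
after = bootstrap(positives, negatives, init_size=20, add_per_round=5, rounds=2)
print(false_positives(before, negatives), false_positives(after, negatives))  # prints "10 0"
```

In this toy setting the first model never sees a hard negative and accepts all ten of them; after mining, the threshold moves past them while all positives are still accepted.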


 

Experiments


 

Fig.1: Comparisons on the Caltech set (legends indicate MR).
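
The page does not define MR. Assuming it denotes the log-average miss rate in the Caltech benchmark convention (the geometric mean of the miss rate sampled at nine FPPI values evenly spaced in log space over [10^-2, 10^0]), a minimal sketch of the computation could look like this. The function name, the argument layout, and the choice of a miss rate of 1.0 when the curve has no point at or below a reference FPPI are assumptions made for illustration.

```python
import math


def log_average_miss_rate(fppi, miss_rate, lo=-2.0, hi=0.0, num_points=9):
    """Geometric mean of miss rates sampled at log-spaced FPPI reference points.

    fppi      -- false positives per image, sorted in ascending order
    miss_rate -- miss rate at each fppi value (same length as fppi)
    lo, hi    -- exponents of the FPPI range [10**lo, 10**hi]
    """
    step = (hi - lo) / (num_points - 1)
    refs = [10 ** (lo + i * step) for i in range(num_points)]
    samples = []
    for ref in refs:
        # Use the curve point with the largest FPPI not exceeding the reference.
        reached = [m for f, m in zip(fppi, miss_rate) if f <= ref]
        # Assumption: if the curve starts above this reference, count a full miss.
        samples.append(reached[-1] if reached else 1.0)
    # Average in the log domain; clamp to avoid log(0).
    return math.exp(sum(math.log(max(m, 1e-10)) for m in samples) / num_points)


# Made-up curve with three operating points.
mr = log_average_miss_rate([0.001, 0.05, 0.5], [0.8, 0.4, 0.2])
print(f"MR = {100 * mr:.1f}%")
```

Under the same assumption, passing `lo=-4.0` would give the wider-range variant averaged over [10^-4, 10^0].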


 

Fig.2: Comparisons on the Caltech set using IoU > 0.7 to determine True Positives (legends indicate MR).
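
For readers unfamiliar with the criterion in Fig.2, the sketch below shows how an intersection-over-union (IoU) test decides whether a detection matches a ground-truth box. Boxes are assumed to be `(x1, y1, x2, y2)` tuples in pixels, and the helper names `iou` and `is_true_positive` are hypothetical; the actual evaluation toolkit also handles details such as ignore regions and one-to-one matching, which are omitted here.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero if the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(detection, ground_truth, threshold=0.7):
    """A detection counts as a true positive if its IoU exceeds the threshold."""
    return iou(detection, ground_truth) > threshold


# Made-up boxes: a 10x20 pedestrian and a detection shifted 3 px to the right.
gt = (0, 0, 10, 20)
det = (3, 0, 13, 20)
print(round(iou(det, gt), 3))            # 0.538
print(is_true_positive(det, gt, 0.5))    # True
print(is_true_positive(det, gt, 0.7))    # False
```

The shifted detection passes a 0.5 threshold but fails the stricter 0.7 threshold, which is why the stricter setting rewards better-localized boxes.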


 

Fig.3: Comparisons on the Caltech-New set (legends indicate MR_{-2} (MR_{-4})).


 

Fig.4: Comparisons on the INRIA dataset (legends indicate MR).


Fig.5: Comparisons on the ETH dataset (legends indicate MR).


Table 1: Comparisons on the KITTI dataset collected at the time of submission (Feb 2016). The timing records are collected from the KITTI leaderboard. †: region proposal running time ignored (estimated 2s).

 

References

  1.  Ross Girshick. Fast R-CNN. In IEEE International Conference on Computer Vision (ICCV), 2015. 
  2.  Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Neural Information Processing Systems (NIPS), 2015.