Discriminative And-Or Graph Learning

Posted by: 吕梅

Papers

  • Liang Lin, Xiaolong Wang, Wei Yang, and Jian-Huang Lai, Discriminatively Trained And-Or Graph Models for Object Shape Detection, IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), DOI: 10.1109/TPAMI.2014.2359888, 2014. [PDF]
  • Liang Lin, Xiaolong Wang, Wei Yang, and Jian-Huang Lai, Learning Contour-Fragment-based Shape Model with And-Or Tree Representation, Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012. [Paper] [Dataset] [Code]


Introduction

29_1

 

This paper investigates a novel reconfigurable part-based model, the And-Or graph model, for recognizing object shapes in images, as Fig. 1 shows.

The proposed model consists of four layers:

  • Leaf-nodes: local classifiers for detecting contour fragments;
  • Or-nodes: switches that each activate one of their child leaf-nodes, making the model reconfigurable during inference;
  • And-nodes: capture holistic shape deformations;
  • Root-node: also an Or-node, which activates one of its child And-nodes to deal with large global variations (e.g. different poses and views).
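The four layers above can be illustrated with a minimal Python sketch of bottom-up inference. It rests on simplifying assumptions: leaf classifier responses are given as plain numbers, the holistic deformation term is reduced to a single hypothetical cost per And-node, and spatial layout is ignored. The class and field names (`OrNode`, `AndNode`, `leaf_scores`, `deformation_cost`, `infer`) are illustrative and are not taken from the released code.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class OrNode:
    """Switch node: activates exactly one of its child leaf-nodes."""
    name: str
    leaf_scores: List[float]  # assumed: one classifier response per child leaf

    def best_leaf(self) -> Tuple[int, float]:
        # Pick the leaf with the highest response (first one wins on ties).
        idx = max(range(len(self.leaf_scores)), key=self.leaf_scores.__getitem__)
        return idx, self.leaf_scores[idx]


@dataclass
class AndNode:
    """Composition node: combines all of its Or-node children."""
    name: str
    or_nodes: List[OrNode]
    deformation_cost: float = 0.0  # assumed scalar stand-in for the shape term

    def score(self) -> Tuple[float, Dict[str, int]]:
        total = -self.deformation_cost
        choice: Dict[str, int] = {}
        for node in self.or_nodes:
            idx, s = node.best_leaf()
            choice[node.name] = idx
            total += s
        return total, choice


def infer(and_nodes: List[AndNode]) -> Tuple[str, float, Dict[str, int]]:
    """Root Or-node: activate the single best-scoring And-node."""
    best = max(and_nodes, key=lambda a: a.score()[0])
    total, choice = best.score()
    return best.name, total, choice
```

Calling `infer` on a list of And-nodes (for example, one per view) returns the name of the activated And-node, its total score, and the leaf index chosen under each Or-node, which together form one configuration of the model.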

We discriminatively train the And-Or model from weakly annotated data using a non-convex optimization algorithm that we propose. The algorithm iteratively determines the latent model structures (e.g. the nodes and their layouts) together with the model parameters.
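The idea of alternating between latent choices and parameter updates can be sketched generically. The code below is not the algorithm proposed in the paper: it is a simplified alternation in the style of latent SVM training [5], and it assumes the set of candidate configurations per example is fixed, whereas the proposed algorithm also updates the structure itself. All names (`train_alternating`, `best_config`, the hyperparameters) are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def best_config(configs: Sequence[Vector], w: Vector) -> Vector:
    """Return the candidate feature vector with the highest score under w."""
    return max(configs, key=lambda x: dot(w, x))


def train_alternating(
    positives: List[List[Vector]],  # per example: features of each candidate configuration
    negatives: List[List[Vector]],
    outer_iters: int = 5,
    inner_iters: int = 50,
    lr: float = 0.1,
    reg: float = 0.01,
) -> List[float]:
    dim = len(positives[0][0])
    w = [0.0] * dim
    for _ in range(outer_iters):
        # Step 1: with w fixed, commit to the best latent configuration
        # of every positive example.
        fixed_pos = [best_config(configs, w) for configs in positives]
        # Step 2: with those choices fixed, take subgradient steps on a
        # regularized hinge loss; negatives use their current best configuration.
        for _ in range(inner_iters):
            grad = [reg * wi for wi in w]
            for x in fixed_pos:
                if dot(w, x) < 1.0:
                    grad = [g - xi for g, xi in zip(grad, x)]
            for configs in negatives:
                x = best_config(configs, w)
                if dot(w, x) > -1.0:
                    grad = [g + xi for g, xi in zip(grad, x)]
            w = [wi - lr * g for wi, g in zip(w, grad)]
    return w
```

Fixing the latent choices for the positives makes each inner problem convex, which is why this kind of alternation is a common way to approach a non-convex objective with latent variables.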

Experiments

We validate our model on a new shape database, SYSU-Shapes, as well as on two public databases, UIUC-People [1] and INRIA-Horse [2], and show superior performance over state-of-the-art methods.

Experiment I: SYSU-Shapes database

29_2

29_3

 

29_4

Experiment II: UIUC-People dataset

29_6

29_7

29_8

Experiment III: INRIA-Horse dataset

29_9

Experiment IV (Download PPT)

29_10

It would be good to generate visualizations that help explain what the various leaf-nodes learn for each part. One way to do this is simply to visualize image patches across the training data for any given (Or-node, leaf-node) combination.
For example, suppose you are training a horse detector. Let's say you have an Or-node associated with the head of the horse. The “head” node has various leaves to account for changes in the appearance of the head. For each of the leaves, keep track of the training images on which that leaf fires.
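The bookkeeping described above could be sketched as follows. The record format is an assumption: each detection is taken to be a tuple of image identifier, Or-node name, index of the leaf that fired, and the bounding box of the patch. The function name `group_patches_by_leaf` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

BBox = Tuple[int, int, int, int]             # assumed (x, y, width, height)
Detection = Tuple[str, str, int, BBox]       # assumed (image_id, or_node, leaf, bbox)


def group_patches_by_leaf(
    detections: Iterable[Detection],
) -> Dict[Tuple[str, int], List[Tuple[str, BBox]]]:
    """Collect, for every (Or-node, leaf) pair, the image patches on which that leaf fired."""
    groups: Dict[Tuple[str, int], List[Tuple[str, BBox]]] = defaultdict(list)
    for image_id, or_node, leaf, bbox in detections:
        groups[(or_node, leaf)].append((image_id, bbox))
    return dict(groups)
```

Each resulting group can then be cropped from its source images and shown as a montage, one montage per leaf, so that the appearance variation captured by each leaf of the “head” node becomes visible.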

Download the code and database

We built a new shape database called SYSU-Shapes, which includes elaborately annotated shape contours. There are 5 categories (airplanes, boats, cars, motorbikes, and bicycles), and each category contains 200 to 500 images. The shape contours are carefully labeled using the LabelMe toolkit. Each image contains at least one object of the given category, and possibly more.

[Dataset] [Code]

References

  1. D. Tran and D. Forsyth, Improved human parsing with a full relational model, In Proc. of European Conference on Computer Vision (ECCV), 2010.
  2. F. Jurie and C. Schmid, Scale-invariant Shape Features for Recognition of Object Categories, In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2004.
  3. Y. Wang, D. Tran, and Z. Liao, Learning Hierarchical Poselets for Human Parsing, In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011.
  4. M. Andriluka, S. Roth, and B. Schiele, Pictorial structures revisited: People detection and articulated pose estimation, In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
  5. P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan, Object Detection with Discriminatively Trained Part-based Models, IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9): 1627-1645, 2010.
  6. L. Bourdev, S. Maji, T. Brox, and J. Malik, Detecting people using mutually consistent poselet activations, In Proc. of European Conference on Computer Vision (ECCV), 2010.