Visual Categorization with Bags of Keypoints
Gabriella Csurka, Christopher R. Dance, Lixin Fan, Jutta Willamowski, Cédric Bray
Xerox Research Centre Europe
6, chemin de Maupertuis
38240 Meylan, France
{gcsurka,cdance}@xrce.xerox.com

Abstract. We present a novel method for generic visual categorization: the problem of identifying the object content of natural images while generalizing across variations inherent to the object class. This bag of keypoints method is based on vector quantization of affine invariant descriptors of image patches.
We propose and compare two alternative implementations using different classifiers: Naïve Bayes and SVM. The main advantages of the method are that it is simple, computationally efficient and intrinsically invariant. We present results for simultaneously classifying seven semantic visual categories. These results clearly demonstrate that the method is robust to background clutter and produces good categorization accuracy even without exploiting geometric information.
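As an illustration of the vector quantization step described in the abstract, the following minimal sketch clusters a handful of toy descriptors into a small vocabulary and then counts, for one image, how many of its descriptors fall into each cluster. The use of plain k-means, the 2-D toy descriptors, and the names kmeans, nearest and bag_of_keypoints are assumptions made for this example only; the sketch is not the authors' implementation and it omits the affine invariant descriptor extraction entirely.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centers, d):
    """Index of the cluster centre closest to descriptor d."""
    return min(range(len(centers)), key=lambda i: sq_dist(centers[i], d))

def kmeans(descriptors, k, iters=20, seed=0):
    """Plain k-means (assumed here as the quantizer); returns k centres."""
    rng = random.Random(seed)
    centers = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(centers, d)].append(d)
        for i, g in enumerate(groups):
            if g:  # keep the old centre if a cluster goes empty
                centers[i] = [sum(col) / len(g) for col in zip(*g)]
    return centers

def bag_of_keypoints(descriptors, centers):
    """Histogram of how many descriptors are assigned to each centre."""
    hist = [0] * len(centers)
    for d in descriptors:
        hist[nearest(centers, d)] += 1
    return hist

# Toy data (hypothetical): 2-D stand-ins for real patch descriptors.
train = [(0.0, 0.0), (1.0, 1.0), (10.0, 10.0), (11.0, 11.0)]
vocab = kmeans(train, k=2)

# Descriptors from one hypothetical image, mapped to a fixed-length histogram.
image = [(0.5, 0.2), (10.2, 10.9), (11.0, 10.0)]
hist = bag_of_keypoints(image, vocab)
print(hist)
```

The resulting histogram has one bin per vocabulary entry regardless of how many descriptors the image produced, which is what lets a standard classifier such as Naïve Bayes or an SVM consume it as a feature vector.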

1. Introduction
The proliferation of digital imaging sensors in mobile phones and consumer-level cameras is producing a growing number of large digital image collections. To manage such collections it is useful to have access to high-level information about objects contained in the image. Given an appropriate categorization of image contents, one may efficiently search, recommend, react to or reason with new image instances.
We are thus confronted with the problem of generic visual categorization. We should like to identify processes that are sufficiently generic to cope with many object types simultaneously and which are readily extended to new object types. At the same time, these processes should handle the variations in view, imaging, lighting and occlusion, typical of the real world, as well as the intra-class variations typical of semantic classes of everyday objects.
The task-dependent and evolving nature of visual categories



