Extended abstract
Recognition and classification of objects and patterns independent of their
position, size, orientation, and other variations in geometry and color has been
the goal of much recent research. Finding efficient invariant object descriptors
is the key to solving this problem. Several groups of features have been used
for this purpose, such as simple visual features (edges, contours, textures,
etc.), Fourier and Hadamard coefficients, differential invariants, and moment
invariants, among others.
This tutorial is devoted to the
history, recent advances and prospective future development of moment
invariants. They were first introduced to the pattern recognition community in
1962 by M.K. Hu, who employed the theory of algebraic invariants to derive his
famous seven invariants to rotation of 2-D objects. Since that time, moment
invariants have become a classical tool for feature-based object recognition.
Many research papers have been devoted to various improvements and
generalizations of Hu's invariants, to their utilization in many application
areas, and to developing other systems of moment invariants.
This tutorial draws on the speakers' fifteen years of experience with moments,
moment invariants, and related fields. It covers mainly the following topics.
Rotation moment invariants from higher-order moments
Hu's original invariants utilize only second- and third-order moments.
Constructing invariants from higher-order moments is not straightforward; the
Fourier-Mellin transform, Zernike polynomials, and algebraic invariants have all
been used for this purpose. We present a general framework for deriving moment
invariants of any order, based on complex moments. In comparison with earlier
approaches, ours is more transparent and makes it easy to study the
completeness and mutual dependence/independence of the invariants. We prove the
existence of a relatively small set (a basis) of invariants by means of which
all other invariants can be expressed. As an interesting consequence, we show
that most of the previously published sets of moment invariants, including
Hu's, are dependent. This surprising result offers a new view of Hu's
invariants and possibly a new interpretation of some earlier experimental work.
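The key property behind the complex-moment approach can be illustrated with a
minimal sketch (the toy pixel data below are hypothetical, not from the
tutorial): under a rotation by angle alpha, the central complex moment c_pq is
merely multiplied by the phase factor exp(i(p-q)alpha), so its magnitude
|c_pq| is a rotation invariant. The code verifies this for an exact 90-degree
rotation of a small pixel set.

```python
def central_complex_moment(points, p, q):
    # points: samples (x, y, intensity) of a grey-level image;
    # c_pq = sum of w * ((x-xc) + i(y-yc))^p * ((x-xc) - i(y-yc))^q
    m = sum(w for _, _, w in points)
    xc = sum(x * w for x, y, w in points) / m
    yc = sum(y * w for x, y, w in points) / m
    c = 0j
    for x, y, w in points:
        z = complex(x - xc, y - yc)
        c += w * z ** p * z.conjugate() ** q
    return c

# toy "image": an L-shaped set of unit-intensity pixels (hypothetical data)
img = [(0, 0, 1.0), (1, 0, 1.0), (2, 0, 1.0), (0, 1, 1.0), (0, 2, 1.0)]
# the same object rotated by 90 degrees: (x, y) -> (-y, x)
rot = [(-y, x, w) for x, y, w in img]

# rotation only changes the phase of c_pq, so |c_pq| is invariant
for p, q in [(2, 0), (1, 1), (3, 0), (2, 1)]:
    a = abs(central_complex_moment(img, p, q))
    b = abs(central_complex_moment(rot, p, q))
    assert abs(a - b) < 1e-9
```

Products such as c_pq * c_qp (equal to |c_pq|^2 for real images) are the raw
material from which complete and independent invariant bases are built.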
Affine moment invariants
In practice we often face object/image deformations that go beyond the
rotation-translation-scaling model. The exact model of photographing a planar
scene with a pin-hole camera whose optical axis is not perpendicular to the
scene is a non-linear projective transform. For small objects and a large
camera-to-scene distance, the perspective effect is negligible and the
projective transform can be well approximated by an affine transform. Thus,
powerful affine moment invariants for object description and recognition are
in great demand.
Pioneering work in this field was done independently by T.H. Reiss in 1991 and
by Flusser and Suk in 1991, who corrected some mistakes in Hu's theory,
introduced affine moment invariants (AMIs), and proved their applicability in
simple recognition tasks. Their derivation of the AMIs originated from the
classical theory of algebraic invariants (D. Hilbert, 1897). This was a
significant step beyond the limitations of a rigid-body transform. Since the
affine transform is not shape-preserving, the problem is much more difficult
than in the case of the similarity transform. Later on, Suk and Flusser
proposed AMIs generated by means of graph theory. This approach makes it
possible to derive an arbitrary number of invariants and to eliminate
reducible and dependent ones.
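A minimal sketch of the simplest AMI, I1 = (mu20*mu02 - mu11^2) / mu00^4
(the pixel samples and the affine map are hypothetical illustration data).
For point samples, the Jacobian factor that the continuous transform would
contribute is emulated by scaling the intensities by |det A|.

```python
def ami1(points):
    # first affine moment invariant I1 = (mu20*mu02 - mu11^2) / mu00^4,
    # computed from central geometric moments of weighted samples
    mu00 = sum(w for _, _, w in points)
    xc = sum(x * w for x, y, w in points) / mu00
    yc = sum(y * w for x, y, w in points) / mu00
    mu20 = sum(w * (x - xc) ** 2 for x, y, w in points)
    mu02 = sum(w * (y - yc) ** 2 for x, y, w in points)
    mu11 = sum(w * (x - xc) * (y - yc) for x, y, w in points)
    return (mu20 * mu02 - mu11 ** 2) / mu00 ** 4

# toy pixel samples (x, y, intensity) -- hypothetical example data
img = [(0, 0, 1.0), (1, 0, 2.0), (2, 1, 1.0), (0, 2, 1.5)]

# affine map x' = a*x + b*y + 5, y' = c*x + d*y - 3
a, b, c, d = 2.0, 1.0, 0.5, 1.5
J = abs(a * d - b * c)  # |det A|, the Jacobian of the transform
# intensities carry the Jacobian factor so the sums approximate the
# moments of the transformed continuous image
warped = [(a * x + b * y + 5, c * x + d * y - 3, w * J) for x, y, w in img]

assert abs(ami1(img) - ami1(warped)) < 1e-9
```

The numerator is the determinant of the second-order moment matrix, which
transforms with a power of det A that is exactly cancelled by mu00^4.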
Invariants to convolution
An important class of image degradations we often face in practice, different
from those affecting the spatial coordinates, is image blurring. Blurring can
be caused by a camera out of focus, atmospheric turbulence, vibrations, sensor
and/or scene motion, or by interpolation-based enlargement of the image. If
the scene is flat and the imaging system is linear and space-invariant, the
blurring can be described by a convolution g(x,y) = (f * h)(x,y), where
f is the original (ideal) image, g is the acquired image, and h is the
point-spread function (PSF) of the imaging system. Since in most practical
tasks the PSF is unknown, invariants to convolution are of prime importance
when recognizing objects in a blurred scene. The alternative approach, which
would not require convolution invariants, is blind image deconvolution, an
ill-posed and practically unsolvable problem.
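The convolution model can be illustrated numerically. The sketch below (with
hypothetical data, and showing only the simplest instance rather than the full
invariant system) uses the fact that for a centrosymmetric, normalized PSF the
third-order central moments pass through the blur unchanged, while even-order
moments do not.

```python
from collections import defaultdict

def convolve(f, h):
    # discrete 2-D convolution of sparse images stored as {(x, y): value}
    g = defaultdict(float)
    for (x1, y1), w1 in f.items():
        for (x2, y2), w2 in h.items():
            g[(x1 + x2, y1 + y2)] += w1 * w2
    return g

def central_moment(img, p, q):
    m00 = sum(img.values())
    xc = sum(x * w for (x, y), w in img.items()) / m00
    yc = sum(y * w for (x, y), w in img.items()) / m00
    return sum(w * (x - xc) ** p * (y - yc) ** q for (x, y), w in img.items())

# toy image (hypothetical data) and a centrosymmetric PSF with sum 1
f = {(0, 0): 1.0, (1, 0): 2.0, (0, 1): 1.0, (2, 2): 1.0}
h = {(0, 0): 0.2, (1, 0): 0.2, (-1, 0): 0.2, (0, 1): 0.2, (0, -1): 0.2}

g = convolve(f, h)

# third-order central moments are unchanged by the blur ...
for p, q in [(3, 0), (2, 1), (1, 2), (0, 3)]:
    assert abs(central_moment(g, p, q) - central_moment(f, p, q)) < 1e-9
# ... while e.g. mu_20 is not: it grows by mu00(f) * mu20(h)
assert central_moment(g, 2, 0) > central_moment(f, 2, 0)
```

All odd-order central moments of a centrosymmetric h vanish, which is what
makes recursive construction of higher-order blur invariants possible.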
A class of moment-based features
invariant to convolution with an unknown point-spread function was first
introduced by Flusser and Suk in 1995 and further developed by Flusser and
Zitova. These invariants have found successful applications in face
recognition on out-of-focus photographs, in normalizing blurred images into
canonical forms, in template-to-scene matching of satellite images, in
recognition of blurred digits and characters, in registration of images
obtained by digital subtraction angiography, and in quantitative measurement
of focus/defocus. Some of these applications will be presented in the
tutorial.
Combined invariants
In this part of
the tutorial we show how to obtain simultaneous invariance to image blurring,
rotation, scaling, affine transform, and contrast changes. Having these
so-called combined invariants is often an urgent requirement in practice,
where composite image degradations tend to occur.
Orthogonal moments
Orthogonal moments
are projections of the image onto a set of orthogonal polynomials. They have
been studied mainly because of their favorable reconstruction properties. We
present Legendre, Zernike, and Fourier-Mellin moments and show their
relationship to basic geometric moments.
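Because each Legendre polynomial is itself a polynomial, Legendre moments are
linear combinations of geometric moments. A minimal sketch (the sample data
are hypothetical; the normalization (2p+1)(2q+1)/4 is the continuous one over
[-1, 1] x [-1, 1]):

```python
def legendre(n, x):
    # Legendre polynomial P_n(x) via the Bonnet recurrence
    p_prev, p_curr = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - k * p_prev) / (k + 1)
    return p_curr

# hypothetical image samples (x, y, intensity) on [-1, 1] x [-1, 1]
img = [(-0.5, -0.5, 1.0), (0.0, 0.25, 2.0), (0.5, -0.25, 1.0), (0.75, 0.5, 0.5)]

def legendre_moment(p, q):
    # lambda_pq = (2p+1)(2q+1)/4 * sum of P_p(x) * P_q(y) * f(x, y)
    norm = (2 * p + 1) * (2 * q + 1) / 4.0
    return norm * sum(w * legendre(p, x) * legendre(q, y) for x, y, w in img)

def geometric_moment(p, q):
    return sum(w * x ** p * y ** q for x, y, w in img)

# P_2(x) = (3x^2 - 1)/2, so lambda_20 is a combination of m_20 and m_00
lam20 = legendre_moment(2, 0)
expected = (5.0 / 8.0) * (3.0 * geometric_moment(2, 0) - geometric_moment(0, 0))
assert abs(lam20 - expected) < 1e-12
```

Orthogonality is what makes image reconstruction from these moments direct:
each coefficient is independent of the others.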
Algorithms for moment computation
Since the computational complexity of all moment invariants depends almost solely on the computational complexity of the moments themselves, we review efficient algorithms for moment calculation in a discrete space. These algorithms fall basically into two groups: decomposition methods and boundary-based methods. The former decompose the object into simple regions (squares, rectangles, rows, etc.) whose moments can be calculated in O(1) time; the object moment is then the sum of the moments of all regions. The latter calculate object moments from the boundary alone, employing Green's theorem or a similar technique.
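A sketch of the decomposition idea, using horizontal runs as the decomposition
unit (the object below is hypothetical, and the closed-form sums are written
out only for moment orders p <= 2): each run's contribution to a moment is a
power sum evaluated in O(1), instead of a loop over its pixels.

```python
def moments_naive(pixels):
    # brute-force geometric moments of a binary object (set of (x, y) pixels)
    return {(p, q): sum(x ** p * y ** q for x, y in pixels)
            for p in range(3) for q in range(3)}

def power_sum(p, a, b):
    # closed-form sum of x^p for x = a..b (p <= 2), evaluated in O(1)
    if p == 0:
        return b - a + 1
    if p == 1:
        return (a + b) * (b - a + 1) // 2
    return b * (b + 1) * (2 * b + 1) // 6 - (a - 1) * a * (2 * a - 1) // 6

def moments_runs(runs):
    # decomposition method: the object is a union of horizontal runs
    # (y, x_start, x_end); each run contributes y^q * sum_{x=a..b} x^p
    return {(p, q): sum(y ** q * power_sum(p, a, b) for y, a, b in runs)
            for p in range(3) for q in range(3)}

# hypothetical object: two runs forming a small staircase shape
runs = [(0, 0, 4), (1, 2, 5)]
pixels = {(x, y) for y, a, b in runs for x in range(a, b + 1)}

assert moments_naive(pixels) == moments_runs(runs)
```

The cost of the run-based version is proportional to the number of runs, not
the object area; boundary-based methods reduce this further to the boundary
length.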
Applications
Numerous applications of moment
invariants will be presented during the tutorial to illustrate the theoretical
results. We show the performance of moment invariants in satellite image
registration, face recognition, character recognition, and motion estimation,
among others. Practical limitations such as robustness and discriminative
power will also be discussed.