Advanced Concepts for Intelligent Vision Systems Aug. 31-Sept. 3, 2004 Brussels, Belgium http://acivs.org/acivs2004/

# Acivs 2004 Abstracts


## Regular papers

### Paper 111: Invariant Recognition of Digital Figures using an Average and Median Distance Method

A simple method to recognize digital figures using an "average and median distance method" is described. This method uses a "feature vector sequence" which allows shift-, scaling- and rotation-invariant recognition. The feature vector sequence is computed from all distances between the points of the normalized figure and the vertices of a regular m-sided polygon inscribed in a unit circle, so it can be calculated in O(mn) operations for an n-point figure. To compute the feature vector sequence, only the pixel coordinates that make up the figure are required; a priori information about the order of the pixels composing the digital figure is not needed. Therefore, our method is applicable to non-singly connected figures. Experimental results show the usefulness of the proposed method.
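As an illustration of the construction described above, the sketch below (not the authors' code; the function name and the choice m = 8 are assumptions) computes, for each vertex of a regular m-gon inscribed in the unit circle, the average and median distance to the normalized figure points, in O(mn) operations:

```python
import numpy as np

def feature_vectors(points, m=8):
    """Illustrative sketch: per-vertex average and median distances
    between a normalized figure and the vertices of a regular m-gon
    inscribed in the unit circle."""
    pts = np.asarray(points, dtype=float)
    # Normalize: center on the centroid, scale into the unit circle
    pts = pts - pts.mean(axis=0)
    pts = pts / np.linalg.norm(pts, axis=1).max()
    # Vertices of the regular m-sided polygon on the unit circle
    angles = 2 * np.pi * np.arange(m) / m
    verts = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    # (m, n) matrix of vertex-to-point distances
    d = np.linalg.norm(verts[:, None, :] - pts[None, :, :], axis=2)
    return d.mean(axis=1), np.median(d, axis=1)
```

Because only unordered pixel coordinates enter the computation, the sketch shares the abstract's property of not requiring any pixel ordering.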

### Paper 114: Rice Image Analysis by using Color, Texture and Shape

An industrial system to measure the performance properties of rice plants grown in a greenhouse setting has been devised. This paper discusses the algorithms developed to identify panicles in the images based on color, texture and shape analysis. The analysis results of several images of one plant are combined into a classification for the plant as having panicles or not. Results show that over 96% of the plants are classified correctly. The system has been deployed successfully and currently processes over 15,000 images a day. In the future we want to extract more information about the panicles, such as their number and size.

### Paper 117: Detection of Earthquake-Damaged Areas from Low- and High-Altitude Aerial Images and a Digital Map

In this study, we propose a new method for the automatic detection of areas damaged by an earthquake. Aerial images are useful for detecting the damaged areas rapidly over a wide region. Our method uses the features of aerial images taken at two different altitudes for accurate detection: one image is taken from a high altitude before the earthquake, the other from a low altitude after the earthquake. First, we register these two aerial images with a digital map. Next, we compute the differences between the registered pixels of the two images and extract the damaged areas as those with large color differences. Our method enables accurate automatic detection of areas damaged by an earthquake.

### Paper 118: Multiresolutional Rigid-Body Registration of Space Curves

In this paper, we present a multiresolutional non-iterative rigid-body image registration technique, based on the geometrical invariance properties of space curves. A fundamental theorem in differential geometry states that a space curve is uniquely defined by its curvature k and torsion t. Therefore, transforming the space curves to (k,t)-space allows immediate correspondence, independent of rotation and translation. As a similarity measure, the mean squared difference (MSD) is used. After minimizing the MSD along the arc length of the curve, principal component analysis is applied to calculate the translation and rotation parameters. Reparametrization of the space curves at different arc lengths results in a robust multiresolutional registration technique. This fully automatic technique inherently allows region of interest (ROI) coregistration and is suitable for performing both global and local transformations.

### Paper 121: Efficient Frequency Design of Deformable Models

Active deformable models are very popular tools in computer vision and computer graphics for solving ill-posed problems and mimicking real physical systems. The original formulation is posed in the spatial domain, and the engine of the procedure remains a second-order linear system whose characterization rests on the basic rigidity and elasticity parameters. This paper proposes a novel formulation of deformable models based on a frequency-domain analysis: the energy functional is first translated to the frequency domain, then the Lagrange minimization is applied directly in the new domain, resulting in a second-order differential equation with the Fourier transform of the snake as the main element. This new formulation leads to simpler procedures for solving the inverse problem and also yields a novel implementation. The frequency-based implementation offers substantial computational savings over the original one, a feature further emphasized by efficient hardware and software implementations of the FFT algorithm. The paper finally opens a short discussion on the possibility and convenience of designing deformable models in this new domain, apart from the original elasticity- and rigidity-based schema.

### Paper 122: Wavelet based hyperspectral data analysis for vegetation stress classification

Remote sensing acquires information about the earth's surface by measuring radiance spectra. Traditionally, multispectral remote sensors acquired only a few wavelength bands, and the study of vegetation was limited to vegetation indices, defined as specific ratios of bands. In recent years, hyperspectral sensors have become available, allowing the spectrum to be sampled at a wavelength resolution of a few nanometers. In this paper, we investigate the use of the complete hyperspectra for vegetation monitoring. In particular, we study the problem of vegetation stress detection. We investigate the use of the wavelet transform for representing hyperspectra, and we show that, combined with proper feature selection and a suitable classifier, this representation is advantageous over the direct use of spectral bands. The experimental data sets contain leaf spectra of fruit trees from a test plot covering five different stress types.

### Paper 123: A Real-Time Facial Feature based Head Tracker

This paper presents a fast and efficient head tracking approach. Skin detection, gray-scale morphology and a geometrical face model are applied to roughly detect the face region and extract facial features automatically. A novel Kalman filtering framework is used for tracking and for estimating the 3-D pose of the moving head. An application to cursor control on a computer display is presented. Experiments with real image sequences show that the system is able to extract and track facial features reliably. Pose estimation accuracy was tested with synthetic data, and good preliminary results were obtained. The real-time performance achieved indicates that the proposed system can be applied on platforms where computational resources are limited.

### Paper 125: Change detector combination based on fuzzy integrals in remotely sensed imagery

Combining multiple estimators is one of the important research topics in classification and pattern recognition, the aim being to achieve the best possible performance for the task at hand. In this paper, we investigate the applicability of combining several change detectors for remotely sensed data. Two change detection methods, based respectively on simultaneous and comparative analysis of multitemporal data, are developed using a fuzzy neural network architecture. Next, these change detectors are combined using different forms of the fuzzy integral. The combination rules are evaluated in comparison with the individual change detectors. Experimental results using SPOT images of the same area show that the performance of change detection can be significantly improved by the proposed combination framework.

### Paper 130: Application of Template-Based Metaprogramming Compilation Techniques to the Efficient Implementation of Image Processing Algorithms on SIMD-Capable Processors

Microprocessors with SIMD enhanced instruction sets have been proposed as a solution for delivering higher hardware utilization for the most demanding media-processing applications. But direct exploitation of these SIMD extensions places a significant burden on the programmer since it generally involves a complex, low-level and awkward programming style.

In this paper, we propose a high-level C++ class library dedicated to the efficient exploitation of these SIMD extensions. We show that the weaknesses traditionally associated with library-based approaches can be avoided by resorting to sophisticated template-based metaprogramming techniques at the implementation level. Benchmarks for a prototype implementation of the library, targeting the PowerPC G4 processor and its AltiVec extension, are presented. They show that performance close to that of hand-crafted code can be reached from a formulation offering a significant increase in expressivity.

### Paper 131: A discrete choice pedestrian behavior model for pedestrian detection in visual tracking systems

Different approaches to moving-object detection in multi-object tracking systems use dynamics-based models. In this paper we propose the use of a discrete choice model (DCM) of pedestrian behavior and its application to the problem of target detection in the particular case of pedestrian tracking. We analyze real scenarios, assuming a calibrated monocular camera that provides a unique correspondence between the image plane and the top-view reconstruction of the scene. In our approach we first initialize a large number of hypothetical moving points on the top-view plane and track their corresponding projections on the image plane by means of a simple correlation method. The resulting displacement vectors are then re-projected onto the top view and pre-filtered using distance and angular thresholds. The pre-filtered trajectories are the inputs to the discrete choice behavioral filter used to decide whether the pre-filtered targets are real pedestrians or not.

### Paper 132: Trajectories clustering in ICA space: an application to automatic counting of pedestrians in video sequences

In this paper we propose a method that can improve the automatic counting of pedestrians in video sequences for (automatic) video surveillance applications. We analyse the trajectory data set provided by a detection/tracking system. When using classical target detection and tracking systems, it is well known that the number of detected targets is over- or underestimated. A better representation for the trajectories is given in the ICA (Independent Component Analysis) transformed domain, and clustering techniques are applied to the ICA-transformed data in order to provide a better estimate of the actual number of pedestrians present in the scene.

### Paper 133: Using Textural and Geometric Information for an Automatic Bridge Detection System

We present some results on systems for automatically detecting bridges in very high-resolution panchromatic satellite images using texture information and geometric models. The system has been tested on 2.5m per pixel and 1m per pixel aerial images processed to have the characteristics of SPOT 5 and Ikonos output. A system using simple geometric models gives good results for bridges over roads and railroads, and very bad results for bridges over larger regions such as rivers. In contrast, a system using a texture-based classification and hand-made rules applied to that classification gives good results for bridges over rivers and railroads, and bad results for bridges over roads.

### Paper 134: An Edge-based Method for Registering a Graph onto an Image with Application to Cadastre Registration

In the context of the development of a land use analysis system, we need to register a cadastre graph onto georeferenced color and near-infrared images at 50cm resolution. A registration process is necessary because image edges and cadastre edges do not correspond, since farmers need not strictly follow fiscal divisions. The problem of registering a cadastre graph onto an image is formalized as a graph matching problem, which is solved by simulated annealing. Additionally, a score for each cadastre edge is obtained, which shows which edges can be found in the image and which cannot. Minor geometrical deformations and acquisition errors can also be corrected.

### Paper 135: Remote sensing image classification enhancement using evidential reasoning based classifier combination

The work presented here addresses the problem of enhancing remote sensing image classification. To that end, an evidential reasoning based classifier combination method is proposed. The originality of the work lies in the fact that the method treats the outputs of the classifiers while completely ignoring their internal characteristics. The method is thus general and applicable to any type of classifier. It combines the advantages of each classifier without accumulating their disadvantages, and it overcomes the shortcomings of the remote sensing image classification methods developed in the literature. Thus, it constitutes a powerful tool for several remote sensing image processing applications.

### Paper 136: Combining Classifiers in Rock Image Classification - Supervised and Unsupervised Approach

Combining classifiers has proved to be an effective solution to several classification problems in pattern recognition. In this paper we use classifier combination methods for the classification of natural images. In image classification, it is often beneficial to consider each feature type separately and combine the classification results in the final classifier. We present a classifier combination strategy based on the classification result vector (CRV). It can be applied to image classification in both a supervised and an unsupervised way. Natural images are often non-homogeneous, which means that there are clearly visible changes in their visual properties; this makes them difficult to classify. In this paper we apply our classifier combination method to the classification of rock images. These images represent real rock image data that is non-homogeneous in terms of its color and texture properties. The classification results are compared to the results of commonly used classifier combination strategies.

### Paper 137: Parameter Estimation In Pairwise Markov Fields

Hidden Markov fields (HMF), which are widely applied in various problems arising in image processing, have recently been generalized to Pairwise Markov Fields (PMF). Although the hidden process is no longer necessarily a Markov one in PMF models, they still allow one to recover it from observed data. We propose in this paper two original methods of parameter estimation in PMF, based on general Stochastic Gradient (SG) and Iterative Conditional Estimation (ICE) principles, respectively. Some experiments concerning unsupervised image segmentation based on Bayesian Maximum Posterior Mode (MPM) are also presented.

### Paper 138: Nonlinear multiscale decompositions by edge-adaptive subsampling

This paper describes an edge-adaptive multiresolution decomposition, inspired by lifting techniques. Unlike existing adaptive lifting schemes, the proposed scheme does not (necessarily) adapt primal and dual lifting steps to the data, but rather makes the splitting (or subsampling) step data-dependent. The mechanism behind this data-dependent subsampling is so-called *normal offsets*: unlike in a classical lifting scheme, the detail (or wavelet) coefficients do not encode the offset between the observed value and the coarse-scale prediction at the same location. Instead, the method looks for a piercing point on the true surface in a normal direction from the coarse-scale approximation, and is highly nonlinear. This paper investigates the applicability of the approach for a multiscale decomposition of discrete data, such as digital images. The underlying philosophy starts from the idea that images carry information with, roughly speaking, two components: the location of the edges (or main features) on one hand and the observed grey values at each location on the other. The objective is to capture the relative importance of features and grey values in one single multiscale analysis, i.e., the height of the edges and their positions are predicted and encoded together, and the method implicitly weighs their relative importance. It turns out, not surprisingly, that the nonlinear method performs well in encoding coarse-scale edges, but less well on small-scale details (such as texture). It is therefore interesting to incorporate the method into an even more adaptive scheme where details at fine scales are filled in with classical wavelet coefficients, i.e., vertical offsets.

### Paper 142: Information Retrieval From A Position-Varying Point Spread Function

We have been interested in the restoration of images degraded by atmospheric turbulence in which the resulting point spread function varies across the field of view. Such cases occur in wide-area astronomical imaging and in horizontal telescopic imaging close to the ground. Each image frame of a captured movie sequence is exposed for a time short enough to freeze the effects of the turbulence, resulting in a random wobble and blurring of the image that is position and time dependent. Registration of each frame to a reference image has been achieved previously by a region-of-interest cross correlation process and also by a gradient-based optical flow method. The resulting shift information is used to dewarp each frame of the sequence before averaging to provide a motion-blur corrected result. Further deblurring is carried out by blind deconvolution. This paper describes a new approach to registration that uses a region-of-interest Wiener filter to detect the local, space-varying point spread function or PSF. This new approach provides more robust shift information than cross correlation to describe the random wobble in the image sequence and also provides new information on the shape of the position-dependent blur PSF.
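The region-of-interest cross-correlation step mentioned above can be illustrated with a phase-correlation variant. The sketch below is a hedged stand-in (not the authors' implementation): it estimates the integer (dy, dx) shift of a patch relative to a reference from the peak of the normalized cross-power spectrum:

```python
import numpy as np

def roi_shift(ref, frame):
    """Estimate the integer (dy, dx) shift of `frame` relative to `ref`
    by phase correlation: keep only the phase of the cross-power
    spectrum, whose inverse FFT peaks at the displacement."""
    R = np.conj(np.fft.fft2(ref)) * np.fft.fft2(frame)
    R /= np.maximum(np.abs(R), 1e-12)            # phase only
    corr = np.fft.ifft2(R).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Map circular peak positions back to signed shifts
    if dy > ref.shape[0] // 2:
        dy -= ref.shape[0]
    if dx > ref.shape[1] // 2:
        dx -= ref.shape[1]
    return int(dy), int(dx)
```

Applying such an estimator per region of interest yields the local shift field used to dewarp each frame; the Wiener-filter approach of the paper additionally recovers the shape of the local blur PSF, which this sketch does not attempt.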

### Paper 143: Robust Thresholding Based on Wavelets and Thinning Algorithm for Degraded Camera Images

This paper describes a thresholding method for degraded documents acquired with a low-resolution camera. The technique is based on wavelet denoising and global thresholding for non-uniform illumination. Emphasis is placed on stroke analysis to preserve useful information more accurately, without breaking characters and by reducing the number of over-connected ones. An improvement of the technique, using a thinning algorithm, is detailed, and particular attention is given to applying this optimization only to suitable images. Moreover, thanks to the wavelet decomposition, complex backgrounds as well as high-frequency noise can be handled. The method can process various corpora of images without any prior knowledge of the document image or fine-tuning of parameters.

### Paper 144: A Voting Strategy for Level-line Matching

A new level-line registration technique is proposed for image transform estimation. The approach is robust to contrast changes, does not require any estimate of the unknown transformation between images, and tackles challenging situations that usually lead to pairing ambiguities, such as repetitive patterns in images. The registration itself is performed through an efficient level-line cumulative matching based on a multi-stage primitive election procedure. Each stage provides a coarse estimate of the transformation, which the next stage refines. The present paper deals with similarity transforms (composed of rotation, scale and translation), but the proposed approach extends to more complex transformations.

### Paper 146: Short Range Sensors for Intelligent Vehicles

Obstacle detection and classification in complex urban areas is highly demanding, but desirable for the protection of Vulnerable Road Users. This paper presents an in-vehicle stereovision-based sensor for short-range object detection. Basic principles are given for designing the optical parameters of the sensor, such as baseline, angular coverage, spatial resolution and dynamic range. A novel feature-indexed approach is proposed to achieve fast, high-quality stereo matching. The depth map is then generated by reconstructing all image points in world coordinates. Object segmentation based on the depth map makes use of 3-dimensional information about the objects and enables reliable and robust object detection.

### Paper 147: Simultaneous structure and texture compact representation

In this paper, we tackle the problem of nonlinear image approximation. In recent years, many algorithms have been proposed to take advantage of the geometry of the image. We propose here a new nonlinear approximation algorithm that takes the structures of the image into account and remains effective even when the original image has textured areas. To this end, we first split the image into two components, one containing the structures of the image and the other the oscillating patterns. We then perform the nonlinear approximation of each component separately. The final approximated image is the sum of these two approximated components. This new nonlinear approximation algorithm outperforms the standard biorthogonal wavelet approximation.

### Paper 151: Discriminative Classification vs Modeling Methods in CBIR

Statistical learning methods are currently attracting increasing interest in the content-based image retrieval (CBIR) community. In this article we compare two leading techniques for classification tasks. The first method uses one-class and two-class SVMs to discriminate data. The second approach is based on Gaussian mixtures to model classes. To deal with the specificity of the CBIR classification task, adaptations are proposed. Experimental tests on a generalist database have been carried out, and the advantages and drawbacks of each method are discussed.

### Paper 154: A Robust Technique to Establish Correspondences in Case of Important Rotations Around the Optical Axis

This paper presents a technique to find pairs of identical corners detected in two consecutive images of a video sequence. These pairs, called correspondences, are used to compose mosaics in order to obtain an image with a larger field of view at the same resolution. In our conditions, the camera can perform a large rotation around its optical axis between two acquisitions. Usually, correlation or block matching is performed between the neighborhoods of corners in two consecutive images in order to obtain a similarity measure between the corners, and the correspondences are then established from these similarities. If the rotation between the two images is large, correlation or block matching is not reliable, because both depend strongly on the rotation of the neighborhoods. This article proposes a technique that orients the neighborhoods of corners before block matching in order to compensate for the rotation effect. The orientation is estimated by computing the main inertia axis. Thanks to the precision of the orientation estimates, this method provides very encouraging results: a large number of correct correspondences is generally obtained, which enables mosaic construction.
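The inertia-axis orientation step described above can be sketched from second-order intensity moments (a hedged illustration; the paper's exact normalization and patch handling may differ). Each patch is rotated to this canonical angle before block matching:

```python
import numpy as np

def main_axis_angle(patch):
    """Orientation of a patch's main inertia axis, from central
    second-order intensity moments mu20, mu02, mu11."""
    p = np.asarray(patch, dtype=float)
    ys, xs = np.mgrid[:p.shape[0], :p.shape[1]]
    m = p.sum()
    cy, cx = (ys * p).sum() / m, (xs * p).sum() / m
    mu20 = ((xs - cx) ** 2 * p).sum()
    mu02 = ((ys - cy) ** 2 * p).sum()
    mu11 = ((xs - cx) * (ys - cy) * p).sum()
    # Standard principal-axis formula for image moments
    return 0.5 * np.arctan2(2 * mu11, mu20 - mu02)
```

A horizontal stroke yields an angle of 0 and a main diagonal yields pi/4, so rotating each neighborhood by minus this angle brings corresponding patches into a common orientation, as the abstract requires.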

### Paper 155: Recognition of Objects with Incomplete Representations

With inspiration from psychophysical research on the human visual system, we propose a novel method for performance evaluation of contour-based shape recognition methods. We use the complete contour representations of objects as the training set. The incomplete contour representations of the same objects are used as test sets, and the recognition performance of two shape-based methods is investigated. The methods compared in this framework use shape context and distance multiset as local shape descriptors. The performance of the methods is found to be very good if more than 30% and 3%, respectively, of the contour pixels are retained in the incomplete contour representations.

### Paper 156: Detection, Categorization And Recognition of Road Signs for Autonomous Navigation

In this paper we present a novel and robust approach for detection, categorization and recognition of road signs. It is known that the standard road signs contain few and easily distinguishable colors, such as red for prohibition, yellow for warnings, green, blue and white. We use a Bayesian approach for detecting road signs in the captured images based on their color information. At the same time, the results of the Bayes classifier categorize the detected road sign according to its color content. The SIFT transform is employed in order to extract a set of invariant features for the detected road sign label(s). Recognition is done by matching the extracted features with previously stored features of standard signs. We illustrate the accuracy and robustness of this approach.

### Paper 157: Enhanced Data Dependent Triangulation for Bayer Pattern Colour Interpolation

The Bayer pattern Colour Filter Array (CFA) is used in digital cameras because it is a single-channel array and therefore drastically reduces the final cost of the acquisition device. The colour interpolation operator generates the missing RGB components of each pixel coming from the CFA. In this document an improved colour interpolation algorithm based on Data Dependent Triangulation is described. The differences between the described approach and prior techniques can be summarized as an improvement of the triangulation cost function and an optimization of the triangulation construction itself. Both improvements achieve significant enhancements in terms of PSNR, from 0.4 to 1 dB, and in the visual quality of the final interpolated RGB images. Moreover, a simplified antialiasing filter based on Freeman's method is also proposed to remove colour artifacts, obtaining a further improvement of about 5 dB on average.

### Paper 158: Information Fusion using Covariance Intersection in the Frame of the Evidence Theory

This paper presents a multisensor and multitarget tracking architecture that handles object eclipses. The architecture is composed of two sub-systems: a multitarget tracker for each sensor and a multitarget, multisensor fusion center. The multitarget tracker uses Dempster-Shafer theory with proposed mass distributions that are functions of the distance between perceived objects and known objects, the sensor reliability and the perception uncertainty. The multisensor fusion is based on the Covariance Intersection algorithm.

### Paper 159: A fast convergent "chiseling" snake

Snakes, or active contours, are used extensively in computer vision and image processing applications, particularly to locate object boundaries. Problems associated with initialization and poor convergence to boundary concavities, however, have limited their utility. In this paper, we present a novel, fast-converging snake model based on the distance transform. First, we find the points in the non-convergent area, making use of the properties of the external forces in the concavity; second, we modify the direction and magnitude of the external forces at these points so as to speed convergence. Although GVF is the best method for addressing initialization and boundary concavities, it does not address the problem of deep concavities. Our algorithm is better than the GVF algorithm in convergence speed and in the results for deep boundary concavities, and our model can also be extended to GVF-based snakes.

### Paper 160: Spatio-Temporal Signatures For Video Copy Detection

The number of copied videos is growing rapidly on television broadcast networks as well as on the world wide web. Existing copy detection methods resort either to image techniques or to video ones. We propose a spatio-temporal signature for the automatic detection of video extracts, based on the evolution of gray-level centroids over time. We have obtained good results on a base of more than 50 GB of data and in numerous robustness tests. Our algorithm is robust to changes in contrast and brightness, zooms, modification of the frame rate, superimposition of a logo on the image, etc.
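In the spirit of the signature described above (a hedged sketch, not the authors' exact descriptor), one can track the gray-level centroid of each frame over time:

```python
import numpy as np

def centroid_signature(frames):
    """Illustrative spatio-temporal signature: the gray-level centroid
    (x, y) of each frame, collected into a (T, 2) trajectory."""
    sig = []
    for f in frames:
        f = np.asarray(f, dtype=float)
        ys, xs = np.mgrid[:f.shape[0], :f.shape[1]]
        m = f.sum()
        sig.append(((xs * f).sum() / m, (ys * f).sum() / m))
    return np.asarray(sig)
```

Comparing such centroid trajectories between a query clip and reference material, after temporal alignment, is one way a signature of this kind can be matched; the robustness machinery reported in the paper is not reproduced here.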

### Paper 161: Uncertainty of Affine Transformations in Digital Images

We introduce the notion of an uncertainty polytope for affine transformations. This polytope is used to capture the uncertainty of affine transformations that arises when the positions of the image points are not precisely known. We also derive secondary concepts, such as the existence of redundant points and the definition of a similarity measure between distinct objects. Finally, an algorithm is proposed which looks for good correspondences between given sets of source and image points and computes the corresponding transformation polytope. The algorithm and concepts introduced are illustrated with objects extracted from real images.

### Paper 162: Describing the image content for quality assessment of geospatial data

In this paper, we discuss the use of image information for quality assessment of geospatial data. The focus is on characterizing the quality and reliability of the image information to be able to make robust assessments. A methodology is proposed where, based on the image statistics of a road and its immediate surroundings, the performance of road detection can be predicted as well as the optimal parameter set which is needed for the detection of this type of road. The performance is characterized in terms of the detection rate and mean segment length. Experiments have been conducted on a set of images of typical roads in an IKONOS satellite image and verify the validity of the derived expressions.

### Paper 163: Topology Coding via Distance Function Based Reeb Graphs

Motivated by rapid developments in solid modelling, we present an invariant shape descriptor for topological coding of three dimensional objects. This novel approach encodes a three dimensional object into a Reeb graph using a normalized distance function. Unlike the height function which has been traditionally used to model topology, the proposed distance function based Reeb graph is invariant with respect to rotation, translation and scaling with low computational complexity. Simulation results demonstrate the potential of the proposed topological graph which may be used as a shape signature for object matching and reconstruction.

### Paper 165: Unsupervised Non Stationary Image Segmentation Using Triplet Markov Chains

This work deals with unsupervised Bayesian hidden Markov chain restoration extended to the non stationary case. Unsupervised restoration based on Expectation-Maximization (EM) or Stochastic EM (SEM) estimates under the Hidden Markov Chain (HMC) model is quite efficient when the hidden chain is stationary. However, when the latter is not stationary, the unsupervised restoration results can be poor, due to a bad match between the real and estimated models. In this paper we present a more appropriate model for non stationary HMC, via the recent Triplet Markov Chains (TMC) model. Using TMC, we show that the classical restoration results can be significantly improved in the case of non stationary data. This improvement is achieved in an unsupervised way using a SEM parameter estimation method. Some application examples to unsupervised image segmentation are also provided.

### Paper 166: A Quality Assessment Metric Based on Perceptual HVS Behaviors

Human image quality assessment is intrinsically subjective. The Human Visual System (HVS), in fact, recognizes the objective content of an image and provides an interpretation of the input stimulus according to the observer's current state and mental schemes. Subjective tests require human observers and are not very practical for evaluating image quality. Hence, objective methods able to emulate the HVS better than the classical measures are required. We propose a full-reference objective metric modeling two important perceptual phenomena: contrast sensitivity and the masking effect. To this end, two terms are introduced, the Just Noticeable Factor (JNF) and the Similarity Factor (SF), which model the contrast sensitivity and the masking effect respectively. The JNF and the SF are combined to generate a perceptual distortion map (PDM), which is used to evaluate the final Perceptual Quality Score (PQS). Our subjective experiments confirm that the new metric is generally capable of predicting a correct visual quality score where the PSNR fails.

### Paper 167: Comparative study of Stereo algorithms for 3D face reconstruction

This paper compares the efficiency of several stereo matching algorithms in reconstructing 3D faces from rectified stereo images. The stereo image acquisition setup and the creation of a face disparity map benchmark image are detailed. The performance of the algorithms is measured by the deviations of the reconstructed surfaces from a ground truth prototype. The latter is found by visual matching of corresponding nodes of a dense colour grid projected onto the faces. It is shown that by combining the most efficient but slow maximum-cut technique with fast dynamic programming, more accurate reconstruction results can be obtained.

### Paper 168: Speckle Filtering With Robust Anisotropic Diffusion

This paper deals with anisotropic diffusion in images affected by speckle. Two existing methods are reviewed. The first is a robust diffusion technique, not adapted to speckle, based on the Tukey function. The second applies anisotropic diffusion to speckle but lacks robustness. The contribution of this paper is a robust anisotropic diffusion filter adapted to speckle. The proposed approach is based on the two reviewed methods and introduces an original diffusion tensor. Experimental results are presented and the performances of the three methods are compared. The proposed algorithm shows significant enhancement.

### Paper 169: Incremental rectification of sports fields in video streams with application to soccer

We describe a complete system for image rectification in sport video sequences. Relying both on geometrical properties of the field elements and on photogrammetric data, we compute an estimate of the field-to-image homography either incrementally (by inter-image data processing) or "from scratch" when direct image-model correspondences are possible. Examples are given in a soccer context.

### Paper 170: An Approach for Recognizing Stem-end/Calyx Regions in Apple Quality Sorting

In this paper we introduce a cascaded-classifier approach to localize the stem-ends and calyxes of 'Jonagold' apples. The first classifier (an artificial neural network) extracts candidate objects, whereas the second one (a nearest-neighbor classifier) discriminates stem-ends and calyxes from the others. The overall system is tested on 616 fruits, from which the first classifier found 414 candidate objects. Several features are extracted from these objects and then selected by a forward selection method. With these selected features, the second classifier reached an 80% recognition rate. The classification error of each feature is also given for comparison.

### Paper 172: A global optimization scheme for mutual information based remote sensing image registration

Maximization of mutual information (MMI) has been applied successfully in remote sensing applications. To fully automate the registration process based on the MMI criterion, a robust global optimizer is indispensable. In this paper, we present a fully automated image registration system based on a hybrid global optimizer designed specifically for the image registration problem. The proposed hybrid global optimizer consists of three major modules: a local optimizer, a global optimum test, and an initial search point generator. The global optimum test, which is the core of the proposed system, is a heuristic procedure to distinguish the global optimum of the MI registration function from the local ones. Based on this global optimum test, a simple, robust, yet efficient hybrid global optimizer can be designed to maximize the mutual information measure. Our experimental results show that, when the size of the search space is moderate, the proposed global optimizer is both robust and efficient. In the case that the search space is large, the proposed system still remains quite robust.
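As a toy illustration of the MMI criterion itself (not of the authors' optimizer), the mutual information between two registered images can be estimated from their joint intensity histogram. The function name and flattened-list image layout below are our own choices:

```python
import math
from collections import Counter

def mutual_information(img_a, img_b):
    """Mutual information (in nats) between two equally sized images,
    estimated from their joint intensity histogram."""
    assert len(img_a) == len(img_b)
    n = len(img_a)
    joint = Counter(zip(img_a, img_b))   # joint histogram of intensity pairs
    pa = Counter(img_a)                  # marginal histograms
    pb = Counter(img_b)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((pa[a] / n) * (pb[b] / n)))
    return mi

# Identical images: MI equals the entropy of the image.
img = [0, 0, 1, 1, 2, 2]
print(mutual_information(img, img))  # ≈ 1.0986 (= log 3 in nats)
```

Registration by MMI then amounts to searching over transformation parameters for the warp that maximizes this quantity, which is exactly where a robust global optimizer is needed.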

### Paper 173: Fundamental matrix estimation revisited through a global 3D reconstruction framework

This work presents a whole chain of 3D world reconstruction, starting with point correspondences on two uncalibrated images taken from very different points of view. Five popular algorithms for estimating the fundamental matrix, selected from three groups (linear, iterative and robust), have been tested. After this simple self-calibration step, a triangulation algorithm is applied to reconstruct the scene. The estimation methods are evaluated by measuring the error between the reconstructed scene and the synthetic one. Finally, a new method to estimate the fundamental matrix that takes advantage of both linear and robust methods is presented.

### Paper 175: Video summarization using fuzzy descriptors and a temporal segmentation

In this paper, three new compact and fuzzy descriptors (the motion, color and orientation descriptors have respectively 3, 11 and 5 components) are introduced for video summarization. A similarity measure is defined that allows frames to be compared according to several descriptors. The summarization method is based on two stages: video segmentation and segment clustering. First, each video is partitioned into homogeneous segments from one or several descriptors. This segmentation is compared to the partition into shots, and our approach retrieves the transitions with good precision (>90%). Segmentation combining the three descriptors provides better results than segmentation obtained with only one descriptor. Then, segment clustering with a temporal constraint makes it possible to reduce the summary to fewer key frames while preserving temporal coherence. Finally, retrieval by example, tested on a corpus of 3 hours of video data, shows that the segments are correctly found and that combining the indexes (motion, color and orientation) improves the results.

### Paper 176: Estimation of Landmarks with Occultation Using Statistical Models

In this paper, we deal with a pattern recognition problem using non-linear statistical models based on Kernel Principal Component Analysis. The objects that we try to recognize are defined by ordered sets of points. We present two types of models: the first uses an explicit projection function, the second uses the kernel trick. The present work attempts to estimate the localization of partially visible (occluded) objects. Both models are applied to the cephalometric problem with good results.

### Paper 177: Non parametric estimation of Dempster-Shafer belief functions

The application of evidence theory to the fusion of information coming from different sources still poses certain problems. The estimation of belief functions is a problem of paramount importance. Owing to the coherence of this theory with the Bayesian approach, the belief functions can be represented by probabilities (a priori and a posteriori probabilities). In this paper, we propose a non-parametric algorithm to estimate these belief functions. The algorithm is iterative and based on maximum likelihood estimators. The non-parametric aspect comes from the use of orthogonal probability density function (pdf) estimation, which reduces to estimating the first Fourier coefficients of the pdf with respect to a given orthogonal basis. The interest of the proposed algorithm and its potential are studied on simple simulations.
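A minimal sketch of the orthogonal-series idea the estimator builds on, assuming a pdf supported on [0, 1] and a cosine basis (both are our illustrative choices, not necessarily the paper's): the Fourier coefficient of the pdf with respect to a basis function is an expectation, so it can be estimated by a sample mean.

```python
import math

def fourier_pdf_estimate(samples, n_coeffs=5):
    """Orthogonal-series pdf estimate on [0, 1]. The coefficient of the
    pdf w.r.t. basis function phi_k is E[phi_k(X)], estimated here by
    the sample mean over the observations."""
    # Orthonormal cosine basis on [0, 1]: 1, sqrt(2) cos(pi k x), ...
    basis = [lambda x: 1.0] + [
        lambda x, k=k: math.sqrt(2) * math.cos(math.pi * k * x)
        for k in range(1, n_coeffs)
    ]
    coeffs = [sum(phi(x) for x in samples) / len(samples) for phi in basis]

    def pdf(x):
        # Truncated series reconstruction of the density.
        return sum(c * phi(x) for c, phi in zip(coeffs, basis))
    return pdf

# Evenly spread samples behave like a uniform law: the estimate is flat.
samples = [(2 * i + 1) / 20 for i in range(10)]
pdf = fourier_pdf_estimate(samples)
print(round(pdf(0.3), 6))  # 1.0
```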

### Paper 179: Towards Efficient Content Access To JPEG Compressed Images

The existing way of accessing the content of JPEG compressed images is via decompression. In this paper, we report an analysis of the DCT block decomposition scheme and propose an efficient algorithm that provides alternatives for content access to such compressed images. The proposed algorithm features: (a) content access at several different levels, each corresponding to a different efficiency; (b) accessed image quality that is scalable to the requirements of users; (c) maintained compatibility with full decompression. Extensive experiments support that the proposed content access maintains competitive image quality, while the computing cost is significantly lower than that of full decompression. In addition, experiments also show that the proposed algorithm preserves content well, as measured by tests on histogram-based image retrieval.
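One concrete reason level-wise access can be cheap: the DC coefficient of each 8x8 DCT block is proportional to the block mean, so a coarse preview can be formed without inverse-transforming any AC coefficients. A toy simulation of this lowest access level on raw pixels (the paper, of course, works on actual compressed data):

```python
def dc_thumbnail(image, block=8):
    """Level-0 content access: one value per 8x8 block. In a real JPEG
    stream this value would be read from the DC coefficients; here we
    simulate it as the block mean over a 2-D list of pixels."""
    h, w = len(image), len(image[0])
    return [
        [
            sum(image[y + dy][x + dx]
                for dy in range(block) for dx in range(block)) / block ** 2
            for x in range(0, w, block)
        ]
        for y in range(0, h, block)
    ]

# An 8x16 image split into a dark and a bright half yields a 1x2 preview.
img = [[10] * 8 + [30] * 8 for _ in range(8)]
print(dc_thumbnail(img))  # [[10.0, 30.0]]
```

Higher access levels would additionally decode a few low-frequency AC coefficients per block, trading more computation for more detail.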

### Paper 181: Demosaicking as Selection from Multiple Candidates

Digital imaging devices typically use three spectral filters (e.g., red, green, and blue) to produce a color picture. Most commercially available devices use a color filter array and capture only one spectral component at a pixel location. This means that the camera must estimate the missing two color values at each pixel to produce the full-color image. This process is known as demosaicking. In this paper, we propose a demosaicking algorithm that consists of (i) a "candidate generation" step where several candidates are generated for each missing value using different algorithms, and (ii) a "selection" step where one of the candidates is chosen for each missing value. We tested the algorithm on a set of test images and compared it with two standard demosaicking algorithms. The proposed algorithm produced better results both visually and in terms of mean square error.
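A minimal sketch of the candidate-generation/selection structure, using two hypothetical candidates (horizontal and vertical neighbour averages) and a gradient-based selector; the paper itself generates candidates with several full demosaicking algorithms and uses its own selection rule:

```python
def candidates_green(row_above, row, row_below, j):
    """Two toy candidates for the missing green value at column j:
    horizontal and vertical neighbour averages."""
    horiz = (row[j - 1] + row[j + 1]) / 2.0
    vert = (row_above[j] + row_below[j]) / 2.0
    return horiz, vert

def select_candidate(row_above, row, row_below, j):
    """Selection step: prefer interpolation along the direction of the
    smaller gradient, i.e. along the likelier edge direction."""
    horiz, vert = candidates_green(row_above, row, row_below, j)
    grad_h = abs(row[j - 1] - row[j + 1])
    grad_v = abs(row_above[j] - row_below[j])
    return horiz if grad_h <= grad_v else vert

# Across a vertical edge (10 | 90), the vertical candidate is chosen,
# avoiding the 50.0 value that would blur the edge.
print(select_candidate([10, 10, 10], [10, 0, 90], [10, 10, 10], 1))  # 10.0
```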

### Paper 182: Automatic Selection Of A Region Of Interest in 3D Scene Images: Application to Video Captured Scenes

This paper introduces an original approach to automatically select a Region of Interest in an image that represents a 3D scene. We assume that the Region of Interest background is significant enough to be characterized by its color and its spatial coherence. We use these two features to perform this selection, which is the first step of a 2D-to-3D registration process for analyzing video-captured sport scenes. The whole project includes the 3D reconstruction of the scene (players, referees, ball) and its animation as a support for cognitive studies and strategy analysis.

### Paper 183: Reduction Of Ring Artifacts In High Resolution Micro-ct Images

Image reconstructions from Computed Tomography (CT) systems are often corrupted by ring artifacts caused by imperfections in detector elements. Since ring artifacts prohibit quantitative analysis and hamper subsequent image processing steps, there is a need to reduce such artifacts. Conventional ring artifact reduction schemes, such as flat-field correction, are generally applied but often do not lead to satisfactory results. In this paper, a simple but efficient post-processing method is proposed that effectively reduces ring artifacts in reconstructed micro-CT images.

### Paper 185: Local Quality Assessment of 3-D Reconstructions from Sequence of Images: A Quantitative Approach

In this paper we address the problem of performance evaluation of 3-D reconstruction techniques from sequences of images. We propose a novel methodology for quality assessment of a given 3-D reconstruction. The method assumes the existence of 3-D ground truth data and a method for 3-D alignment of the given 3-D data and the ground truth data. The method divides the 3-D space into subspaces to investigate and localize regional errors in the given reconstructions. This feature facilitates the application of post-evaluation methods that involve diagnosis and data fusion. We employ two new measures, which we call the quality index and the quality estimate, for local and global quality assessment respectively. We applied the proposed methodology and measures to reconstructions generated by the space carving technique at different reconstruction resolutions and with different input images. The experimental results show the ability of the proposed methodology and measures to quantify the performance and assess the quality of a given 3-D reconstruction technique.

### Paper 186: Scale Invariant Features for Camera Planning in a Mobile Trinocular Active Vision System

In this paper, we present a camera-planning approach for a mobile trinocular active vision system. In a stationary version of this system, the sensor planning module calculates the cameras' generalized parameters (i.e., translation distance from the center, zoom, focus and vergence) using deterministic geometric specifications of both the sensors and the objects in their field of view. Some of these geometric parameters are difficult to predetermine for mobile operation. In this paper, a new camera-planning approach, based on processing the content of the captured images, is proposed. The approach combines a closed-form solution for the translation between the three cameras, the vergence angle of the cameras, and the zoom and focus settings with the correspondences between the acquired images and predefined target object(s) obtained using the SIFT algorithm. We demonstrate the accuracy of the new approach in practical experiments.

### Paper 188: Calibrating a network of cameras from live or archived video

We present an automatic approach for calibrating a network of cameras using live video captured from them. Our method requires video sequences containing moving people or objects, but does not require any special calibration data. The silhouettes of these moving objects, visible in a pair of views, are used to compute the epipolar geometry of that camera pair. The fundamental matrices computed by this method are used to first obtain a projective reconstruction of the complete camera configuration. Self-calibration is then used to upgrade the projective reconstruction to a metric reconstruction. We have extended our approach to deal with unsynchronized video sequences captured at the same frame rate, by simultaneously recovering the epipolar geometry and the temporal offset between a pair of cameras. We use our approach to calibrate and synchronize a four-camera system using archived video containing a moving person. Next, the silhouettes are used to construct the visual hull of the moving person using known Shape from Silhouette algorithms. Additional experiments on computing the fundamental matrix of two views from silhouettes are also performed.

### Paper 189: A robust Level Set approach for image segmentation and statistical modelling

In this paper, we propose a framework for extracting and modelling region information in variational image segmentation. Modelling the image regions by a mixture of generalized Gaussian distributions, where each region is tracked by a deformable curve, we propose an algorithm to initialize the mixture automatically and adapt its parameters to the region data during the segmentation. The objective of having adaptive mixture parameters is, firstly, to steer the region contours by accurate and representative region information and, secondly, to recover an accurate image mixture through the segmentation. We validate the approach on the segmentation of synthetic and real-world color images.

### Paper 190: Retrieval of Vehicle Trajectories And Estimation of Lane Geometry Using Non-Stationary Traffic Surveillance Cameras

A tracking system is presented for obtaining accurate vehicle trajectories using uncalibrated traffic surveillance cameras. Techniques for indexing and retrieval of vehicle trajectories and estimation of lane geometry are also presented. An algorithm known as Predictive Trajectory Merge-and-Split (PTMS) is used to detect partial or complete occlusions during object motion. This hybrid algorithm is based on the constant acceleration Kalman filter and a set of simple heuristics for temporal analysis. The resulting vehicle trajectories are modeled using variable low-order polynomials. A comparative evaluation of several distance metrics used in trajectory cluster analysis, indexing and retrieval is also presented. We propose some changes to metrics presented in previous work and make a comparative study with a modified form of the Hausdorff distance. Some preliminary results are presented on the estimation of lane geometry through K-means clustering of individual vehicle trajectories using the proposed metrics. An advantage of our approach is that estimation of lane geometry can be performed with non-stationary, uncalibrated traffic cameras in real time.
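For illustration, one common modified Hausdorff distance between two trajectories replaces the classical max-min with a mean of nearest-neighbour distances, which is less sensitive to outlying points; a generic form of this idea (not necessarily the exact modification proposed above) is:

```python
import math

def modified_hausdorff(traj_a, traj_b):
    """Modified Hausdorff distance between two 2-D trajectories given as
    lists of (x, y) points: the larger of the two directed mean
    nearest-neighbour distances."""
    def mean_nn(src, dst):
        # Average distance from each point of src to its nearest point in dst.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)
    return max(mean_nn(traj_a, traj_b), mean_nn(traj_b, traj_a))

# Two parallel lane-like trajectories offset by 3 units are 3.0 apart.
lane_a = [(float(x), 0.0) for x in range(5)]
lane_b = [(float(x), 3.0) for x in range(5)]
print(modified_hausdorff(lane_a, lane_b))  # 3.0
```

Distances of this kind can feed directly into K-means-style clustering of trajectories, which is how lane geometry is estimated here.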

### Paper 191: A photometric stereo shape reconstruction with robust uniform converting matrix

We present an improvement in both the accuracy and the stability of a method that reconstructs shape from photometric stereo. Our previous method applies the Jacobi iterative method to the reflectance map equation after expanding it linearly with respect to the three depth parameters; the sparse matrix in the resulting iterative relation that converts shading into depth has non-negative diagonal terms, assuring numerical stability in most cases. An extended numerical examination of the method, however, reveals that it is numerically unstable, or the obtained shapes are inaccurate, depending on the lighting conditions. In this paper, the matrix is made uniform and robust so that all the diagonal terms have three non-negative terms for each of the lighting directions, by neglecting the part of the reconstruction region where the corresponding diagonal terms have only one or two terms. Experimental results show that the modification not only enables us to obtain shapes under such bad lighting conditions, but also significantly improves the accuracy of the shape.

### Paper 192: Face detection in video-footages using audio information and differential image

We previously reported a face detection system with emphasis on color segmentation using HSV. It was shown that this color space is more effective than others, such as RGB or YCbCr, not only in accurately segmenting faces but also in retaining facial features in the resulting binary images, and that the system can be used either for detection or recognition. When it comes to video footage of news programs, for example, we observe that short video clips accompanied by music are often inserted between different news reports, and that speaking persons often do not move within the same scene. In this paper we analyze audio information to distinguish music scenes from news ones, so that such insignificant scenes may be skipped without resorting to video analysis. Specifically, we detect wavelets that roughly correspond to syllables and compute their mean frequencies. Also, in each news scene we average the difference of two images that are a certain number of frames apart, so that the costlier segmentation processing may be applied only at significant points. A few experimental results are given to show the usefulness of these methods.

### Paper 193: Combined Local-to-global Methods for Detecting a Single Colour Texture

This paper investigates the performance of detecting a single colour texture in complex colour images by using a distribution of distances between global and local texture features in the training and processed images. We compare three features: a colour histogram (CH), a colour cooccurrence histogram (CCH), and a coordinated clusters representation (CCR). For the experiments, the training images are cut out from each original image and the "ground truth" for all complex images is built manually. Most areas visually similar to the training texture are detected with all these features. However, the CCR, while performing well at detecting true areas, also selects large false background areas. Experiments show that the CCH has a lower level of such false selections. Thus a better solution for one-class detection can be obtained by combining the CCR and CCH features.

### Paper 194: Automatic Object Detection in Video Sequences with Camera in Motion

Automatic moving object detection/extraction has been explored extensively by the computer vision community. Unfortunately, the majority of the work has been limited to stationary cameras, for which background subtraction is the major methodology. In this paper, we present a technique to tackle the problem in the case of a moving camera, which is the situation most often encountered in real life for target tracking, surveillance, etc. Instead of focusing on two adjacent frames, our object detection rests on three consecutive video frames: a backward frame, the frame of interest and a forward frame. First, optical-flow-based simultaneous iterative camera motion compensation and background estimation is carried out on the backward and forward frames. The differences between the camera-motion-compensated backward and forward frames and the frame of interest are then tested against the estimated background models for intensity change detection. Next, these change detection results are combined to acquire the approximate shape of the moving object. Experimental results for a video sequence with a moving camera are presented.
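The combination of the backward and forward comparisons can be sketched as a three-frame change test, with camera motion assumed already compensated; the threshold, function name and 2-D list layout below are illustrative only:

```python
def change_mask(prev, cur, nxt, thresh):
    """Three-frame change detection: a pixel of the frame of interest is
    flagged as moving only if it differs from BOTH the (motion-compensated)
    backward and forward frames, which suppresses ghosting at positions
    the object has already left or not yet reached."""
    return [
        [abs(c - p) > thresh and abs(c - n) > thresh
         for p, c, n in zip(row_p, row_c, row_n)]
        for row_p, row_c, row_n in zip(prev, cur, nxt)
    ]

# A bright blob present only in the middle frame is detected there;
# a blob present only in the backward frame is not.
print(change_mask([[0, 0, 0]], [[0, 9, 0]], [[0, 0, 0]], 5))
# [[False, True, False]]
```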

### Paper 198: Content-adaptive multiresolution analyses

In this paper we present a technique for building adaptive wavelets by means of an extension of the lifting scheme and analyze the stability of the resulting decompositions. Our scheme comprises an adaptive update lifting step and a fixed prediction lifting step. The adaptivity consists in the fact that the system can choose between two different update filters, and that this choice is triggered by the local gradient of the original signal. If the gradient is large (in some seminorm sense) it chooses one filter; if it is small, the other. We derive necessary and sufficient conditions for the invertibility of such an adaptive system in various scenarios. Furthermore, we present some examples to illustrate our theoretical results. We also discuss the effects of quantization in such an adaptive wavelet decomposition and provide conditions for recovering the original decisions at the synthesis side and for relating the reconstruction error to the quantization error. Such an analysis is essential for the application of these adaptive decompositions in image compression.
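The predict/update structure underlying such schemes can be illustrated with a fixed-filter (Haar-like) lifting step; the adaptive version described above would switch between two update filters depending on the local gradient, but the invertibility-by-construction shown below is the same:

```python
def lifting_analysis(signal):
    """One level of a simple lifting decomposition: split the samples
    into even/odd, predict the odds from the evens (detail), then
    update the evens from the details (approximation)."""
    even = signal[0::2]
    odd = signal[1::2]
    detail = [o - e for o, e in zip(odd, even)]           # prediction step
    approx = [e + d / 2.0 for e, d in zip(even, detail)]  # update step
    return approx, detail

def lifting_synthesis(approx, detail):
    """Invert the two lifting steps in reverse order; the scheme is
    perfectly invertible by construction."""
    even = [a - d / 2.0 for a, d in zip(approx, detail)]  # undo update
    odd = [d + e for d, e in zip(detail, even)]           # undo prediction
    out = []
    for e, o in zip(even, odd):                           # interleave
        out += [e, o]
    return out

approx, detail = lifting_analysis([4, 6, 1, 3])
print(approx, detail)                        # [5.0, 2.0] [2, 2]
print(lifting_synthesis(approx, detail))     # [4.0, 6.0, 1.0, 3.0]
```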

### Paper 199: Delay-Performance Trade-Offs in Motion-Compensated Scalable Subband Video Compression

Scalable video coding based on motion-compensated spatio-temporal (t+2D) wavelet decomposition is becoming increasingly popular, as it provides coding performance competitive with state-of-the-art codecs while accommodating varying network bandwidth and different receiver capabilities (e.g., frame-rate, display size, CPU, memory size). However, these temporal multiresolution schemes may introduce a non-negligible delay preventing their use by applications which require low latency or lightweight memory usage. In this paper, we provide a flexible approach to reduce the delay in motion-compensated temporal filtering schemes and illustrate the trade-offs between compression performance and low coding delay in this framework.

### Paper 202: Traffic Sign Classification Invariant to Rotations using Support Vector Machines

The aim of this paper is to present a new technique in the field of recognition and classification of traffic signs through artificial vision. The method consists of analyzing the distances from the edges of the shape to a reference axis. The traffic sign classification is performed by a Support Vector Machine (SVM) which has been trained for circular, rectangular, triangular and octagonal forms. The process consists of two major steps: segmentation according to color, and identification of the geometric shape of the candidate blobs using SVMs. Before pattern recognition, it is convenient to discard the blobs whose size is not appropriate. The most important advantages are its robustness against possible inclinations of traffic or highway signs and its low computational load.

### Paper 204: Adaptive lifting schemes using variable-size block segmentation

In this paper, we are interested in adapting the operators of nonseparable lifting schemes in the context of image compression. Several pairs of prediction and update operators can be used according to the local activity of the input signal. The adaptation procedure is block-based: the input image is segmented following a quadtree, and each segmented block is analyzed via a lifting using an appropriate pair of operators. Our contribution is two-fold. Firstly, we provide a quadtree segmentation rule. Secondly, we propose to design optimal operators which minimize the entropy of the resulting multiresolution representation thanks to a statistical modeling of the detail coefficients. Experimental results carried out on real images indicate that the proposed adaptive method yields substantial gains w.r.t. conventional lifting schemes.

### Paper 205: Face detection and recognition on a smart camera

There is a rapidly growing demand for using smart cameras in various biometric surveillance applications. Despite the cameras' small form factor, most of these applications demand huge processing performance for real-time operation. Face recognition is one of those applications. In this paper we show that we can run face recognition in real time by implementing the algorithm on an architecture which combines a parallel pixel processor with a digital signal processor. The algorithm consists of a cascade of filters for detection, registration and normalization, and an RBF neural network with temporal filtering. Everything fits within a digital camera the size of a normal surveillance camera.

### Paper 207: Multiresolution multispectral image denoising based on probability of presence of features of interest

We propose a novel Bayesian multiresolution denoising method for multi-band images. The proposed method makes use of the inter-band correlations for estimating the probability of presence of features of interest in a noisy observation. The developed algorithm is of low computational complexity, and its performance on color and multispectral images is superior to recent multi-band wavelet thresholding techniques.

### Paper 210: Estimation of Image Fractal Dimension Based on Empirical Mode Decomposition

In this paper, a novel technique is presented to estimate the fractal dimension of images based on Empirical Mode Decomposition (EMD). It has been demonstrated that EMD acts as a dyadic filter bank for 1-D fractional Brownian motion (fBm) and that a property of the obtained Intrinsic Mode Functions is related to the Hurst exponent. The 2-D case of this property has not been discussed, because extending 1-D EMD to the 2-D case is not trivial. Therefore, in this article a new form of bidimensional EMD (NBEMD) is proposed. The spectrum and variance of the obtained Intrinsic Mode Functions (IMFs) are used to compute the Hurst exponents and fractal dimensions of textures. The experiments demonstrate the segmentation results of applying the estimated dimension to natural textures.
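The final conversion the approach relies on — from scale-dependent variances to a Hurst exponent and then a fractal dimension — can be sketched as follows, with synthetic variances standing in for the actual EMD/IMF output:

```python
import math

def hurst_from_variances(scales, variances):
    """Estimate the Hurst exponent H from detail variances at successive
    scales: for fBm, Var is proportional to scale**(2H), so H is half the
    slope of log(Var) against log(scale), fitted by least squares."""
    xs = [math.log(s) for s in scales]
    ys = [math.log(v) for v in variances]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope / 2.0

def fractal_dimension(hurst):
    """Fractal dimension of a 2-D fBm surface: D = 3 - H."""
    return 3.0 - hurst

# Synthetic variances following Var = scale**(2 * 0.7), i.e. H = 0.7:
scales = [1, 2, 4, 8]
variances = [s ** 1.4 for s in scales]
print(fractal_dimension(hurst_from_variances(scales, variances)))  # 2.3
```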

### Paper 211: A Fast Classification-Based Parameter Estimation Technique for Wavelet-Domain HMT Model

In this paper, we propose a fast classification-based parameter estimation technique for the wavelet-domain HMT model. Although the HMT model captures the intrascale and interscale statistics of wavelet coefficients, model parameter training is complex and computationally expensive. To solve this problem, we propose a fast parameter estimation algorithm in which no training is needed. Firstly, the coefficients of each subband are classified by an adaptive threshold. Secondly, the local statistics of the different classes are computed, and the HMT model parameters are estimated from these local statistics. Finally, we apply the training-free HMT model to image denoising. Experimental results show that this fast parameter estimation algorithm not only reduces computational complexity and time cost, but also provides improved denoising performance, in terms of both PSNR and visual quality, over other methods.

### Paper 216: Video Synchronization Via Space-time Interest Point Distribution

We propose a novel algorithm to synchronize videos recording the same scene from different viewpoints. Our method relies on correlating the space-time interest point distributions of the videos over time. Space-time interest points represent events in video that have high variations in both space and time. These events are unique in time and may manifest themselves in videos from different viewpoints. We show that by detecting and selecting space-time interest points and correlating their distributions, videos from different views can be automatically synchronized.
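Correlating the interest-point distributions over time amounts to finding the lag that maximizes the correlation of two per-frame event-count sequences; a bare-bones version of this search (function name and data layout are illustrative) is:

```python
def best_offset(counts_a, counts_b, max_lag):
    """Find the temporal offset that best aligns two per-frame
    interest-point count sequences, by maximizing their
    cross-correlation over a window of candidate lags."""
    def score(lag):
        # Correlation of a with b shifted by `lag` frames.
        pairs = [(counts_a[i], counts_b[i + lag])
                 for i in range(len(counts_a))
                 if 0 <= i + lag < len(counts_b)]
        return sum(a * b for a, b in pairs)
    return max(range(-max_lag, max_lag + 1), key=score)

# A burst of events at frame 2 in one view and frame 4 in the other
# is explained by a 2-frame offset.
cam1 = [0, 0, 5, 0, 0, 0]
cam2 = [0, 0, 0, 0, 5, 0]
print(best_offset(cam1, cam2, 3))  # 2
```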

### Paper 219: SPOT5 Images For Urbanization Detection

The aim of the ETATS project (Système d'Evaluation du Taux d'Actualisation de données topogéographiques par Télédétection Spatiale) is to estimate the degree of change in the built-up area and in the communication network of the Belgian territory, using satellite images and the National Geographic Institute database. SPOT5 images have been used for this purpose. On the one hand, the vectorial data are co-registered with the image to generate a mask representing the old status of the database. On the other hand, a classification process makes it possible to separate the structured and textured areas from the rest of the image. The comparison between these two masks provides coarse information on the zones of major urbanization change.

### Paper 220: Prediction Design for Discrete Generalized Lifting

The lifting scheme is a useful tool to create different types of wavelets, including adaptive and non-linear decompositions. The generalized lifting scheme is more flexible and can improve lifting results, but the design of the generalized prediction and update steps is a difficult task. This paper proposes a design strategy to optimize the prediction step, in the sense of minimizing the detail signal energy, and shows some possible applications. Promising results are reported for certain classes of images.

### Paper 223: Three Specific Stages In Visual Perception For Vertebrate-type Dynamic Machine Vision

For efficient real-time visual perception in civilized natural environments (e.g. road networks) one has to take advantage of foveal-peripheral differentiation; this yields data economy, and active gaze control provides a number of benefits: 1. Inertial gaze stabilization considerably alleviates the evaluation of image sequences taken by cameras with stronger tele-lenses; it allows a reduction in angular disturbances from rough ground by at least an order of magnitude with simple negative angular rate feedback. 2. Visual tracking of fast moving objects reduces motion blur for these objects. 3. In the near range, a large field of view is mandatory; however, coarse angular resolution is sufficient. With a field of view (f.o.v.) > ~100°, both the region in front of and to the side of the vehicle may be viewed simultaneously. For the vehicle's own behavior decisions, the motion behaviors of objects both in the wide f.o.v. nearby and in several regions of special interest further away have to be understood in conjunction. In order to achieve this efficiently, three distinct visual processes with specific knowledge bases have to be employed in a consecutive way. Experimental results are given for the test vehicles VaMoRs (a 5-ton van) and VaMP (Mercedes 500 SEL).
