Advanced Concepts for Intelligent Vision Systems

Organized by the SEE

September 28 - October 2, 2009

Mercure Chateau Chartrons, Bordeaux, France

http://acivs.org/acivs2009/

Benelux Signal Processing Chapter

Acivs 2009 Abstracts


Invited papers

Paper 102: Colour image processing by linear vector methods using projective geometric transformations

Author(s): Steve Sangwine

Development of processing methods for colour images has been a slow process, but over the last 10-12 years ideas have been steadily developed employing geometric operations in the space of the colour image pixels as generalisations of the fundamental scaling and spatial shift operations of linear greyscale image processing. The goal has been filters that are sensitive in some way to colour as well as spatial features. Establishing the set of possible fundamental geometric operations has been difficult, and in Euclidean space the set of linear operations is very limited. Recent developments, however, have shown that there is a simple connection between geometric operations expressed using quaternion equations and those expressed using homogeneous coordinates as 4×4 matrices. These operations include projective transformations as well as the classical Euclidean operations, and remarkably, they are all linear in homogeneous coordinates. This realisation greatly expands the possibilities of linear vector filtering methods, provided one operates in the domain of homogeneous coordinates and not in the original Euclidean colour space of the image pixels. The connections between quaternion equations and the matrices in homogeneous coordinates mean that the insight gained by working in quaternion algebra can be applied to the design of algorithms implemented using more classical matrix methods. The talk will illustrate the use of simple quaternion equations to express geometric operations, and show how the use of homogeneous coordinates makes it possible to work with quite difficult geometric transformations fairly easily. The possibilities for using projective and other transformations for devising new types of colour image filter will be shown and illustrated.
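The link between quaternion equations and 4×4 matrices in homogeneous coordinates can be illustrated with a minimal sketch. It is not taken from the talk: the function names are hypothetical, the rotation about the grey axis (1, 1, 1) is just one convenient example, and only the standard quaternion rotation formula q p q* is assumed.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def unit_quaternion(axis, angle):
    """Unit quaternion for a rotation of `angle` radians about `axis`."""
    norm = math.sqrt(sum(c * c for c in axis))
    s = math.sin(angle / 2) / norm
    return (math.cos(angle / 2), axis[0] * s, axis[1] * s, axis[2] * s)

def rotate_with_quaternion(rgb, axis, angle):
    """Rotate a colour vector with the quaternion equation q p q*."""
    q = unit_quaternion(axis, angle)
    q_conj = (q[0], -q[1], -q[2], -q[3])
    p = (0.0, rgb[0], rgb[1], rgb[2])  # pixel as a pure quaternion
    return qmul(qmul(q, p), q_conj)[1:]

def rotation_matrix_4x4(axis, angle):
    """The same rotation written as a 4x4 homogeneous matrix."""
    w, x, y, z = unit_quaternion(axis, angle)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y), 0.0],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x), 0.0],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y), 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def translation_matrix_4x4(shift):
    """A colour-space shift, which is linear only in homogeneous coordinates."""
    return [[1.0, 0.0, 0.0, shift[0]],
            [0.0, 1.0, 0.0, shift[1]],
            [0.0, 0.0, 1.0, shift[2]],
            [0.0, 0.0, 0.0, 1.0]]

def apply_homogeneous(matrix, rgb):
    """Apply a 4x4 matrix to (r, g, b, 1) and divide by the last coordinate."""
    h = (rgb[0], rgb[1], rgb[2], 1.0)
    out = [sum(m * v for m, v in zip(row, h)) for row in matrix]
    return tuple(c / out[3] for c in out[:3])

grey_axis = (1.0, 1.0, 1.0)
angle = 2 * math.pi / 3          # 120 degrees about the grey axis
red = (1.0, 0.0, 0.0)

via_quaternion = rotate_with_quaternion(red, grey_axis, angle)
via_matrix = apply_homogeneous(rotation_matrix_4x4(grey_axis, angle), red)
shifted = apply_homogeneous(translation_matrix_4x4((0.1, 0.2, 0.3)), red)
```

Both routes map red onto green, and the shift, which is not a linear map in the three-dimensional colour space, becomes an ordinary matrix product once the fourth coordinate is added. The division by the last coordinate in `apply_homogeneous` is what would allow a projective matrix to be used in the same way.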

Paper 106: Multi-modal similarity measures for change detection and image registration. Theory overview and perspectives for high and very high resolution remote sensing data

Author(s): Jordi Inglada

Similarity measures is a generic name for cost functions which are usually used in image processing for the tasks of registration and change detection.

Similarity measures such as Mutual Information are becoming a classical choice for the problem of comparing images acquired in a multi-sensor context. They are a good choice when the link between the two sets of data to be compared is not known or cannot be simply modeled. However, since these measures are based on the estimation of probability densities, they are difficult to implement when the size of the expected changes is small. Also, when the shape of the changes is not suited to a classical rectangular estimation window, the computation of the similarity may be difficult.

With the arrival of new high and very high resolution imaging sensors, the heterogeneity inside the information classes and the ability to distinguish small objects of interest have to be accounted for.

In this tutorial, after an introduction to classical multi-modal similarity measures, several new challenges related to the processing of high and very high resolution image data will be addressed and some insights into interesting research problems will be presented.

A special emphasis will be put on the object-based image analysis approaches which are unavoidable in the context of high resolution remote sensing images. Practical examples and demonstrations will be shown using the ORFEO Toolbox open source software http://www.orfeo-toolbox.org.
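As a rough illustration of why Mutual Information suits the multi-sensor case mentioned in this abstract, the sketch below estimates it from a joint histogram of two equally sized images. It is a textbook estimator written in plain Python, not code from the tutorial or from the ORFEO Toolbox; the function name, the `bins` and `levels` parameters and the toy images are assumptions made for the example.

```python
import math
from collections import Counter

def mutual_information(image_a, image_b, bins=8, levels=256):
    """Mutual information in bits between two images given as flat
    lists of intensities in the range [0, levels)."""
    if len(image_a) != len(image_b) or not image_a:
        raise ValueError("images must be non-empty and of equal size")
    width = levels / bins
    # Quantise intensities so that the densities can be estimated by counting.
    qa = [min(int(v / width), bins - 1) for v in image_a]
    qb = [min(int(v / width), bins - 1) for v in image_b]
    n = len(qa)
    joint = Counter(zip(qa, qb))
    marginal_a = Counter(qa)
    marginal_b = Counter(qb)
    mi = 0.0
    for (i, j), count in joint.items():
        p_ij = count / n
        mi += p_ij * math.log2(p_ij / ((marginal_a[i] / n) * (marginal_b[j] / n)))
    return mi

reference = [10, 10, 200, 200, 10, 200, 10, 200]
inverted = [255 - v for v in reference]               # same scene, opposite contrast
unrelated = [10, 10, 10, 10, 200, 200, 200, 200]      # statistically independent of reference

mi_same = mutual_information(reference, reference, bins=2)
mi_inverted = mutual_information(reference, inverted, bins=2)
mi_unrelated = mutual_information(reference, unrelated, bins=2)
```

The contrast-inverted image scores as high as the reference itself, because the measure depends only on the statistical dependence between intensities and not on their actual values. The eight-sample images also hint at the difficulty raised in the abstract: with a small estimation window, the histogram has very few samples per bin.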

Regular papers

Paper 108: A 3D Statistical Facial Feature Model and Its Application on Locating Facial Landmarks

Author(s): Xi Zhao, Emmanuel Dellandréa, Liming Chen

3D face landmarking aims at automatic localization of 3D facial features and has a wide range of applications, including face recognition, face tracking, and facial expression analysis. Methods developed so far for 2D images have been shown to be sensitive to changes in lighting conditions. In this paper, we propose a learning-based approach for reliable locating of face landmarks in 3D. Our approach relies on a statistical model, called the 3D Statistical Facial feAture Model (SFAM) in the paper, which learns both global variations in 3D face morphology and local ones around the 3D face landmarks in terms of local texture and shape. Tested on the FRGC v1.0 dataset, our approach shows its effectiveness and achieves a locating accuracy of 99.09% at 10mm precision. The mean error and standard deviation of each landmark are respectively less than 5mm and 4.

Paper 111: Combination of Attributes in Stereovision Matching for Fish-Eye Lenses in Forest Analysis

Author(s): P. Javier Herrera, Gonzalo Pajares, María Guijarro, J. Jaime Ruz, Jesús M. de la Cruz

This paper describes a novel stereovision matching approach that combines several attributes at the pixel level for omni-directional images obtained with fish-eye lenses in forest environments. The goal is to obtain a disparity map as a preliminary step for determining distances to the trees and then the volume of wood in the imaged area. The interest is focused on the trunks of the trees. Because of the irregular distribution of the trunks, the most suitable features are the pixels. A set of six attributes is used for establishing the matching between the pixels in both images of the stereo pair. The final decision about the matched pixels is taken by combining the attributes. Two combination strategies are proposed: the Sugeno Fuzzy Integral and the Dempster-Shafer theory. The combination strategies, applied to our specific stereovision matching problem, constitute the main finding of the paper. In both, the combination is based on the application of three well-known matching constraints. The proposed approaches are compared with each other and compare favourably against the use of simple features.

Paper 114: Quality Fusion Rule for Face Recognition in Video

Author(s): Chao Wang, Yongping Li, Xinyu Ao

Face recognition in video is confronted with many problems: varying illumination, pose and expression. Compensation algorithms for these may introduce considerable noise and distort the face, which degrades the face image quality. In this paper, motivated by the human cognitive process, a quality fusion rule is designed to reduce the influence of compensated face image quality on recognition performance. Combining video features with the recognition contribution degrees of the compensated face images, the rule fuses the recognition results of all face video frames to select the best result. This paper mainly addresses the quality fusion rule for illumination compensation. In the experiment, the proposed quality fusion rule is evaluated on a face video database with varied illumination. Compared with other state-of-the-art methods, the novel approach has better recognition performance.

Paper 116: Engineering of Computer Vision Algorithms Using Evolutionary Algorithms

Author(s): Marc Ebner

Computer vision algorithms are currently developed by looking up the available operators from the literature and then arranging those operators such that the desired task is performed. This is often a tedious process which also involves testing the algorithm with different lighting conditions or at different sites. We have developed a system for the automatic generation of computer vision algorithms at interactive frame rates using GPU accelerated image processing. The user simply tells the system which object should be detected in an image sequence. Simulated evolution, in particular Genetic Programming, is used to automatically generate and test alternative computer vision algorithms. Only the best algorithms survive and eventually provide a solution to the user's image processing task.

Paper 119: Real-Time Center Detection of an OLED Structure

Author(s): Roel Pieters, Pieter Jonker, Henk Nijmeijer

The research presented in this paper focuses on real-time image processing for visual servoing, i.e. the positioning of an x-y table by using a camera only instead of encoders. A camera image stream plus real-time image processing determines the position in the next iteration of the table controller. With a frame rate of 1000 fps, a maximum processing time of only 1 millisecond is allowed for each image of 80x80 pixels. This visual servoing task is performed on an OLED (Organic Light Emitting Diode) substrate that can be found in displays, with a typical size of 100 by 200 micrometers. The presented algorithm detects the center of an OLED well with sub-pixel accuracy (1 pixel equals 4 micrometers, sub-pixel accuracy reliable up to +/- 1 micrometer) and a computation time of less than 1 millisecond.
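The figures quoted in this abstract fit together as shown in the short calculation below. The variable names are illustrative, and the field-of-view value assumes that the 4 micrometers per pixel scale holds across the whole 80x80 image, which the abstract does not state explicitly.

```python
frame_rate_fps = 1000          # camera frame rate from the abstract
image_side_px = 80             # images are 80x80 pixels
pixel_size_um = 4.0            # 1 pixel equals 4 micrometers
accuracy_um = 1.0              # reliable sub-pixel accuracy of +/- 1 micrometer

# One frame must be fully processed before the next one arrives.
time_budget_ms = 1000.0 / frame_rate_fps

# Side length of the imaged area (assumes a uniform scale over the image).
field_of_view_um = image_side_px * pixel_size_um

# The stated accuracy expressed as a fraction of a pixel.
accuracy_px = accuracy_um / pixel_size_um

# Raw pixel throughput the processing chain has to sustain.
pixels_per_second = frame_rate_fps * image_side_px ** 2

print(f"time budget per frame: {time_budget_ms} ms")
print(f"field of view: {field_of_view_um} x {field_of_view_um} micrometers")
print(f"accuracy: +/- {accuracy_px} pixel")
print(f"throughput: {pixels_per_second} pixels per second")
```

Under that assumption the image covers 320 by 320 micrometers, so a structure of 100 by 200 micrometers fits in a single frame, and the stated accuracy corresponds to a quarter of a pixel.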

Paper 120: Image Quality Assessment Based on Edge-region Information and Distorted Pixel for JPEG and JPEG2000

Author(s): Zianou Ahmed Seghir, Fella Hachouf

The main objective of image quality assessment metrics is to provide an automatic and efficient system to evaluate visual quality. It is imperative that these measures exhibit good correlation with perception by the human visual system (HVS). This paper proposes a new algorithm for image quality assessment, which offers more flexibility than previous methods in using the distorted pixels in the assessment. First, the distorted and original images are divided into blocks of 11×11 pixels. Second, the distorted pixels are calculated. Then the visual regions of interest and the edge information are computed, which are used to compute the global error. Experimental comparisons demonstrate the effectiveness of the proposed method.

Paper 121: Fast Multi Frames Selection Algorithm Based on Macroblock Reference Map for H.264/AVC

Author(s): Kyung-Hee Lee, Jae-Won Suh

Variable block size motion estimation (ME) and compensation (MC) using multiple reference frames is adopted in H.264/AVC to improve coding efficiency. However, the computational complexity of ME/MC increases in proportion to the number of reference frames. In this paper, we propose a new efficient reference frame selection algorithm to reduce the complexity. The proposed algorithm selects suitable reference frames by exploiting the spatial and temporal correlation of the video sequence. The experimental results show that the proposed algorithm decreases video encoding time while maintaining similar visual quality and bit rates.

Paper 122: Bayesian Pressure Snake for Weld Defect Detection

Author(s): Aicha Baya Goumeidane, Mohammed Khamadja, Nafaa Nacereddine

Image segmentation plays a key role in automatic weld defect detection and classification in radiographic testing. Among the segmentation methods, boundary extraction based on deformable models is a powerful technique to describe the shape and then deduce, after the analysis stage, the type of the defect under investigation. This paper describes a method for automatic estimation of the contours of weld defects in radiographic images. The method uses a statistical formulation of contour estimation by exploiting a statistical pressure snake based on non-parametric modeling of the image. Here the edge energy is replaced by a region energy which is a function of the statistical characteristics of the area of interest.

Paper 123: Behavioral State Detection of Newborns Based on Facial Expression Analysis

Author(s): Lykele Hazelhoff, Jungong Han, Sidarto Bambang-Oetomo, Peter de With

Prematurely born infants are observed at a Neonatal Intensive Care Unit (NICU) for medical treatment. Whereas vital body functions are continuously monitored, their incubator is covered by a blanket for medical reasons. This prevents visual observation of the newborns for most of the day, although it is known that the facial expression can give valuable information about the presence of discomfort.

This prompted the authors to develop a prototype of an automated video survey system for the detection of discomfort in newborn babies by analysis of their facial expression. Since only a reliable and situation-independent system is useful, we focus on robustness against non-ideal viewpoints and lighting conditions. Our proposed algorithm automatically segments the face from the background and localizes the eye, eyebrow and mouth regions. Based upon measurements in these regions, a hierarchical classifier is employed to discriminate between the behavioral states sleep, awake and cry.

We have evaluated the described prototype system on recordings of three healthy newborns, and we show that our algorithm operates with approximately 95% accuracy. Small changes in viewpoint and lighting conditions are allowed, but when there is a major reduction in light, or when the viewpoint is far from frontal, the algorithm fails.

Paper 124: Tracking 3D Orientation Through Corresponding Conics

Author(s): Alberto Alzati, Marina Bertolini, N. Alberto Borghese, Cristina Turrini

We propose here a new method to recover the 3D orientation of a rigid body by matching corresponding conics embedded in the object itself. The method is based on writing the projective equations of the conics and rearranging them in a suitable way. This leads to a very simple linear system. Results from simulated experiments show good accuracy and suggest that this method could be used for instance in augmented reality surgery to effectively track surgery instruments inside the operating room.

Paper 125: Unsupervised Detection of Gradual Video Shot Changes with Motion-Based False Alarm Removal

Author(s): Ralph Ewerth, Bernd Freisleben

The temporal segmentation of a video into shots is a fundamental prerequisite for video retrieval. There are two types of shot boundaries: abrupt shot changes ("cuts") and gradual transitions. Several high-quality algorithms have been proposed for detecting cuts, but the successful detection of gradual transitions remains a surprisingly difficult problem in practice. In this paper, we present an unsupervised approach for detecting gradual transitions. It has several advantages. First, in contrast to alternative approaches, no training stage and hence no training data are required. Second, no thresholds are needed, since the used clustering approach separates classes of gradual transitions and non-transitions automatically and adaptively for each video. Third, it is a generic approach that does not employ a specialized detector for each transition type. Finally, the issue of removing false alarms caused by camera motion is addressed: in contrast to related approaches, it is not only based on low-level features, but on the results of an appropriate algorithm for camera motion estimation. Experimental results show that the proposed approach achieves very good performance on TRECVID shot boundary test data.

Paper 127: VISRET - A Content Based Annotation, Retrieval and Visualization Toolchain

Author(s): Levente Kovács, Ákos Utasi, Tamás Szirányi

This paper presents a system for content-based video retrieval, with a complete toolchain for annotation, indexing, retrieval and visualization of imported data. The system contains around 20 feature descriptors, a modular infrastructure for descriptor addition and indexing, a web-based search interface and an easy-to-use query-annotation-result visualization module. The features that distinguish this system from others are the support of all the steps of the retrieval chain, the modular support for standard MPEG-7 and custom descriptors, and the easy-to-use tools for query formulation and retrieval visualization. The intended use cases of the system are content- and annotation-based retrieval applications, ranging from community video portals to indexing of image, video, judicial, and other multimedia databases.

Paper 130: Relational Dynamic Bayesian Networks to improve Multi-Target Tracking

Author(s): Cristina Manfredotti, Enza Messina

Tracking relations between moving objects is a big challenge for Computer Vision research. Relations can be useful to better understand the behaviors of the targets, and the prediction of trajectories can become more accurate. Moreover, they can be useful in a variety of situations such as monitoring terrorist activities, anomaly detection, sport coaching, etc. In this paper we propose a model based on Relational Dynamic Bayesian Networks (RDBNs) that uses first-order logic to model particular correlations between objects' behaviors, and show that the performance of the prediction increases significantly. In our experiments we consider the problem of multi-target tracking on a highway, where the behavior of targets is often correlated with the behavior of the targets near them. We compare the performance of a Particle Filter that does not take into account relations between objects with the performance of a Particle Filter that makes inference over the proposed RDBN. We show that our method can follow the targets' paths more closely than the standard methods, being able to better predict their behaviors while decreasing the complexity of the tracker task.

Paper 131: A New Method for Segmentation of Images Represented in a HSV Color Space

Author(s): Dumitru Dan Burdescu, Marius Brezovan, Eugen Ganea, Liana Stanescu

This paper presents an original low-level system for color image segmentation in the Hue-Saturation-Value (HSV) color space. Many difficulties of color image segmentation may be resolved by using the correct color space in order to increase the effectiveness of the color components in discriminating color data. The technique proposed in the article uses new data structures that lead to simpler and more efficient segmentation algorithms. We introduce a flexible hexagonal network structure on the image pixels and we extract for each segmented region the syntactic features that can be used in the shape recognition process. Our technique has a lower time complexity than the methods studied from the specialized literature, and the experimental results on the Berkeley Segmentation Dataset color image database show that the performance of the method is robust.

Paper 132: Carotenoid Concentration of Arctic Charr (Salvelinus alpinus L.) from Spectral Data

Author(s): J. Birgitta Martinkauppi, Jukka Kekäläinen, Yevgeniya Shatilova, Jussi Parkkinen

The most striking feature of Arctic Charr (Salvelinus alpinus L.) is the red abdomen area during the mating season. This colouration is assumed to be related to the vitality, nutritional status, foraging ability and general health of the fish, which is important knowledge for fisheries and researchers. The colouration should be assessed numerically, and the amount of pigment (carotenoid) causing the colour should be known for quality evaluation of the fish. The carotenoid amount in particular is thought to be directly connected to the investment of the individual, since carotenoids are energetically costly. To assess this amount, we investigate the relationship between chemical and spectral data. We also tested a simple model for approximating carotenoid content from spectral measurements. The preliminary results indicate a reasonably good correlation between these two kinds of data.

Paper 133: Highlight Removal from Single Image

Author(s): Pesal Koirala, Markku Hauta-Kasari, Jussi Parkkinen

A method for removing highlights from a single image without knowledge of the illuminant is presented. The method is based on Principal Component Analysis (PCA), histogram equalization and a second order polynomial transformation. It requires neither color segmentation nor normalization of the image by the illuminant. The method has been tested on different types of images: images with or without texture, and images taken in different unknown lighting environments. The results show the feasibility of the method. Implementation of the method is straightforward and computationally fast.

Paper 134: Shape Recognition by Voting on Fast Marching Iterations

Author(s): Abdulkerim Çapar, Muhittin Gökmen

In this study, we present a Fast Marching (FM) - Shape Description integrated methodology that is capable of both extracting object boundaries and recognizing shapes. A new speed formula is presented, and the local front stopping algorithm in [1] is enhanced to freeze the active contour near real object boundaries. GBSD [2] is utilized as the shape descriptor on the evolving contour. The shape description process starts when a certain portion of the contour has stopped and continues with the FM iterations. The shape description at each iteration is treated as a different source of shape information, and these sources are fused to get better recognition results. This approach removes the limitation of traditional recognition systems that have only one chance for shape classification. Test results shown in this study demonstrate that the voted decision among these iterated contours outperforms the ordinary individual shape recognizers.
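The idea of voting over the decisions made at successive iterations can be sketched as follows. The abstract does not specify the fusion rule, so this example assumes a plain majority vote; the function name and the label sequence are hypothetical.

```python
from collections import Counter

def vote(labels):
    """Return the most frequent label and the share of votes it received."""
    if not labels:
        raise ValueError("at least one label is required")
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)

# Hypothetical class decisions produced at five successive contour iterations.
per_iteration_labels = ["square", "circle", "circle", "circle", "ellipse"]
winner, share = vote(per_iteration_labels)
```

A recognizer that classified only the first contour would answer "square" here, whereas the vote over all iterations returns "circle". With `Counter.most_common`, ties are resolved in favour of the label that appeared first.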

Paper 135: Unusual Activity Recognition in Noisy Environments

Author(s): Matti Matilainen, Mark Barnard, Olli Silvén

In this paper we present a method for unusual activity recognition that is used in home environment monitoring. Monitoring systems are needed in elderly persons' homes to generate automatic alarms in case of emergency. The unusual activity recognition method presented here is based on a body part segmentation algorithm that gives an estimate of how similar the current pose is to the poses in the training data. As there is an arbitrary number of possible unusual activities, it is impossible to train a system to recognize every unusual activity. We train our system to recognize a set of normal poses and consider everything else unusual. Normal activities in our case are walking and sitting down.

Paper 137: Person's Recognition Using Palmprint Based On 2D Gabor Filter Response

Author(s): Abdallah Meraoumia, Salim Chitroub, Mohamed Saigaa

Palmprint recognition is very important in automatic personal identification. The objective of this study is to develop an efficient prototype system for automatic personal identification using palmprint technology. In this work, a new texture feature based on the Gabor filter is proposed. First, the region of interest is filtered by a 2D Gabor filter. Then the principal lines, wrinkles, and ridges are extracted using simple thresholding. Finally, the candidate is found by a matching process. We have tested our algorithm on several images taken from a palmprint database collected by Hong Kong Polytechnic University. The results obtained show that the designed system achieves an acceptable level of performance.

Paper 139: Estimating Color Signal at Different Correlated Color Temperature of Daylight

Author(s): Paras Pant, Pesal Koirala, Markku Hauta-Kasari, Jussi Parkkinen

The color signal changes with changes in the illuminant. This study focuses on estimating color signals at different Correlated Color Temperatures (CCT) of daylight. We selected a set of color signals at different CCTs of daylight for estimation. An experiment was conducted by generating color signals from 24 color samples of the Macbeth Color Checker and 1645 daylight spectral power distributions (SPD), where the CCT ranges from 3757K to 28322K. By uniform sampling of these, we collected 84 color signals from each color sample and combined them to form a training dataset. Principal Component Analysis (PCA) has been applied to the selected training dataset to find the basis vectors and the number of color signals needed for estimation. We apply Wiener estimation with polynomials of different orders to estimate the color signals of the color samples. Interestingly, a good estimate of all 1645 color signals of a given color sample from the Macbeth color chart is obtained by selecting the five best CCT color signals of that color sample in association with its third order polynomial. However, higher order polynomials lead to significant errors in the Wiener estimation.

Paper 143: Parallel Region-Based Level Set Method with Displacement Correction for Tracking a Single Moving Object

Author(s): Xianfeng Fei, Yasunobu Igarashi, Koichi Hashimoto

We propose a parallel level set method with displacement correction (DC) to solve collision problems while tracking a single moving object. The major collision scenarios are that the target cell collides with other cells, air bubbles, or a wall of the water pool in which the cells swim. These collisions cause the detected contour of the target to spread to the obstacles, which leads to loss of the target and tracking failure. To overcome this problem, we add displacement correction to the boundary detection procedure once a collision occurs. The sum of the intensities inside the detected contour is used to determine whether a collision has occurred. After a collision is detected, we translate the current level set function according to the displacement information of the target cell. To demonstrate the ability of our proposed method, we perform cell (paramecium) tracking with visual feedback control to keep the target cell at the center of the field of view under a microscope. To reduce computational time, we implement our proposed method in a column parallel vision (CPV) system. We experimentally show that the combination of our proposed method and the CPV system can detect the boundary of the target cell within about 2 ms for each frame and robustly track the cell even when a collision occurs.

Paper 145: Comparing Feature Matching for Object Categorization in Video Surveillance

Author(s): Rob Wijnhoven, Peter de With

In this paper we consider an object categorization system using local HMAX features. Two feature matching techniques are compared: the MAX technique, originally proposed in the HMAX framework, and the histogram technique originating from the Bag-of-Words literature. We have found that each of these techniques has its own field of operation. The histogram technique clearly outperforms the MAX technique by 5-15% for small dictionaries of up to 500-1,000 features, favoring this technique for embedded (surveillance) applications. Additionally, we have evaluated the influence of interest point operators in the system. A first experiment analyzes the effect of dictionary creation and has shown that random dictionaries outperform dictionaries created from Hessian-Laplace points. Secondly, the effect of operators in the dictionary matching stage has been evaluated. Processing all image points outperforms the point selection from the Hessian-Laplace operator.

Paper 146: Kolmogorov Superposition Theorem and Wavelet Decomposition for Image Compression

Author(s): Pierre-Emmanuel Leni, Yohan Fougerolle, Frédéric Truchetet

The Kolmogorov Superposition Theorem (KST) states that any multivariate function can be decomposed into two types of monovariate functions that are called inner and external functions. One inner function is associated with each dimension. Inner functions are combined to build a hash function that associates every point of a multidimensional space with a value of the real interval [0,1]. These intermediate values are then associated by external functions with the corresponding value of the multidimensional function.
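For reference, the classical form of the theorem for a continuous function of d variables on the unit hypercube can be written as below. The symbols g_n, psi and xi_n are notation assumed here, not taken from the paper, and the variant actually used by the authors (following Igelnik) may parameterise the inner functions differently.

```latex
% Classical Kolmogorov superposition for continuous f on [0,1]^d.
% g_n: external functions, \psi_{n,i}: inner functions (both monovariate).
f(x_1, \dots, x_d) \;=\; \sum_{n=0}^{2d} g_n\!\bigl(\xi_n(x_1, \dots, x_d)\bigr),
\qquad
\xi_n(x_1, \dots, x_d) \;=\; \sum_{i=1}^{d} \psi_{n,i}(x_i)
```

Each xi_n plays the role of the hash function described above: it maps a point of the multidimensional space to a single real value, which the external function g_n then maps to a contribution to the value of f.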

We present in this paper a technique to adapt the inner and external functions to achieve image compression. More precisely, we propose a new algorithm to decompose images (discrete data) into sums and compositions of monovariate functions (continuous representation), and we present a compression approach based on the approximation and simplification of this functional representation of an image. Indeed, due to the decomposition scheme, the quantity of information required to build the monovariate functions can be adapted. This implies that only a fraction of the pixels of the original image has to be contained in the network used to build the correspondence between the monovariate functions. Furthermore, to improve the reconstruction quality of our compression technique, we combine KST and multiresolution approach, where the low frequencies will be represented with the highest accuracy, and the high frequency representation will benefit from the adaptive aspect of our method.

Our main contributions are the adaptation of the approximating algorithm proposed by Igelnik for the representation of images, and the combination of KST approximation and multiresolution for a new compression technique. We present our results for various images and compare our approach to standard techniques, illustrated by curves representing the PSNR with respect to the percentage of pixels used to build the internal and external functions.

Paper 148: Image Categorization Using ESFS: A New Embedded Feature Selection Method Based on SFS

Author(s): Huanzhang Fu, Zhongzhe Xiao, Emmanuel Dellandréa, Weibei Dou, Liming Chen

Feature subset selection is an important subject when training classifiers in Machine Learning (ML) problems. Too many input features in an ML problem may lead to the so-called "curse of dimensionality", which describes the fact that the complexity of adjusting the classifier parameters during training increases exponentially with the number of features. Thus, ML algorithms are known to suffer from a significant decrease in prediction accuracy when faced with many features that are not necessary. In this paper, we introduce a novel embedded feature selection method, called ESFS, which is inspired by the wrapper method SFS since it relies on the simple principle of incrementally adding the most relevant features. Its originality lies in the use of mass functions from evidence theory, which allow the information carried by the features to be merged elegantly in an embedded way, leading to a lower computational cost than the original SFS. This approach has been successfully applied to the domain of image categorization and has shown its effectiveness in comparison with other feature selection methods.
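The principle of incrementally adding the most relevant feature, on which SFS rests, can be sketched as follows. This is plain sequential forward selection with a caller-supplied scoring function, not the ESFS method of the paper: the mass functions from evidence theory are not modelled, and the function name, the `score` callback and the toy weights are assumptions made for illustration.

```python
def sequential_forward_selection(features, score, max_features):
    """Greedy SFS: repeatedly add the feature that most improves `score`.

    `score` is a callable that takes a list of feature names and returns a
    number to maximise (for a wrapper method, typically the validation
    accuracy of a classifier trained on those features).
    """
    selected = []
    remaining = list(features)
    best_score = float("-inf")
    while remaining and len(selected) < max_features:
        # Evaluate every candidate extension of the current subset.
        candidates = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(candidates, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score

# Toy scoring function: each feature contributes a fixed (hypothetical) weight.
weights = {"hue": 5, "texture": 3, "noise": -1}

def toy_score(subset):
    return sum(weights[name] for name in subset)

chosen, final_score = sequential_forward_selection(
    ["noise", "texture", "hue"], toy_score, max_features=3)
```

In this toy run "hue" is added first, then "texture", and the search stops because adding "noise" would lower the score. In a real wrapper setting every call to `score` trains and evaluates a classifier, which is the cost that an embedded method aims to reduce.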

Paper 149: Pattern Analysis for an Automatic and Low-cost 3D Face Acquisition Technique

Author(s): Karima Ouji, Mohsen Ardabilian, Liming Chen, Faouzi Ghorbel

This paper proposes an automatic 3D face modeling and localizing technique, based on active stereovision. In the offline stage, the optical and geometrical parameters of the stereosensor are estimated. In the online acquisition stage, alternate complementary patterns are successively projected. The captured right and left images are separately analyzed in order to localize left and right primitives with sub-pixel precision. This analysis also provides us with an efficient segmentation of the informative facial region. Epipolar geometry transforms a stereo matching problem into a one-dimensional search problem. Indeed, we employ an adapted, optimized dynamic programming algorithm to pairs of primitives which are already located in each epiline. 3D geometry is retrieved by computing the intersection of optical rays coming from the pair of matched features. A pipeline of geometric modeling techniques is applied to densify the obtained 3D point cloud, and to mesh and texturize the 3D final face model. An appropriate evaluation strategy is proposed and experimental results are provided.

Paper 153: Convex Hull-based Feature Selection in Application to Classification of Wireless Capsule Endoscopic Images

Author(s): Piotr Szczypinski, Artur Klepaczko

In this paper we propose and examine a Vector Supported Convex Hull method for feature subset selection. Within feature subspaces, the method checks locations of vectors belonging to one class with respect to the convex hull of vectors belonging to the other class. Based on such analysis a coefficient is proposed for evaluation of subspace discrimination ability. The method allows for finding subspaces in which vectors of one class cluster and they are surrounded by vectors of the other class. The method is applied for selection of color and texture descriptors of capsule endoscope images. The study aims at finding a small set of descriptors for detection of pathological changes in the gastrointestinal tract. The results obtained by means of the Vector Supported Convex Hull are compared with results produced by a Support Vector Machine with the radial basis function kernel.
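The geometric test at the heart of this abstract, checking where the vectors of one class lie relative to the convex hull of the other, can be illustrated in two dimensions. The sketch below is a toy under stated assumptions: it uses the standard monotone chain hull algorithm on a 2D feature subspace and simply counts enclosed points, whereas the paper's discrimination coefficient and its handling of higher-dimensional subspaces are not reproduced. All names and data are hypothetical.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Monotone chain convex hull; returns vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def inside_hull(hull, p):
    """True if p lies inside or on the boundary of a counter-clockwise hull."""
    n = len(hull)
    if n < 3:
        return False  # degenerate hull: treated as enclosing nothing
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))

# Hypothetical 2D feature vectors of two classes.
class_a = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
class_b = [(1, 1), (3, 3), (0.5, 1.5), (-1, 0)]

hull_a = convex_hull(class_a)
enclosed = sum(inside_hull(hull_a, p) for p in class_b)
```

Here two of the four class B vectors fall inside the hull of class A. A subspace in which few or no vectors of the other class are enclosed would be a better candidate for discrimination.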

Paper 155: Parallel Blob Extraction Using The Multi-core Cell Processor

Author(s): Praveen Kumar, Kannappan Palaniappan, Ankush Mittal, Guna Seetharaman

Rapid increases in the pixel density and frame rates of modern imaging sensors have produced an accelerated demand for fine-grained and embedded parallelization strategies in real-time implementations of video processing algorithms. Multicore architectures like the Cell Broadband Engine (BE) provide an appealing platform for accelerating multimedia processing. However, the potential benefits of these multicore processors can only be harnessed efficiently by developing novel parallelization strategies and algorithmic modifications. This paper describes a parallel implementation of video object detection algorithms such as flux tensor motion estimation, morphological operations and Connected Component Labeling (CCL), optimized for execution on the Cell BE. Novel parallelization and explicit instruction-level optimization techniques are described for fully exploiting the computational capacity of the Synergistic Processing Elements (SPEs) on the Cell processor. Experimental results show significant speedups ranging from a factor of nearly 300 for binary morphology to a factor of nearly 48 for flux tensor and 8 for CCL in comparison to equivalent sequential implementations applied to High Definition (HD) video.

Paper 156: Intelligent Vision: A First Step - Real Time Stereovision

Author(s): John Morris, Khurram Jawed, Georgy Gimel'farb

We describe a real-time stereo vision system capable of processing high-resolution (1 Mpixel or more) images at 30 fps with disparity ranges of 100 or more. This system comprises a fast rectification module associated with each camera, which uses a look-up table approach to remove lens distortion and correct camera misalignment in a single step. The corrected, aligned images are passed through a module which generates disparity and occlusion maps with a latency of two camera scan line intervals. This module implements a version of the Symmetric Dynamic Programming Stereo (SDPS) algorithm which has a small, compact hardware realization, permitting many copies to be instantiated to accommodate large disparity ranges. Snapshots from videos taken in our laboratory demonstrate that the system can produce effective depth maps in real time. The occlusion maps that the SDPS algorithm produces clearly outline distinct objects in scenes and present a powerful tool for segmenting scenes rapidly into objects of interest.

Keywords: Stereovision, real time, dynamic programming

Paper 157: Level Set-Based Fast Multi-Phase Graph Partitioning Active Contours Using Constant Memory

Author(s): Filiz Bunyak, Kannappan Palaniappan

We present multi-phase FastGPAC, which extends our dramatic improvement of memory requirements and computational complexity for two-class GPAC to multi-class image segmentation. Graph partitioning active contours (GPAC) is a recently introduced approach that elegantly embeds the graph-based image segmentation problem within a continuous level set-based active contour paradigm. However, GPAC, like many other graph-based approaches, has quadratic memory requirements. For example, a 1024x1024 grayscale image requires over one terabyte of working memory. Approximations of GPAC reduce this complexity by trading off accuracy. Our FastGPAC approach implements an exact GPAC segmentation using a constant memory requirement of a few kilobytes and enables the use of GPAC on high-throughput and high-resolution images. Extension to multi-phase enables segmentation of multiple regions of interest with different appearances. We have successfully applied FastGPAC to different types of images, particularly biomedical images of different modalities. Experiments on various image types (natural, biomedical, etc.) show promising segmentation results with substantially reduced computational requirements.
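The terabyte figure can be checked by simple counting. The sketch below assumes one byte per pairwise entry; that storage size is an assumption made here for illustration and is not stated in the abstract.

```latex
N = 1024 \times 1024 = 2^{20} \ \text{pixels}, \qquad
N^{2} = 2^{40} \ \text{pairwise entries}
\;\Rightarrow\; 2^{40}\ \text{bytes} \approx 1.1 \times 10^{12}\ \text{bytes}.
```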

Paper 159: Rapid Detection of Many Object Instances

Author(s): Suwan Tongphu, Naddao Thongsak, Matthew Dailey

We describe an algorithm capable of detecting multiple object instances within a scene in the presence of changes in object scale and orientation. Our approach consists of first calculating frequency vectors for discrete feature vector clusters (visual words) within a sliding window as a representation of the image patch. We then classify each patch using an AdaBoost cascade whose weak classifier simply applies a threshold to one visual word's frequency within the patch. Compared to previous work, our algorithm is simpler yet performs remarkably well on scenes containing many object instances. The method requires relatively few training examples and consumes 2.2 seconds on commodity hardware to process an image of size 640×480. In a test on a challenging car detection problem using a relatively small training set, our implementation dramatically outperforms the detection performance of a standard AdaBoost cascade using Haar-like features.
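The weak classifier described above can be pictured with a minimal Python sketch. The function names, the four-word vocabulary, and the stump parameters are all hypothetical values chosen for illustration; the cascade structure and the training of the stumps are not shown.

```python
from collections import Counter

def word_frequencies(word_ids, vocab_size):
    """Normalised histogram of visual-word indices observed in one window."""
    counts = Counter(word_ids)
    total = max(len(word_ids), 1)
    return [counts.get(w, 0) / total for w in range(vocab_size)]

def stump(freqs, word, threshold, polarity=1):
    """Weak classifier: thresholds the frequency of a single visual word."""
    return 1 if polarity * freqs[word] > polarity * threshold else -1

def boosted_score(freqs, stumps):
    """Weighted vote; each stump is (alpha, word, threshold, polarity)."""
    return sum(a * stump(freqs, w, t, p) for a, w, t, p in stumps)

# Hypothetical window: visual-word index of each local feature found in it.
patch_words = [0, 2, 2, 3, 2, 1, 2, 0]
freqs = word_frequencies(patch_words, vocab_size=4)

# Hypothetical stage with two stumps (weights would come from AdaBoost training).
stumps = [(0.8, 2, 0.3, 1), (0.4, 3, 0.5, -1)]
score = boosted_score(freqs, stumps)
is_object = score > 0
```

Each stump inspects exactly one histogram bin, so evaluating a stage costs only a handful of comparisons per window.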

Paper 160: Radar Imager for Perception and Mapping in Outdoor Environments

Author(s): Raphaël Rouveure, Patrice Faure, Marie-Odile Monod

Perception remains a challenge in outdoor environments. Microwave radar presents considerable potential for overcoming the limitations of vision-based sensors. Such a sensor, called K2Pi, has been designed for environment mapping. In order to build radar maps, an algorithm named R SLAM has been developed. The global radar map is constructed through a data merging process, using map matching of successive radar image sequences. An occupancy grid approach is used to describe the environment. First results obtained in urban and natural environments are presented, which show the ability of the microwave radar to deal with extended environments.

Paper 162: Quality of Reconstructed Spectrum for Watermarked Spectral Images Subject to Various Illumination Conditions

Author(s): Konstantin Krasavin, Jussi Parkkinen, Aarto Kaarna, Timo Jaaskelainen

Digital imaging continues to expand into various applications. Spectral images are becoming more popular as one field of digital imaging. In this study we utilize a watermarking method for spectral images, based on the three-dimensional wavelet transform. We study the influence of the watermarking process on illuminated watermarked images. In particular, we focus on how watermarking affects the spectrum of restored images. The experiments were performed on a large dataset of 58 spectral images. The experiments indicate that, using the proposed watermarking method, the quality of the reconstructed image depends more on illumination and the embedding strength controller than on compression, with respect to L*a*b* color difference.

Paper 163: Attributed Graph Matching Using Local Descriptions

Author(s): Salim Jouili, Ines Mili, Salvatore Tabbone

In the pattern recognition context, objects can be represented as graphs with attributed nodes and edges involving their relations. Consequently, matching attributed graphs plays an important role in object recognition. In this paper, node signature extraction is combined with an optimal assignment method for matching attributed graphs. In particular, we show how local descriptions are used to define a node-to-node cost in an assignment problem using the Hungarian method. Moreover, we propose a distance formula to compute the distance between attributed graphs. The experiments demonstrate that the newly presented algorithm is well-suited to pattern recognition applications. Compared with well-known methods, our algorithm gives good results for retrieving images.
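The node-to-node assignment idea can be illustrated with a small Python sketch. It is not the authors' method: the signatures (a degree and one attribute per node) and the L1 cost are hypothetical, and an exhaustive search over permutations stands in for the Hungarian method, which finds the same optimum in polynomial time.

```python
from itertools import permutations

def node_cost(sig_a, sig_b):
    """L1 distance between two node signatures (equal-length tuples)."""
    return sum(abs(x - y) for x, y in zip(sig_a, sig_b))

def best_assignment(sigs_a, sigs_b):
    """Minimum-cost one-to-one assignment by brute force.

    Assumes both graphs have the same number of nodes; only practical
    for very small graphs, unlike the Hungarian method.
    """
    n = len(sigs_a)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        cost = sum(node_cost(sigs_a[i], sigs_b[perm[i]]) for i in range(n))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost

# Hypothetical signatures: (node degree, node attribute).
graph_a = [(3, 1.0), (2, 0.5), (1, 0.2)]
graph_b = [(1, 0.2), (3, 0.9), (2, 0.5)]
perm, cost = best_assignment(graph_a, graph_b)  # perm[i] = node of graph_b matched to node i
```

The total cost of the optimal assignment can then serve as one ingredient of a graph-to-graph distance.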

Paper 164: Decorrelation and Distinctiveness Provide with Human-like Saliency

Author(s): Antón Garcia-Diaz, Xosé Fdez-Vidal, Xosé Pardo, Raquel Dosil

In this work, we show the capability of a new saliency model to reproduce remarkable psychophysical results. The model presents low computational complexity compared to other state-of-the-art models. It is based on biologically plausible mechanisms: the decorrelation and the distinctiveness of local responses. Decorrelation of scales is obtained from principal component analysis of multiscale low-level features. Distinctiveness is measured through Hotelling's T² statistic. The model is conceived to be used in a machine vision system, in which attention would contribute to enhance performance together with other visual functions. Experiments demonstrate the consistency with a wide variety of psychophysical phenomena that are referenced in the visual attention modeling literature, with results that outperform other state-of-the-art models.

Paper 165: Evaluation of Interest Point Detectors for Non-Planar, Transparent Scenes

Author(s): Chrysi Papalazarou, Peter Rongen, Peter de With

The detection of stable, distinctive and rich feature point sets has been an active area of research in the field of video and image analysis. Transparency imaging, such as X-ray, has also benefited from this research. However, an evaluation of the performance of the various available detectors for this type of image is lacking. The differences with natural imaging stem not only from the transparency but, in the case of medical X-ray, also from the non-planarity of the scenes, a factor that complicates the evaluation. In this paper, a method is proposed to perform this evaluation on non-planar, calibrated X-ray images. The repeatability and accuracy of nine interest point detectors are demonstrated on phantom and clinical images. The evaluation has shown that the Laplacian-of-Gaussian and Harris-Laplace detectors show the best overall performance for the datasets used.

Paper 166: Vehicle Tracking using Geometric Features

Author(s): Francis Deboeverie, Kristof Teelen, Peter Veelaert, Wilfried Philips

Applications such as traffic surveillance require a real-time and accurate method for object tracking. We propose to represent scene observations with parabola segments with an algorithm that allows us to fit parabola segments in real-time to edge pixels. The motion vectors for these parabola segments are obtained in consecutive frames by a matching technique based on distance and intensity. Furthermore, moving rigid objects are detected by an original method that clusters comparable motion vectors. The result is a robust detection and tracking method, which can cope with small changes in viewpoint on the moving rigid object.

Paper 167: A Template Analysis Methodology to Improve the Efficiency of Fast Matching Algorithms

Author(s): Federico Tombari, Stefano Mattoccia, Luigi Di Stefano, Fabio Regoli, Riccardo Viti

Several methods aimed at effectively speeding up the block matching and template matching tasks have been recently proposed. A class of these methods, referred to as exhaustive due to the fact that they optimally solve the minimization problem of the matching cost, often deploys a succession of bounding functions based on a partitioning of the template and subwindow to perform rapid and reliable detection of non-optimal candidates. In this paper we propose a study aimed at improving the efficiency of one of these methods, that is, a state-of-the-art template matching technique known as Incremental Dissimilarity Approximations (IDA). In particular, we outline a methodology to order the succession of bounding functions deployed by this technique based on the analysis of the template only. Experimental results prove that the proposed approach is able to achieve improved efficiency.
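The role of a bounding function in exhaustive matching can be sketched in Python. This is an illustration in the spirit of partition-based bounds, not the IDA algorithm itself: it uses the L1 norm, a fixed partition into four equal parts, and made-up data. By the triangle inequality, the absolute difference of the partial norms never exceeds the partial dissimilarity, so the summed bound never exceeds the true cost.

```python
def sad(a, b):
    """Sum of absolute differences (L1 dissimilarity)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def l1_norm(v):
    return sum(abs(x) for x in v)

def partition(v, parts):
    """Split v into `parts` contiguous chunks (the last takes any remainder)."""
    size = len(v) // parts
    chunks = [v[i * size:(i + 1) * size] for i in range(parts - 1)]
    return chunks + [v[(parts - 1) * size:]]

def lower_bound(template, window, parts):
    """Sum over chunks of | ||t|| - ||w|| |, which is <= SAD(template, window)."""
    return sum(abs(l1_norm(t) - l1_norm(w))
               for t, w in zip(partition(template, parts), partition(window, parts)))

def match(template, windows, parts=4):
    """Exhaustive-equivalent search: skip a window only if it cannot improve the best cost."""
    best_idx, best_cost, skipped = None, float("inf"), 0
    for idx, window in enumerate(windows):
        if lower_bound(template, window, parts) >= best_cost:
            skipped += 1          # pruned without computing the full SAD
            continue
        cost = sad(template, window)
        if cost < best_cost:
            best_idx, best_cost = idx, cost
    return best_idx, best_cost, skipped

template = [10, 12, 50, 52, 11, 9, 48, 51]
windows = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [10, 12, 50, 50, 11, 9, 48, 51],
    [90, 90, 90, 90, 90, 90, 90, 90],
    [10, 12, 50, 52, 11, 9, 48, 51],
]
result = match(template, windows)
```

Because a window is discarded only when its bound already reaches the current best cost, the search returns the same minimum as a full scan.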

Paper 168: On the Evaluation of Segmentation Methods for Wildland Fire

Author(s): Steve Rudz, Khaled Chetehouna, Adel Hafiane, Olivier Sero-guillaume, Hélène Laurent

This paper focuses on the study of fire color spaces and the evaluation of image segmentation methods commonly available in the literature of wildland and urban fires. The evaluation method, based on the determination of a segmentation quality index, is applied on three series of fire images obtained at the usual scales of validation of forest fire models (laboratory scale, fire tunnel scale and field scale). Depending on the considered scale, different methods reveal themselves as being the most appropriate. In this study we present the advantages and drawbacks of different segmentation algorithms and color spaces used in fire detection and characterization.

Paper 169: Enhanced Low-Resolution Pruning for Fast Full-Search Template Matching

Author(s): Stefano Mattoccia, Federico Tombari, Luigi Di Stefano

Gharavi-Alkhansari proposed a full-search equivalent algorithm for speeding-up template matching based on L_p-norm distance measures. This algorithm performs a pruning of mismatching candidates based on multilevel pruning conditions and it has been shown that, under certain assumptions on the distortion between the image and the template, it is faster than the other full-search equivalent algorithms proposed so far, including algorithms based on the Fast Fourier Transform. In this paper we propose an original contribution with respect to Gharavi-Alkhansari's work that is based on the exploitation of an initial estimation of the global minimum aimed at increasing the efficiency of the pruning process.

Paper 170: Parameter Estimation in Bayesian Super-Resolution Image Reconstruction From Low Resolution Rotated and Translated Images

Author(s): Salvador Villena, Miguel Vega, Rafael Molina, Aggelos Katsaggelos

This paper deals with the problem of high-resolution (HR) image reconstruction, from a set of degraded, under-sampled, shifted and rotated images, utilizing the variational approximation within the Bayesian paradigm. The proposed inference procedure requires the calculation of the covariance matrix of the HR image given the LR observations and the unknown hyperparameters of the probabilistic model. Unfortunately the size and complexity of such matrix renders its calculation impossible, and we propose and compare three alternative approximations. The estimated HR images are compared with images provided by other HR reconstruction methods.

Paper 173: Mixtures of Normalized Linear Projections

Author(s): Ahmed Otoom, Oscar Perez Concha, Hatice Gunes, Massimo Piccardi

High dimensional spaces pose a challenge to any classification task. In fact, these spaces contain much redundancy and it becomes crucial to reduce the dimensionality of the data to improve analysis, density modeling, and classification. In this paper, we present a method for dimensionality reduction in mixture models and its use in classification. For each component of the mixture, the data are projected by a linear transformation onto a lower-dimensional space. Subsequently, the projection matrices and the densities in such compressed spaces are learned by means of an Expectation Maximization (EM) algorithm. However, two main issues arise as a result of implementing this approach, namely: 1) the scale of the densities can be different across the mixture components and 2) a singularity problem may occur. We suggest solutions to these problems and validate the proposed method on three image data sets from the UCI Machine Learning Repository. The classification performance is compared with that of a mixture of probabilistic principal component analysers (MPPCA). Across the three data sets, our accuracy always compares favourably, with improvements ranging from 2.5% to 35.4%.

Paper 174: A New Approach to Sparse Image Representation Using MMV and K-SVD

Author(s): Jie Yang, Abdesselam Bouzerdoum, Phung Son

This paper addresses the problem of image representation based on a sparse decomposition over a learned dictionary. We propose an improved matching pursuit algorithm for Multiple Measurement Vectors (MMV) and an adaptive algorithm for dictionary learning based on multi-Singular Value Decomposition (SVD), and combine them for image representation. Compared with the traditional K-SVD and orthogonal matching pursuit MMV (OMPMMV) methods, the proposed method runs faster and achieves a higher overall reconstruction accuracy.

Paper 175: Supervised Face Recognition for Railway Stations Surveillance

Author(s): Maria Asuncion Vicente, Cesar Fernandez, Angela Coves

The feasibility of a supervised surveillance system for railway stations (or airports) is evaluated. Surveillance is based on suspicious recognition by means of video cameras. As the problem involves both face detection and face recognition, we have evaluated the best performing algorithms of these two areas. For face detection, we have selected the Viola-Jones algorithm; while for face recognition we have performed tests with an appearance based algorithm (PCA) and an interest-point based algorithm (SIFT).

We have used both the AT&T database and the LFW database for our tests. The results obtained show that face detection works reliably and fast enough, but face recognition cannot cope with highly non-homogeneous images like those of LFW and requires parallel computing in order to work in real time.

In conclusion, supervised surveillance is feasible provided image homogeneity fulfils some minimum standards and parallel computing is used. Besides, interest-point based methods are more robust to image quality, so their use is encouraged.

Paper 176: Retina Identification Based on the Pattern of Blood Vessels Using Angular and Radial Partitioning

Author(s): Mehran Deljavan Amiri, Fardin Akhlaqian Tab, Wafa Barkhoda

This paper presents a new human identification system based on features obtained from retina images using angular and radial partitioning of the images. The proposed algorithm is composed of two principal stages including feature extraction and decision making. In the feature extraction stage, first all of the images are normalized in a preprocessing step. Then, the blood vessels' pattern is extracted from retina images and a morphological thinning process is applied on the extracted pattern. After thinning, two feature vectors based on the angular and radial partitioning of the pattern image are extracted from the blood vessels' pattern. The extracted features are rotation and scale invariant and robust against translation. In the next stage, the extracted feature vectors are analyzed using 1D discrete Fourier transform and the Manhattan metric is used to measure the closeness of the feature vectors to have a compression on them. Experimental results on a database, including 360 retina images obtained from 40 subjects, demonstrated an average true identification accuracy rate equal to 98.75 percent for the proposed system.
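One way to see why a 1D discrete Fourier transform helps with rotation is sketched below in Python. The sketch assumes, hypothetically, that an angular feature vector holds one count per angular sector, so that rotating the image circularly shifts the vector; the DFT magnitudes are unchanged by such a shift and can be compared with the Manhattan metric. The sample vectors are made up.

```python
import cmath

def dft_magnitudes(values):
    """Magnitudes of the 1D discrete Fourier transform of a real sequence."""
    n = len(values)
    return [abs(sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, v in enumerate(values)))
            for k in range(n)]

def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical angular feature vector: one count per angular sector.
angular = [5, 9, 2, 7, 4, 8, 1, 6]
rotated = angular[3:] + angular[:3]      # same pattern, rotated by three sectors
other = [5, 2, 9, 7, 8, 4, 1, 6]         # a different pattern

d_same = manhattan(dft_magnitudes(angular), dft_magnitudes(rotated))
d_other = manhattan(dft_magnitudes(angular), dft_magnitudes(other))
```

Here `d_same` is zero up to rounding error, while `d_other` is clearly positive.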

Paper 177: Object Tracking by Non-Overlapping Distributed Camera Network

Author(s): Pier Luigi Mazzeo, Paolo Spagnolo, Tiziana D'Orazio

People tracking is a problem of great interest for wide-area video surveillance systems. In these large areas, it is not possible for a single camera to observe the complete area of interest. Surveillance system architectures require algorithms with the ability to track objects while observing them through multiple cameras. We focus our work on multi-camera tracking with non-overlapping fields of view (FOV). In particular, we propose a multi-camera architecture for wide-area surveillance and a real-time people tracking algorithm across non-overlapping cameras. In this scenario it is necessary to track objects in both intra-camera and inter-camera FOV. We consider these problems in this paper. In particular, we have investigated different techniques to evaluate intra-camera and inter-camera tracking based on color histograms. For intra-camera tracking, we have proposed different methodologies to extract the color histogram information from each object patch. For inter-camera tracking, we have compared different methods to evaluate the colour Brightness Transfer Function (BTF) between non-overlapping cameras. These approaches are based on color histogram mapping between pairs of images of the same object in different FOVs. We have therefore combined different methodologies for calculating the color histogram in order to estimate the performance of different colour BTFs. Preliminary results demonstrate that the proposed method combined with BTF improves the matching rate between different cameras.

Paper 178: 3D Filtering of Colour Video Sequences Using Fuzzy Logic and Vector Order Statistics

Author(s): Volodymyr Ponomaryov, Alberto Rosales-Silva, Francisco Gallegos-Funes

The novel approach presented in this paper permits the suppression of impulsive noise in multichannel video sequences. It employs fuzzy logic and vector order statistic methods to detect motion and noise presence during spatio-temporal processing of neighbouring video frames, preserving edges and fine details as well as colour properties. Numerous simulation results have justified its excellent performance in terms of objective criteria, namely Peak Signal-to-Noise Ratio (PSNR), Mean Absolute Error (MAE) and Normalized Colour Difference (NCD), as well as in subjective perception by human viewers.
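Two of the objective criteria named above have standard definitions that are easy to state in code. The Python sketch below assumes 8-bit samples (peak value 255) stored as flat lists; the sample values are made up, and NCD is omitted because it requires a conversion to a perceptual colour space.

```python
import math

def mae(reference, test):
    """Mean Absolute Error between two equal-length sample sequences."""
    return sum(abs(r - t) for r, t in zip(reference, test)) / len(reference)

def psnr(reference, test, peak=255):
    """Peak Signal-to-Noise Ratio in dB; infinite when the sequences are identical."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Hypothetical flattened colour samples of an original and a filtered frame.
reference = [100, 120, 130, 90, 60, 200]
filtered = [101, 118, 130, 92, 60, 199]

error = mae(reference, filtered)
quality_db = psnr(reference, filtered)
```

A lower MAE and a higher PSNR both indicate that the filtered frame is closer to the reference.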

Paper 182: Local Color Descriptor for Object Recognition Across Illumination Changes

Author(s): Xiaohu Song, Damien Muselet, Alain Trémeau

In the context of object recognition, it is useful to extract, from the images, efficient local descriptors that are insensitive to the illumination conditions, to the camera scale factor and to the position and orientation of the object. In this paper, we propose to cope with this invariance problem by applying a spatial transformation to the local regions around detected key points. The new position of each pixel after this local spatial transformation is evaluated according to both the colors and the relative positions of all the pixels in the original local region. The descriptor of the considered local region is the set of the new positions of three particular pixels in this region. The invariance and the discriminating power of our local descriptor are assessed on a public database.

Paper 183: 2D Face Recognition in the IV2 Evaluation Campaign

Author(s): Anouar Mellakh, Anis Chaari, Souhila Guerfi, Johan Dhose, Dijana Petrovska, Sylvie Lelandais, Joseph Colineau, Bernadette Dorizzi

In this paper, the first evaluation campaign on 2D-face images using the multimodal IV2 database is presented. The five appearance-based algorithms in competition are evaluated on four experimental protocols, including experiments with challenging illumination and pose variabilities. The results confirm the advantages of the Linear Discriminant Analysis (LDA) and the importance of the training set for the Principal Component Analysis (PCA) based approaches. The experiments show the robustness of the Gabor based approach combined with LDA, in order to cope with challenging face recognition conditions. This evaluation shows the interest and the richness of the IV2 multimodal database.

Paper 187: A Performance Comparison of De-Convolution Algorithms on Transmission Terahertz Images

Author(s): Yue Li, Li Li, Juan Tello, Dan Popescu, Andrew Hellicar

Terahertz imaging has found applications in many fields. To explore these applications, we have built a coherent transmission terahertz imaging system at 186 GHz. De-convolution algorithms were tested for improving the image resolution beyond the diffraction limit of the imaging system. Tested algorithms include the Wiener, Tikhonov, and Richardson-Lucy algorithms. Their performances are discussed and compared in this paper. Experimental results have demonstrated that coherent de-convolution algorithms are capable of improving the resolution of images formed with this imaging system.

Paper 190: Phantom-Based Point Spread Function Estimation for Terahertz Imaging System

Author(s): Dan Popescu, Andrew Hellicar, Yue Li

We present a terahertz imaging system designed to operate in reflection mode and propose a method for estimating its point spread function. A phantom with known geometry is built, such as to generate a regular pattern with sharp edges under an ideal delta-like point spread function. The phantom is imaged with the terahertz system operating at 186 GHz. Several masking alterations applied to the beam pattern are also tested. The corresponding point spread functions are obtained by a deconvolution technique in the Fourier domain. We validate our results by using the estimated point spread functions to deblur the imaging results of a natural scene, and by direct comparison with a point source response.

Paper 191: Background Subtraction Techniques: Systematic Evaluation and Comparative Analysis

Author(s): Sonsoles Herrero, Jesús Bescós

Moving object detection is a critical task for many computer vision applications: the objective is the classification of the pixels in the video sequence into either foreground or background. A commonly used technique to achieve it in scenes captured by a static camera is Background Subtraction (BGS). Several BGS techniques have been proposed in the literature but a rigorous comparison that analyzes the different parameter configuration for each technique in different scenarios with precise ground-truth data is still lacking. In this sense, we have implemented and evaluated the most relevant BGS techniques, and performed a quantitative and qualitative comparison between them.
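For readers unfamiliar with the task, a minimal Python sketch of one simple BGS scheme and of a ground-truth comparison follows. It is not one of the techniques evaluated in the paper: the running-average model, the threshold, the learning rate `alpha`, and the four-pixel frame are all illustrative assumptions.

```python
def foreground_mask(background, frame, threshold=25):
    """Classify a pixel as foreground when it departs from the background model."""
    return [abs(f - b) > threshold for b, f in zip(background, frame)]

def update_background(background, frame, mask, alpha=0.05):
    """Running average, updated only at pixels classified as background."""
    return [b if fg else (1 - alpha) * b + alpha * f
            for b, f, fg in zip(background, frame, mask)]

def precision_recall(mask, truth):
    """Pixel-level precision and recall against a ground-truth mask."""
    tp = sum(m and t for m, t in zip(mask, truth))
    fp = sum(m and not t for m, t in zip(mask, truth))
    fn = sum(t and not m for m, t in zip(mask, truth))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall

# Hypothetical one-row grey-level frame and its ground truth.
background = [50.0, 50.0, 50.0, 50.0]
frame = [52, 49, 200, 70]
truth = [False, False, True, True]

mask = foreground_mask(background, frame)
precision, recall = precision_recall(mask, truth)
background = update_background(background, frame, mask)
```

In this toy frame the last pixel changes by less than the threshold and is missed, which lowers recall while precision stays perfect; parameter choices of this kind are what a systematic comparison has to explore.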

Paper 192: Advanced Vision Processing Systems: Spike-Based Sensing and Processing

Author(s): Jose-antonio Pérez-Carrasco, Carmen Serrano-Gotarredona, Begoña Acha-Piñero, Teresa Serrano-Gotarredona, Bernabe Linares-Barranco

In this paper we briefly summarize the fundamental properties of spike events processing applied to artificial vision systems. This sensing and processing technology is capable of very high speed throughput, because it does not rely on sensing and processing sequences of frames, and because it allows for complex hierarchically structured neurocortical-like layers for sophisticated processing. The paper describes briefly cortex-like spike event vision processing principles, and the AER (Address Event Representation) technique used in hardware spiking systems. In this paper we present a simulation AER tool that we have developed entirely in Visual C++ 6.0. We have validated it using real AER stimulus and comparing the outputs with real outputs obtained from devices based on AER. With this tool we can predict the eventual performance of AER systems, before the technology becomes mature enough to allow such large systems.

Paper 196: Self-Assessed Contrast-Maximizing Adaptive Region Growing

Author(s): Carlos S. Mendoza, Begoña Acha, Carmen Serrano, Tomás Gómez-Cía

In the context of an experimental virtual-reality surgical planning software platform, we propose a fully self-assessed adaptive region growing segmentation algorithm. Our method successfully delineates main tissues relevant to the simulation of reconstructive surgery procedures, such as skin, fat, muscle/organs, and bone. We rely on a standardized and self-assessed region-based approach to deal with a great variety of imaging conditions with minimal user intervention, as only a single-seed selection stage is required. The detection of the optimal parameters is managed internally using a measure of the varying contrast of the growing regions. Validation based on synthetic images, as well as truly-delineated real CT volumes, is provided for the reader's evaluation.

Paper 197: Pattern Analysis of Dermoscopic Images Based on FSCM Color Markov Random Fields

Author(s): Carlos S. Mendoza, Carmen Serrano, Begoña Acha

In this paper a method for pattern analysis in dermoscopic images of abnormally pigmented skin (melanocytic lesions) is presented. In order to diagnose a possible skin cancer, physicians assess the lesion according to different rules. The new trend in Dermatology is to classify the lesion by means of pattern irregularity. In order to analyze the pattern turbulence, lesions ought to be segmented into single pattern regions. Our classification method, when applied on overlapping lesion patches, provides a pattern chart that could ultimately allow for in-region single-texture turbulence analysis. Due to the color-textured appearance of these patterns, we present a novel method based on a Finite Symmetric Conditional Model (FSCM) Markov Random Field (MRF) color extension for the characterization and discrimination of pattern samples. Our classification success rate rises to 86%.

Paper 198: Lane Detection And Tracking Using A Layered Approach

Author(s): Amol Borkar, Monson Hayes, Mark Smith

A new night-time lane detection system that extends the idea of a Layered Approach [1] is presented in this document. The extension includes the incorporation of (1) Inverse Perspective Mapping (IPM) to generate a bird's-eye view of the road surface, (2) application of Random Sample Consensus (RANSAC) to remove outliers from the data, and (3) Kalman filtering to smooth the output of the lane tracker. Videos of driving scenarios on local city roads and highways were used to test the new system. Quantitative analysis shows higher accuracy in detecting lane markers in comparison to other approaches.
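The outlier-rejection step can be illustrated with a generic RANSAC line fit in Python. This is a sketch under stated assumptions rather than the system described above: it assumes candidate lane-marker points lie roughly on a non-vertical straight line in the bird's-eye view, and the tolerance, iteration count, seed, and point set are made up.

```python
import random

def fit_line(p, q):
    """Slope and intercept of the line through two points with different x."""
    (x1, y1), (x2, y2) = p, q
    m = (y2 - y1) / (x2 - x1)
    return m, y1 - m * x1

def ransac_line(points, tol=0.5, iterations=200, seed=0):
    """Return the largest consensus set found for a line y = m*x + b."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)      # minimal sample: two points
        if p[0] == q[0]:
            continue                      # skip vertical pairs in this simple model
        m, b = fit_line(p, q)
        inliers = [pt for pt in points if abs(pt[1] - (m * pt[0] + b)) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers

# Six points on y = 2x + 1 plus two gross outliers.
points = [(x, 2 * x + 1) for x in range(6)] + [(1, 10), (4, -5)]
inliers = ransac_line(points)
```

Only the consensus set would then be passed on to the final fit and to the Kalman smoother.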

Paper 199: A New Feasible Approach to Multi-dimensional Scale Saliency

Author(s): Pablo Suau, Francisco Escolano

In this paper, we present a multi-dimensional extension of an image feature extractor, the scale saliency algorithm by Kadir and Brady. In order to avoid the curse of dimensionality, our algorithm is based on a recent Shannon entropy estimator and on a new divergence metric in the spirit of Friedman and Rafsky's estimation of the Henze-Penrose divergence. The experiments show that, compared to our previous method based on entropic graphs, this approach remarkably decreases computation time while not significantly deteriorating the quality of the results.

Paper 202: A Novel Approach to Geometric Fitting of Implicit Quadrics

Author(s): Mohammad Rouhani, Angel Sappa

This paper presents a novel approach for estimating the geometric distance from a given point to the corresponding implicit quadric curve/surface. The proposed estimation is based on the height of a tetrahedron, which is used as a coarse but reliable estimation of the real distance. The estimated distance is then used for finding the best set of quadric parameters, by means of the Levenberg-Marquardt algorithm, which is a common framework in other geometric fitting approaches. Comparisons of the proposed approach with previous ones are provided to show both improvements in CPU time as well as in the accuracy of the obtained results.

Paper 203: 3D Face Alignment via Cascade 2D Shape Alignment and Constrained Structure from Motion

Author(s): Yunshu Hou, Ping Fan, Ilse Ravyse, Hichem Sahli

In this paper, we consider fitting a 3D wireframe face model to continuous video sequences for the tasks of simultaneous tracking of rigid head motion and non-rigid facial animation. We propose a two-level integrated model for accurate 3D face alignment. At the low level, the 2D shape is accurately extracted via a regularized shape model that relies on cascaded parameter/constraint prediction and optimization. At the high level, the accurately aligned points from the low level are used to constrain the projected 3D wireframe alignment. Using a steepest descent approach, the algorithm is able to extract simultaneously the parameters related to the facial expression and to the 3D posture. Extensive fitting and tracking experiments demonstrate the feasibility, accuracy and effectiveness of the developed methods. A performance evaluation also shows that the proposed methods can outperform fitting based on an active appearance model search and can tackle many disadvantages associated with such approaches.

Paper 204: Dynamic Texture Extraction And Video Denoising

Author(s): Mathieu Lugiez, Michel Ménard, Abdallah El-Hamidi

According to recent work introduced by Yves Meyer, decomposition models based on Total Variation (TV) appear to be a very good way to extract texture from image sequences. Indeed, videos exhibit characteristic variations along the temporal dimension which can be captured in the decomposition framework. However, very few works in the literature deal with spatio-temporal decompositions. Thus, we devote this paper to a spatio-temporal extension of the spatial color decomposition model. We provide a relevant method to accurately capture Dynamic Textures (DT) present in videos. Moreover, we obtain the spatio-temporal regularized part (the geometrical component), and we distinctly separate the highly oscillatory variations (the noise). Furthermore, we present some elements of comparison between several models for denoising purposes.

Paper 205: Content-Based Annotation of User Generated Video on a Mobile Platform

Author(s): Hristina Pavlovska, Tomislav Kartalov, Zoran Ivanovski

This paper focuses on the problem of annotating outdoor, user generated, low bit-rate videos. An effective algorithm for estimating the percentage of artificial (man-made) and natural content in a video frame is presented. The algorithm is based on edge information from the luminance frame only, and it is intended to be executed on a mobile platform in an acceptable time frame. The results of experiments performed on a large set of user generated videos agree with human perception of artificial and natural content.

Paper 208: Temporal Templates for Detecting the Trajectories of Moving Vehicles

Author(s): Hugo Jiménez, Joaquín Salas

In this study, we deal with the problem of detecting the trajectories of moving vehicles. We introduce a method, based on spatio-temporal connectivity analysis, to extract the vehicle trajectories from temporal templates spanning a short period of time. Temporal templates are formed from successive image differences. The trajectories are computed using the centers of the blobs in the temporal template. A Kalman filter for a constant value, with emphasis on the measurement uncertainty, is used to smooth the result. The algorithm is tested extensively using a sequence taken from a tower overlooking a vehicular intersection. Our approach allows us to detect the vehicle trajectories without the need to construct a background model or to use a sophisticated tracking strategy for the moving objects. Our experiments show that the scheme we propose is reliable and fast.
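
The smoothing step can be illustrated with a minimal sketch of a scalar Kalman filter for a constant-value state, applied independently to each coordinate of the blob centers. The function name, the variance values and the per-coordinate treatment are assumptions made for illustration; the abstract does not give the authors' parameters or implementation.

```python
def smooth_constant(measurements, meas_var=4.0, process_var=1e-3):
    """Scalar Kalman filter with a constant-value state model.

    meas_var and process_var are illustrative values, not taken from the paper.
    A large meas_var relative to process_var means measurements are trusted
    less, so the output is smoothed more strongly.
    """
    estimate = measurements[0]   # initialise from the first measurement
    error_var = meas_var         # initial uncertainty of the estimate
    smoothed = [estimate]
    for z in measurements[1:]:
        # Predict: the state is assumed constant, only its uncertainty grows.
        error_var += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = error_var / (error_var + meas_var)
        estimate += gain * (z - estimate)
        error_var *= (1.0 - gain)
        smoothed.append(estimate)
    return smoothed


# Example: smooth the x coordinates of a sequence of blob centers.
xs = [10.0, 12.0, 9.0, 11.0, 10.5]
print(smooth_constant(xs))
```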

Paper 209: Multiple Human Tracking in High-Density Crowds

Author(s): Irshad Ali, Matthew Dailey

In this paper, we present a fully automatic approach to multiple human detection and tracking in high density crowds in the presence of extreme occlusion. Human detection and tracking in high density crowds is an unsolved problem. Standard preprocessing techniques such as background modeling fail when most of the scene is in motion. We integrate human detection and tracking into a single framework, and introduce a confirmation-by-classification method to estimate confidence in a tracked trajectory, track humans through occlusions, and eliminate false positive detections. We use a Viola and Jones AdaBoost cascade classifier for detection, a particle filter for tracking, and color histograms for appearance modeling. An experimental evaluation shows that our approach is capable of tracking humans in high density crowds despite occlusions.

Paper 210: Theorems Relating Polynomial Approximation, Orthogonality and Balancing Conditions for the Design of Nonseparable Bidimensional Multiwavelets

Author(s): Ana Ruedin

We relate different properties of nonseparable quincunx multiwavelet systems, such as polynomial approximation order, orthonormality and balancing, to conditions on the matrix filters. We give mathematical proofs for these relationships. The results obtained are necessary conditions on the filterbank. This simplifies the design of such systems.

Keywords: orthogonal filterbank, multiwavelets, nonseparable, polynomial reproduction.

Paper 216: Two-Level Bimodal Association for Audio-Visual Speech Recognition

Author(s): Jong-Seok Lee, Touradj Ebrahimi

This paper proposes a new method for bimodal information fusion in audio-visual speech recognition, where cross-modal association is considered at two levels. First, the acoustic and the visual data streams are combined at the feature level using canonical correlation analysis, which addresses the problem of audio-visual synchronization and exploits the cross-modal correlation. Second, information streams are integrated at the decision level for adaptive fusion of the streams according to the noise condition of the given speech datum. Experimental results demonstrate that the proposed method is effective in producing noise-robust recognition performance without a priori knowledge about the noise conditions of the speech data.
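
As background for the feature-level step, the following is a minimal sketch of textbook canonical correlation analysis between two feature matrices, computed by whitening each view and taking an SVD of the cross-covariance. The function name `cca`, the regularisation value and the synthetic data are assumptions for illustration; the abstract does not describe how the authors compute or apply the analysis.

```python
import numpy as np


def cca(X, Y, reg=1e-6):
    """Canonical correlation analysis of X (n, p) and Y (n, q).

    Returns projection matrices A (p, p), B (q, q) and the canonical
    correlations s (length min(p, q)). reg is a small ridge term added
    for numerical stability (an illustrative choice).
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)
    # Whiten each view with its Cholesky factor: C = L @ L.T
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    M = np.linalg.solve(Lx, Cxy) @ np.linalg.inv(Ly).T
    # Singular values of the whitened cross-covariance are the correlations.
    U, s, Vt = np.linalg.svd(M)
    A = np.linalg.solve(Lx.T, U)      # directions in the original X space
    B = np.linalg.solve(Ly.T, Vt.T)   # directions in the original Y space
    return A, B, s


# Synthetic example: Y is an exact linear function of two columns of X,
# so both canonical correlations should be close to 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Y = X[:, :2] @ np.array([[1.0, 0.5], [0.0, 2.0]])
A, B, s = cca(X, Y)
print(s)
```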

Paper 218: Robust Detection and Tracking of Moving Objects in Traffic Video Surveillance

Author(s): Borislav Antic, Jorge Oswaldo Nino Castaneda, Dubravko Culibrk, Aleksandra Pizurica, Vladimir Crnojevic, Wilfried Philips

Building an efficient and robust system capable of working in harsh real world conditions represents the ultimate goal of traffic video surveillance. Despite evident progress in the area of statistical background modeling over the last decade or so, moving object detection is still one of the toughest problems in video surveillance, and new approaches are still emerging. Based on our published method for motion detection in the wavelet domain, we propose a novel, wavelet-based method for robust feature extraction and tracking. The proposed approach is more efficient and relies on a non-decimated wavelet transformation to achieve both motion segmentation and selection of features for tracking. The use of the wavelet transformation for selecting robust features for tracking stems from the persistence of actual edges and corners across the scales of the wavelet transformation. Moreover, the output of the motion detector is used to limit the search space of the feature tracker to those areas where moving objects are found. The results demonstrate stable and efficient performance of the proposed approach in the domain of traffic video surveillance.

Paper 219: Self Organizing and Fuzzy Modelling for Parked Vehicles Detection

Author(s): Lucia Maddalena, Alfredo Petrosino

Our aim is to distinguish moving and stopped objects in digital image sequences taken from stationary cameras using a model-based approach. A self-organizing model is adopted both for the scene background and for the scene foreground; it can handle scenes containing moving backgrounds or gradual illumination variations, and helps in distinguishing between moving and stopped foreground regions. The model is enriched by spatial coherence to enhance robustness against false detections, and by fuzzy modelling to deal with decision problems that typically arise when crisp settings are involved. We show through experimental results and comparisons that good accuracy values can be reached for color video sequences that represent typical critical situations for vehicles stopped in no parking areas.

Paper 220: Concealed Object Perception and Recognition Using a Photometric Stereo Strategy

Author(s): Jiuai Sun, Melvyn Smith, Abdul Farooq, Lyndon Smith

Following a review of current hidden object detection techniques in a range of security applications, a strategy based on an innovative, low-cost photometric stereo technique is proposed to reveal concealed objects. By taking advantage of information rapidly acquired under different illumination conditions, various enhanced real time images can be produced, free from the confusion of textured camouflage. The extracted surface normals can be used to calculate curvature and flatness attributes, and provide clues for subsequent hidden object detection and recognition tasks. Experiments on both simulated and real data have verified that the strategy is useful for stealthy object detection and may provide another modality of data for current monitoring systems. The results demonstrate good potential for the detection of concealed objects in security and military applications through the deployment of image enhancement and augmented reality devices.
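
To make the role of the surface normals concrete, here is a minimal sketch of classical Lambertian photometric stereo, which recovers per-pixel normals and albedo from at least three images taken under known light directions. This is the textbook least-squares formulation, not necessarily the authors' variant; the function name, the light directions and the synthetic data are assumptions for illustration.

```python
import numpy as np


def estimate_normals(images, light_dirs):
    """Classical Lambertian photometric stereo.

    images:     array (k, h, w), one grey-level image per light source
    light_dirs: array (k, 3), unit light direction for each image, k >= 3
    Returns unit normals (h, w, 3) and albedo (h, w).
    """
    k, h, w = images.shape
    intensities = images.reshape(k, -1)            # (k, h*w)
    # Lambertian model: I = L @ (albedo * n); solve for g = albedo * n
    # at every pixel in the least-squares sense.
    g, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)  # (3, h*w)
    albedo = np.linalg.norm(g, axis=0)
    normals = g / np.maximum(albedo, 1e-12)        # avoid division by zero
    return normals.T.reshape(h, w, 3), albedo.reshape(h, w)


# Synthetic check: a flat 2x2 patch facing the camera with albedo 0.5.
lights = np.array([[0.0, 0.0, 1.0],
                   [0.6, 0.0, 0.8],
                   [0.0, 0.6, 0.8]])
true_normal = np.array([0.0, 0.0, 1.0])
shading = 0.5 * lights @ true_normal               # one intensity per light
images = np.tile(shading[:, None, None], (1, 2, 2))
normals, albedo = estimate_normals(images, lights)
print(normals[0, 0], albedo[0, 0])
```

Curvature and flatness attributes of the kind mentioned in the abstract would then be derived from spatial changes in the recovered normal field.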

Paper 221: Compression of Remote Sensing Images for the PROBA-V Satellite Mission

Author(s): Stefan Livens, Richard Kleihorst

We investigate compression of remote sensing images with a special geometry of non-square pixels. Two fundamentally different data reduction strategies are compared: a combination of pixel binning with near lossless compression, and a method operating at higher compression rates. To measure the real impact of the compression, the image processing flow up to the final products is included in the experiments. The effects on sensor non-uniformities and their corrections are explicitly modeled and measured. We conclude that it is preferable to apply higher compression rates than to rely on pixel binning, even if the derived images have lower resolutions.
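
For readers unfamiliar with pixel binning, the following is a minimal sketch that averages blocks of neighbouring pixels, with separate factors per axis since the pixels are described as non-square. The function name and the binning factors are assumptions for illustration; the abstract does not state which factors or binning scheme the authors evaluated.

```python
import numpy as np


def bin_pixels(image, factor_rows, factor_cols):
    """Average non-overlapping blocks of factor_rows x factor_cols pixels.

    Rows and columns that do not fill a complete block are trimmed.
    """
    h, w = image.shape
    h_trim = h - h % factor_rows
    w_trim = w - w % factor_cols
    blocks = image[:h_trim, :w_trim].reshape(
        h_trim // factor_rows, factor_rows,
        w_trim // factor_cols, factor_cols)
    return blocks.mean(axis=(1, 3))


# Example: bin a 4x4 image by 2 along the columns only.
img = np.arange(16.0).reshape(4, 4)
binned = bin_pixels(img, 1, 2)
print(binned.shape)
```

Binning reduces the data volume by the product of the two factors before any compression is applied, at the cost of resolution that cannot be recovered later.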