Conception et mise en oeuvre d'algorithmes de vision temps-réel pour la vidéo surveillance intelligente

Ghorayeb, Hicham (2007) Conception et mise en oeuvre d'algorithmes de vision temps-réel pour la vidéo surveillance intelligente. PhD thesis, Informatique temps réel, robotique et automatique, CAOR - Centre de robotique, ENSMP, p. 197.

Full text available as:

- GhorayebThesis.pdf ( 11380 Kb )
Licence: Copyright

Abstract

In this dissertation, we present research carried out at the Center of Robotics (CAOR) of the Ecole des Mines de Paris on the problem of intelligent video analysis.

The primary objective of our research is to prototype a generic framework for intelligent video analysis. We optimized this framework and configured it to meet specific application requirements. As a case study, we consider a people-tracking application drawn from the PUVAME project, which aims to improve the safety of people in urban areas near bus stations.

We then improved the generic framework, mainly in background subtraction and visual object detection. We also developed LibAdaBoost, a machine learning library specialized in boosting for visual object detection.

To the best of our knowledge, LibAdaBoost is the first library of its kind. We make it available to the machine learning community under the LGPL license.

Finally, we adapted the boosting-based visual object detection algorithm to run on graphics hardware. To the best of our knowledge, we were the first to implement visual object detection with the sliding-window technique on graphics hardware. The results were promising: the prototype performed three to nine times better than the CPU.

The framework was successfully implemented and integrated into the RTMaps environment.

It was evaluated at the final session of the PUVAME project, where it demonstrated its reliability across test scenarios designed specifically for that project.

Item Type: PhD Thesis (PhD)
PhD Supervisor: Laurgeau, Claude
Date: 12 September 2007
Board of examiners: Meyrueis, Patrick and Siarry, Patrick and Akil, Mohamed and Steux, Bruno and Laurgeau, Claude and Meyer, Fernand and Schwerdt, Karl
Discipline: Informatique temps réel, robotique et automatique
Collection (Fonds): Mines ParisTech (ENSMP)
Institution: ENSMP
Department: CAOR - Centre de robotique
Subjects: 1. Mathematics and Applications
Uncontrolled Keywords: Video surveillance, Boosting, Pattern recognition, Intelligent transportation system, Machine learning, Moving object detection, Monte Carlo method
ID Code: 3064
Deposited By: Claudine Abauzit
Deposited On: 05 November 2007

Table of contents

I Introduction and state of the art
1 French Introduction
1.1 Algorithmes
1.2 Architecture
1.3 Application
2 Introduction
2.1 Contributions
2.2 Outline
3 Intelligent video surveillance systems (IVSS)
3.1 What is IVSS?
3.1.1 Introduction
3.1.2 Historical background
3.2 Applications of intelligent surveillance
3.2.1 Real time alarms
3.2.2 User defined alarms
3.2.3 Automatic unusual activity alarms
3.2.4 Automatic Forensic Video Retrieval (AFVR)
3.2.5 Situation awareness
3.3 Scenarios and examples
3.3.1 Public and commercial security
3.3.2 Smart video data mining
3.3.3 Law enforcement
3.3.4 Military security
3.4 Challenges
3.4.1 Technical aspect
3.4.2 Performance evaluation
3.5 Choice of implementation methods
3.6 Conclusion

II Algorithms
4 Generic framework for intelligent visual surveillance
4.1 Object detection
4.2 Object classification
4.3 Object tracking
4.4 Action recognition
4.5 Semantic description
4.6 Personal identification
4.7 Fusion of data from multiple cameras
5 Moving object detection
5.1 Challenge of detection
5.2 Object detection system diagram
5.2.1 Foreground detection
5.2.2 Pixel level post-processing (Noise removal)
5.2.3 Detecting connected components
5.2.4 Region level post-processing
5.2.5 Extracting object features
5.3 Adaptive background differencing
5.3.1 Basic Background Subtraction (BBS)
5.3.2 W4 method
5.3.3 Single Gaussian Model (SGM)
5.3.4 Mixture Gaussian Model (MGM)
5.3.5 Lehigh Omni-directional Tracking System (LOTS)
5.4 Shadow and light change detection
5.4.1 Methods and implementation
5.5 High level feedback to improve detection methods
5.5.1 The modular approach
5.6 Performance evaluation
5.6.1 Ground truth generation
5.6.2 Datasets
5.6.3 Evaluation metrics
5.6.4 Experimental results
5.6.5 Comments on the results
6 Machine learning for visual object-detection
6.1 Introduction
6.2 The theory of boosting
6.2.1 Conventions and definitions
6.2.2 Boosting algorithms
6.2.3 AdaBoost
6.2.4 Weak classifier
6.2.5 Weak learner
6.3 Visual domain
6.3.1 Static detector
6.3.2 Dynamic detector
6.3.3 Weak classifiers
6.3.4 Genetic weak learner interface
6.3.5 Cascade of classifiers
6.3.6 Visual finder
6.4 LibAdaBoost: Library for Adaptive Boosting
6.4.1 Introduction
6.4.2 LibAdaBoost functional overview
6.4.3 LibAdaBoost architectural overview
6.4.4 LibAdaBoost content overview
6.4.5 Comparison to previous work
6.5 Use cases
6.5.1 Car detection
6.5.2 Face detection
6.5.3 People detection
6.6 Conclusion
7 Object tracking
7.1 Initialization
7.2 Sequential Monte Carlo tracking
7.3 State dynamics
7.4 Color distribution Model
7.5 Results
7.6 Incorporating Adaboost in the Tracker
7.6.1 Experimental results

III Architecture
8 General purpose computation on the GPU
8.1 Introduction
8.2 Why GPGPU?
8.2.1 Computational power
8.2.2 Data bandwidth
8.2.3 Cost/Performance ratio
8.3 GPGPU's first generation
8.3.1 Overview
8.3.2 Graphics pipeline
8.3.3 Programming language
8.3.4 Streaming model of computation
8.3.5 Programmable graphics processor abstractions
8.4 GPGPU's second generation
8.4.1 Programming model
8.4.2 Application programming interface (API)
8.5 Conclusion
9 Mapping algorithms to GPU
9.1 Introduction
9.2 Mapping visual object detection to GPU
9.3 Hardware constraints
9.4 Code generator
9.5 Performance analysis
9.5.1 Cascade Stages Face Detector (CSFD)
9.5.2 Single Layer Face Detector (SLFD)
9.6 Conclusion

IV Application
10 Application: PUVAME
10.1 Introduction
10.2 PUVAME overview
10.3 Accident analysis and scenarios
10.4 ParkNav platform
10.4.1 The ParkView platform
10.4.2 The CyCab vehicle
10.5 Architecture of the system
10.5.1 Interpretation of sensor data relative to the intersection
10.5.2 Interpretation of sensor data relative to the vehicle
10.5.3 Collision Risk Estimation
10.5.4 Warning interface
10.6 Experimental results

V Conclusion and future work
11 Conclusion
11.1 Overview
11.2 Future work
12 French conclusion

VI Appendices
A Hello World GPGPU
B Hello World Brook
C Hello World CUDA
