Bahadır Karasulu
Keywords: Image processing, Object detection, Object tracking, Performance metrics, Evaluation


Moving object detection and tracking (D&T) are important initial steps in object recognition, context analysis, and indexing for visual surveillance systems. Researchers face two difficult questions: which D&T algorithm best suits a given situation or environment, and how accurately object D&T is performed, whether in real time or not. Many object D&T algorithms exist, along with publications that compare and evaluate their performance using performance metrics. This paper provides a systematic review of these algorithms and performance measures and assesses their effectiveness via metrics.

