Denis Lalanne, Florian Evequoz

Goals of the seminar

  1. Present the results of a bibliographic search both orally and in a written report
  2. Practice reviewing scientific literature

The topic of this seminar is gesture recognition, with a particular focus on video-based hand and arm gesture recognition. Students will choose one of the following six topics in gesture recognition:

  • Detection
  • Tracking
  • Segmentation
  • Recognition
  • Feature extraction (used in the detection and recognition steps)
  • 2D vs. 3D

Their task is then to conduct a bibliographic search on their chosen subject and to present the results orally and in a 4-page written report (goal 1), authored in LaTeX following the ACM Strict format.

In addition, three sessions of the seminar will be devoted to a "reviewing-club" (goal 2). Participants will read short articles provided by the organizers and write a review of each. These reviews will be discussed together during the sessions. The goal of the reviewing-club is to develop students' critical sense of how scientific literature should be written.


If you would like to participate in this seminar, please contact the organizers. A maximum of 6 students will be accepted.



To guide their research, students will use the proposed bibliographic references, as well as any other references they consider relevant, and will try to answer the following questions. The expected outcomes of this seminar are an article synthesizing their findings on the topic and a presentation to the seminar participants. The article must be 4 pages long and authored in LaTeX following the ACM Strict format.

  • Detection of hands and arms

    • Goal?
    • Associated challenges?
    • Which techniques can be used?
    • Main problems that can occur?
      • Lighting
      • Background dynamics
    • In gesture recognition without markers, how can skin regions be extracted from an image?
  • Tracking

    • Goal?
    • Associated challenges?
    • For what purpose is it used?
    • Which techniques can be used?
    • Main problems that can occur?
    • Is this step really necessary in gesture recognition?
    • What is the minimum frame rate (FPS) for real-time gesture recognition?
  • Segmentation of gestures

    • Goal?
    • Which techniques can be used?
    • Main problems that can occur?
    • Do we have to use multiple modalities?
    • Can it be done automatically?
  • Recognition

    • Goal?
    • Associated challenges?
    • Which techniques can be used?
    • Main problems that can occur?
    • Why use Hidden Markov Models (HMMs)?
    • What is the influence of the gesture vocabulary on the choice of algorithm?
  • Feature extraction (used in the detection and recognition steps)

    • Goal?
    • Associated challenges?
    • List of potential features and the step of the processing chain each one is associated with
    • Which techniques can be used to extract them?
    • Main problems that can occur?
    • How to choose the features?
  • 2D vs. 3D

    • Differences between gesture recognition in 3D and 2D?
    • Pros & cons of each approach


V. Athitsos and S. Sclaroff. "Estimating 3D hand pose from a cluttered image." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, June 2003.

Y. Azoz, L. Devi, R. Sharma, "Reliable Tracking of Human Arm Dynamics by Multiple Cue Integration and Constraint Fusion," 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'98), pp. 905, 1998.

T. Baudel and M. Baudouin-Lafon, “Charade: Remote Control of Objects Using Free-Hand Gestures,” Comm. ACM, vol. 36, no. 7, pp. 28-35, 1993.

Bolt, R. A. 1980. “Put-that-there”: Voice and gesture at the graphics interface. In Proceedings of the 7th Annual Conference on Computer Graphics and Interactive Techniques (Seattle, Washington, United States, July 14 - 18, 1980). SIGGRAPH '80. ACM, New York, NY, 262-270.

Cohen, P. R., Johnston, M., McGee, D., Oviatt, S., Pittman, J., Smith, I., Chen, L., and Clow, J. 1997. "QuickSet: multimodal interaction for distributed applications." In Proceedings of the Fifth ACM International Conference on Multimedia (Seattle, Washington, United States, November 09 - 13, 1997). MULTIMEDIA '97. ACM, New York, NY, 31-40.

J.L. Crowley, F. Berard, and J. Coutaz, “Finger Tracking As an Input Device for Augmented Reality,” Proc. Int’l Workshop on Automatic Face and Gesture Recognition, Zurich, Switzerland, pp. 195-200, June 1995.

J. Davis and M. Shah, “Recognizing Hand Gestures,” Proc. European Conf. Computer Vision, Stockholm, Sweden, pp. 331-340, 1994.

Deller, M.; Ebert, A.; Bender, M.; Hagen, H., "Flexible Gesture Recognition for Immersive Virtual Environments," Tenth International Conference on Information Visualization (IV 2006), pp. 563-568, 5-7 July 2006.

W.T. Freeman and C.D. Weissman, “Television Control by Hand Gestures,” Proc. Int’l Workshop on Automatic Face and Gesture Recognition, Zurich, Switzerland, pp. 179-183, June 1995.

Fröhlich, M. & I. Wachsmuth (1998): "Gesture recognition of the upper limbs - from signal to symbol." In I. Wachsmuth and M. Fröhlich (eds.): Gesture and Sign Language in Human-Computer Interaction (pp. 173-184). Berlin: Springer-Verlag (LNAI 1371).

M. Fukumoto, Y. Suenaga, and K. Mase, “‘Finger-Pointer’: Pointing Interface by Image Processing,” Computers and Graphics, vol. 18, no. 5, pp. 633-642, 1994.

Holzapfel, H., Nickel, K., and Stiefelhagen, R. 2004. "Implementation and evaluation of a constraint-based multimodal fusion system for speech and 3D pointing gestures." In Proceedings of the 6th International Conference on Multimodal Interfaces (State College, PA, USA, October 13 - 15, 2004). ICMI '04. ACM, New York, NY, 175-18.

K. Ishibuchi, H. Takemura, and F. Kishino, “Real Time Hand Gesture Recognition Using 3D Prediction Model,” Proc. 1993 Int’l Conf. Systems, Man, and Cybernetics, Le Touquet, France, pp. 324- 328, Oct. 17-20, 1993.

Kehl, R.; Van Gool, L., "Real-time pointing gesture recognition for an immersive environment," Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition, pp. 577-582, 17-19 May 2004.

R. Kjeldsen, J. Kender, "Toward the use of gesture in traditional user interfaces," Second IEEE International Conference on Automatic Face and Gesture Recognition (FG '96), pp. 151, 1996.

Koons, D. B., Sparrell, C. J., and Thorisson, K. R. 1993. "Integrating simultaneous input from speech, gaze, and hand gestures." In Intelligent Multimedia Interfaces, M. T. Maybury, Ed. American Association for Artificial Intelligence, Menlo Park, CA, 257-276. (ask Bruno Dumas in room B440 for the book)

Mistry, P., Maes, P., and Chang, L. 2009. "WUW - wear Ur world: a wearable gestural interface." In Proceedings of the 27th International Conference Extended Abstracts on Human Factors in Computing Systems (Boston, MA, USA, April 04 - 09, 2009). CHI EA '09. ACM, New York, NY, 4111-4116.

Thomas B. Moeslund and Lau Nørgaard, "A Brief Overview of Hand Gestures used in Wearable Human Computer Interfaces", Technical report: CVMT 03-02, ISSN: 1601-3646, Laboratory of Computer Vision and Media Technology, Aalborg University, Denmark. 2003.

Vladimir Pavlovic, Rajeev Sharma, Thomas S. Huang: "Visual Interpretation of Hand Gestures for Human-Computer Interaction: A Review." IEEE Trans. Pattern Anal. Mach. Intell. 19(7): 677-695 (1997).

Thomas Schlömer, Benjamin Poppinga, Niels Henze, Susanne Boll, "Gesture Recognition with a Wii Controller", Proceedings of the 2nd International Conference on Tangible and Embedded Interaction, 2008.

T. Starner and A. Pentland, “Visual Recognition of American Sign Language using Hidden Markov Models,” Proceedings of the International Workshop on Automatic Face and Gesture Recognition, Zurich, Switzerland, 1995.

Sushmita Mitra, Tinku Acharya, “Gesture Recognition: A Survey”, IEEE Transactions on Systems, Man, and Cybernetics - Part C: Applications and Reviews, vol. 37, no. 3, May 2007.

Vladimir Vezhnevets, Vassili Sazonov, Alla Andreeva, "A Survey on Pixel-Based Skin Color Detection Techniques", in Proc. Graphicon-2003, pages 85-92.

Chan Wah Ng and Surendra Ranganath, "Real-time gesture recognition system and application", Image and Vision Computing, Volume 20, Issues 13-14, 1 December 2002, pages 993-1007.

Wei Du, Hua Li, "Vision based gesture recognition system with single camera", 5th International Conference on Signal Processing Proceedings, 2000.

A.D. Wilson and A.F. Bobick, “Recovering the Temporal Structure of Natural Gestures,” Proc. Int’l Conf. Automatic Face and Gesture Recognition, Killington, Vt., pp. 66-71, Oct. 1996.

Wren, C.R.; Azarbayejani, A.; Darrell, T.; Pentland, A.P., "Pfinder: real-time tracking of the human body," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 780-785, Jul 1997.

Ying Wu, Thomas S. Huang: "Vision-Based Gesture Recognition: A Review." In Gesture-Based Communication in Human-Computer Interaction, Springer, 103-115 (1999).

Yang Liu, Yunde Jia, "A Robust Hand Tracking and Gesture Recognition Method for Wearable Visual Interfaces and Its Applications", Proceedings of the Third International Conference on Image and Graphics (ICIG’04), 2004.

Yepeng Guan; Mingen Zheng, "Real-time 3D pointing gesture recognition for natural HCI," 7th World Congress on Intelligent Control and Automation (WCICA 2008), pp. 2433-2436, 25-27 June 2008.

Date: 2010, Spring