Smile and laughter recognition using speech processing and face recognition from conversation video

Akinori Ito, Xinyue Wang, Motoyuki Suzuki, Shozo Makino

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

42 Citations (Scopus)

Abstract

This paper describes a method to detect smiles and laughter sounds in video of natural dialogue. A smile is the most common facial expression observed in a dialogue. Detecting a user's smiles and laughter sounds can be useful for estimating the mental state of the user of a spoken-dialogue-based user interface. In addition, detecting laughter sounds can prevent the speech recognizer from wrongly recognizing laughter as meaningful words. This paper proposes a method to detect smile expressions and laughter sounds robustly by combining an image-based facial expression recognition method and an audio-based laughter sound recognition method. The image-based method uses a feature vector based on feature point detection from face images. The method could detect smiling faces with recall and precision rates of more than 80%. Combining a GMM-based laughter sound recognizer with the image-based method improved the accuracy of laughter sound detection compared with methods that use image or sound only. As a result, recall and precision rates of more than 70% were obtained for laughter sound detection on the natural conversation videos.
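The abstract does not state how the audio and image scores are combined, so the following is only a minimal sketch of one common option (a weighted late fusion of per-frame scores), not the authors' method. The function names, the fusion weight, the threshold, and the use of a one-dimensional GMM are all assumptions made for illustration; a real laughter recognizer would score multi-dimensional acoustic feature vectors. The last function shows how recall and precision, the two figures reported in the abstract, are computed from frame-level decisions.

```python
import math

def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def gmm_loglik(x, components):
    """Log-likelihood of scalar x under a 1-D GMM.

    components is a list of (weight, mean, variance) tuples; the 1-D
    form is a simplification assumed here for brevity.
    """
    logs = [math.log(w) + gaussian_logpdf(x, m, v) for w, m, v in components]
    top = max(logs)
    # log-sum-exp keeps the sum numerically stable
    return top + math.log(sum(math.exp(l - top) for l in logs))

def fuse(audio_llr, smile_score, weight=0.5, threshold=0.0):
    """Hypothetical late fusion: weighted sum of an audio log-likelihood
    ratio (laughter GMM minus non-laughter GMM) and an image-based smile
    score, compared against a threshold."""
    return weight * audio_llr + (1 - weight) * smile_score > threshold

def recall_precision(predicted, actual):
    """Recall and precision of boolean frame-level decisions."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision
```

In this sketch, `audio_llr` would be obtained as `gmm_loglik(x, laughter_gmm) - gmm_loglik(x, other_gmm)` for two hypothetical trained mixtures, and the weight and threshold would be tuned on held-out data.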

Original language: English
Title of host publication: Proceedings - 2005 International Conference on Cyberworlds, CW 2005
Pages: 437-444
Number of pages: 8
Publication status: Published - 2005 Dec 1
Event: 2005 International Conference on Cyberworlds, CW 2005 - Singapore, Singapore
Duration: 2005 Nov 23 to 2005 Nov 25

Publication series

Name: Proceedings - 2005 International Conference on Cyberworlds, CW 2005
Volume: 2005

Other

Other: 2005 International Conference on Cyberworlds, CW 2005
Country: Singapore
City: Singapore
Period: 05/11/23 to 05/11/25

ASJC Scopus subject areas

  • Engineering (all)
