Conference Proceedings


Richard Reilly



Keywords: person identification, classifier combination, lip modality, speaker identification, multi-modal fusion, audio-visual, hidden Markov models

Robust multi-modal person identification with tolerance of facial expression (2004)

Abstract: The research presented in this work describes audio-visual speaker identification experiments carried out on a large data set of 251 subjects. Both the audio and visual modeling are carried out using hidden Markov models. The visual modality uses the speaker's lip information. The audio and visual modalities are both degraded to emulate a train/test mismatch. The fusion method employed adapts automatically, using classifier score reliability estimates from both modalities to give improved audio-visual accuracy at all tested levels of audio and visual degradation, compared with the individual audio or visual modality accuracies. A maximum visual identification accuracy of 86% was achieved. This result is comparable to the performance of systems using the entire face, and supports the hypothesis that the system described would be tolerant of varying facial expression, since only the information around the speaker's lips is employed.
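The adaptive fusion described in the abstract weights each modality's classifier scores by an estimate of that modality's reliability, so a degraded modality contributes less. A minimal sketch of this idea follows; the min-max normalisation, the top-two-score-gap reliability measure, and all score values are illustrative assumptions, not the authors' exact method:

```python
def reliability(scores):
    # Assumed dispersion-based reliability estimate (not necessarily the
    # paper's exact measure): the gap between the best and second-best
    # normalised score. A confident classifier separates its top
    # candidate clearly from the runner-up.
    top = sorted(scores, reverse=True)
    return top[0] - top[1]

def normalise(scores):
    # Min-max normalise scores to [0, 1] so the two modalities are
    # comparable before weighting.
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def fuse(audio_scores, visual_scores):
    # Weight each modality by its relative reliability, so the fusion
    # adapts automatically when one modality is degraded.
    a = normalise(audio_scores)
    v = normalise(visual_scores)
    ra, rv = reliability(a), reliability(v)
    wa = ra / (ra + rv)
    return [wa * ai + (1 - wa) * vi for ai, vi in zip(a, v)]

# Hypothetical per-speaker HMM log-likelihood scores (illustrative
# values only): the audio classifier separates its top candidate
# clearly, while the degraded visual scores bunch together.
audio = [-120.0, -96.0, -118.0]
visual = [-40.0, -40.5, -41.0]
fused = fuse(audio, visual)
print(fused.index(max(fused)))  # prints 1: the more reliable audio dominates
```

Here the audio modality's larger top-two gap gives it the larger fusion weight, which is the behaviour the abstract attributes to the automatic adaptation.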
Collections: Ireland -> Trinity College Dublin -> Administrative Staff Authors (Scholarly Publications)

Full list of authors on original publication

Richard Reilly

Experts in our system

Richard Reilly
Trinity College Dublin
Total Publications: 185