Latent-Dynamic Discriminative Models for Continuous Gesture Recognition
Many problems in vision involve the prediction of a class label for each frame in an unsegmented sequence. In this paper we develop a discriminative framework for simultaneous sequence segmentation and labeling which can capture both intrinsic and extrinsic class dynamics. Our approach incorporates hidden state variables which model the sub-structure of a class sequence and learn the dynamics between class labels. Each class label has a disjoint set of associated hidden states, which enables efficient training and inference in our model. We evaluated our method on the task of recognizing human gestures from unsegmented video streams and performed experiments on three different datasets of head and eye gestures. Our results demonstrate that our model for visual gesture recognition outperform models based on Support Vector Machines, Hidden Markov Models, and Conditional Random Fields.