Augmented iconicity — Visualizing and analyzing gestures with motion-capture technology
In multimodal interaction research, motion-capture technology (MoCap) has proven to be a powerful tool for investigating fleeting gestures and head movements in minute detail. Precise 3-D numerical data corresponding to finger/hand configurations and movements are tracked on a millimeter and millisecond scale. Visualizing otherwise invisible motion traces allows for new insights into the dynamic gestalt properties of communicative movements. The kinetic data streams further lend themselves to statistical analyses and pattern recognition, which opens up new avenues for gesture and sign language research.
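As a hypothetical illustration of the kind of kinetic data described above (the function and sampling rate are assumptions, not part of the studies cited here), a MoCap system delivers a marker's 3-D position in millimeters at each frame, from which kinematic quantities such as per-frame speed and total path length can be derived:

```python
import numpy as np

def path_kinematics(positions_mm, fps):
    """positions_mm: (T, 3) array of 3-D marker positions in millimeters.
    Returns per-frame speeds (mm/s) and the total path length (mm)."""
    positions_mm = np.asarray(positions_mm, dtype=float)
    # Euclidean distance covered between consecutive frames
    step_dists = np.linalg.norm(np.diff(positions_mm, axis=0), axis=1)
    speeds = step_dists * fps  # mm per frame -> mm per second
    return speeds, step_dists.sum()

# Toy trace: a marker moving 1 mm per frame along the x-axis at 100 fps.
trace = np.column_stack([np.arange(5.0), np.zeros(5), np.zeros(5)])
speeds, length = path_kinematics(trace, fps=100)
print(length)   # 4.0 mm travelled over 5 frames
print(speeds)   # each inter-frame speed is 100.0 mm/s
```

Plotting such a trace in 3-D (e.g. with matplotlib) yields the frozen motion trajectories discussed in the next paragraph.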
This talk presents MoCap studies recently carried out in the Natural Media Lab (HumTec, RWTH Aachen). One focus is on qualitative analyses of iconic gestures whose iconicity was augmented through visualizing and freezing their motion trajectories. By affording graphic information, MoCap plots of gestures describing paintings revealed additional qualities, e.g., regarding underlying image schemas (Mittelberg 2013, 2014). MoCap was further used to visualize gestural diagrams representing travel itineraries that dialogue partners drew into the air (Schüller & Mittelberg in press).
The second focus concerns quantitative approaches that involve normalizing numeric gesture data in order to aggregate movements across subjects. This allowed us to visualize speakers’ individual use of gesture space via heat maps (Priesters & Mittelberg 2013) and to find patterns in head gestures across speakers (Brenger & Mittelberg 2015). Gesture signatures of selected movement types further served as input for an algorithm that searches the data set for similar trajectories, affording automated retrieval of all tokens of a given gesture type (Beecks et al. 2016; Schüller et al. 2017).
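The normalization-and-aggregation idea can be sketched as follows. This is a minimal stand-in, not the published method: it assumes each speaker's hand positions are re-expressed in a body-centered frame (here, origin and scale are estimated from the trace itself) before being pooled into a 2-D histogram over the frontal plane, the basis of a gesture-space heat map:

```python
import numpy as np

def gesture_space_heatmap(subject_traces, bins=4):
    """subject_traces: list of (T, 2) arrays of (x, y) hand positions,
    one per speaker, each with its own origin and measurement scale.
    Returns a (bins, bins) array of pooled sample counts."""
    normalized = []
    for trace in subject_traces:
        trace = np.asarray(trace, dtype=float)
        origin = trace.mean(axis=0)                  # stand-in torso reference
        scale = np.ptp(trace, axis=0).max() or 1.0   # stand-in body scale
        normalized.append((trace - origin) / scale)
    pooled = np.vstack(normalized)
    # Bin all normalized samples into a common grid over [-1, 1] x [-1, 1]
    heat, _, _ = np.histogram2d(pooled[:, 0], pooled[:, 1],
                                bins=bins, range=[[-1, 1], [-1, 1]])
    return heat

# Two toy speakers whose raw coordinates live on very different scales.
a = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 0.0]])
b = np.array([[100.0, 100.0], [300.0, 300.0], [500.0, 100.0]])
heat = gesture_space_heatmap([a, b])
print(heat.sum())  # 6.0 -- after normalization, all samples share one grid
```

The resulting count grid can be rendered as a heat map (e.g. with matplotlib's `imshow`); a real pipeline would anchor the coordinate frame to tracked body landmarks rather than to trace statistics.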
Christian WOLF has been associate professor (Maître de Conférences, HDR) at INSA de Lyon and LIRIS UMR 5205, a CNRS laboratory, since 2005. Since 2017 he has been on leave with INRIA (Chroma work group) and CITI. He is interested in computer vision, machine learning, and deep learning, especially the visual analysis of complex scenes in motion: gesture and activity recognition and pose estimation. His work puts an emphasis on models of complex interactions, on structured models, graphical models, and deep learning. He received his MSc in computer science from Vienna University of Technology (TU Wien) in 2000 and a PhD in computer science from INSA de Lyon, France, in 2003. In 2012 he obtained the habilitation diploma, also from INSA de Lyon.
Learning human motion: gestures, activities, pose, identity
This talk is devoted to (deep) learning methods advancing the automatic analysis and interpretation of human motion from different perspectives and based on various sources of information, such as images, video, depth, MoCap data, audio, and inertial sensors. We propose several models and associated training algorithms for supervised classification and for semi-supervised and weakly supervised feature learning, as well as for modelling temporal dependencies, and show their efficiency on a set of fundamental tasks, including detection, classification, parameter estimation, and user verification.
Advances in several applications will be shown, including (i) gesture spotting and recognition based on multi-scale and multi-modal deep learning from visual signals; (ii) human activity recognition using models of visual attention; (iii) hand pose estimation through deep regression from depth images, based on semi-supervised and weakly supervised learning; and (iv) mobile biometrics, in particular the automatic authentication of smartphone users through learning from data acquired by inertial sensors.