Leonid Raskin, Ph.D. Thesis Seminar
Wednesday, 15.7.2009, 10:00
Tracking humans, understanding their actions and interpreting them are crucial to a great variety of applications. Tracking is used in automated surveillance, human-computer interfaces and security applications. During the last decade extensive research has been conducted on this subject. Analysis of human interactions is a complicated and challenging task for several reasons. First, the large number of body parts makes it hard to detect each part separately. Second, differences in clothing style, illumination conditions and similar factors add to the already great variety of images. Finally, in order to achieve a satisfactory understanding of pose, one has to resolve the ambiguity caused by body articulation.
The focus of our research is to provide a robust algorithm for 3D body part tracking and action classification. We will present a method based on nonlinear dimensionality reduction from a high dimensional data space to a low dimensional latent space (GPLVM, GPDM, H-GPLVM). Human body motion is described by a concatenation of low dimensional manifolds that characterize different motion types. We will introduce a body pose tracker based on the Annealed Particle Filter. The tracker uses the learned mapping function from latent space to body pose space in order to achieve robust and accurate tracking of human body parts. For the tracking we separate the model state into two independent parts: the first contains information about the 3D location and orientation of the body, and the second describes the pose. We learn a latent space that describes poses only. The tracking algorithm consists of two stages. First, the particles are generated in the latent space and are transformed into the data space using the mapping function learned a priori. Second, we add rotation and translation parameters to obtain valid poses. The likelihood function is then calculated in order to evaluate how well a pose matches the visual data. The resulting tracker estimates the locations in the latent space that represent the poses with the highest likelihood. These points in the latent space provide low dimensional representations of body pose sequences representing a specific action type and are used later to classify human actions.
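The two-stage tracking step described above can be illustrated with a minimal toy sketch; it is not the implementation from the thesis. Everything named here is an assumption made for illustration: `latent_to_pose` is a hypothetical closed-form stand-in for the learned GPLVM mapping, the latent space is taken to be two dimensional, the global part of the state is taken to be six rotation and translation parameters, and the likelihood is a simple exponentiated squared error against a target state vector rather than a comparison with image data.

```python
import math
import random

GLOBAL_DIM = 6  # assumed: 3 translation + 3 rotation parameters


def latent_to_pose(latent):
    """Hypothetical stand-in for the learned latent-to-pose mapping."""
    x, y = latent
    return [math.sin(x), math.cos(x), math.sin(y), math.cos(y)]


def full_state(latent, global_params):
    """Stage 1: map the latent point to a pose. Stage 2: append rotation/translation."""
    return latent_to_pose(latent) + list(global_params)


def squared_error(state, observation):
    """Toy mismatch measure standing in for the image-based likelihood."""
    return sum((s - o) ** 2 for s, o in zip(state, observation))


def annealed_track(particles, observation, betas, noise, rng):
    """Run one frame of a toy annealed particle filter.

    particles: list of (latent, global_params) pairs.
    betas: non-empty list of increasing annealing exponents, one per layer.
    Returns the highest-likelihood particle of the final layer.
    """
    for layer, beta in enumerate(betas):
        weights = [math.exp(-beta * squared_error(full_state(l, g), observation))
                   for l, g in particles]
        if layer == len(betas) - 1:
            best = max(range(len(particles)), key=weights.__getitem__)
            return particles[best]
        if sum(weights) == 0.0:
            weights = [1.0] * len(particles)  # fall back to uniform resampling
        chosen = rng.choices(particles, weights=weights, k=len(particles))
        sigma = noise * 0.5 ** layer  # shrink diffusion as the layers sharpen
        particles = [([v + rng.gauss(0.0, sigma) for v in l],
                      [v + rng.gauss(0.0, sigma) for v in g])
                     for l, g in chosen]


# Usage on synthetic data: recover a known latent point and global parameters.
rng = random.Random(0)
true_latent = [0.5, -1.0]
true_global = [0.1, -0.2, 0.3, 0.0, 0.0, 0.0]
observation = full_state(true_latent, true_global)
initial = [([rng.uniform(-math.pi, math.pi) for _ in range(2)],
            [rng.gauss(0.0, 0.5) for _ in range(GLOBAL_DIM)])
           for _ in range(200)]
best_latent, best_global = annealed_track(
    initial, observation, [0.5, 1.0, 2.0, 4.0], 0.3, rng)
```

In this sketch only the low dimensional latent coordinates and the global parameters are sampled and diffused, while the full pose is always obtained through the mapping, which mirrors the separation of the model state described above. The sequence of `best_latent` points collected over frames would play the role of the low dimensional trajectory that the abstract says is later used for action classification.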
The approach is illustrated on the HumanEvaI and HumanEvaII datasets, as well as on other datasets that include scenarios of interactions between people. Our tracker is shown to be robust when classifying individual actions and is also capable of the harder task of classifying interactions between people.