I am a Ph.D. student in the Department of Perceiving Systems at the Max Planck Institute for Intelligent Systems, advised by Professor Michael J. Black and Dr. Dimitrios Tzionas. I am interested in modeling and capturing human body and hand motion, with a focus on Human-Object Interaction (HOI). More specifically, my research focuses on precise estimation and generation of body and hand motion for interacting with, grasping, and using new 3D objects. I am also interested in precise body mocap using multimodal sensors (IMUs, cameras, touch and flex sensors, etc.) to capture accurate interactions and feedback from the environment.
Hands are important to humans for signaling and communication, as well as for interacting with the physical world. Capturing the motion of hands is a very challenging computer vision problem that is also highly relevant for other areas like computer graphics, human-computer interfaces, and robotics.
In Computer Vision – ECCV 2020, Springer International Publishing, Cham, August 2020 (inproceedings)
Training computers to understand, model, and synthesize human grasping requires a rich dataset containing complex 3D object shapes, detailed contact information, hand pose and shape, and the 3D body motion over time. While "grasping" is commonly thought of as a single hand stably lifting an object, we capture the motion of the entire body and adopt the generalized notion of "whole-body grasps". Thus, we collect a new dataset, called GRAB (GRasping Actions with Bodies), of whole-body grasps, containing full 3D shape and pose sequences of 10 subjects interacting with 51 everyday objects of varying shape and size. Given MoCap markers, we fit the full 3D body shape and pose, including the articulated face and hands, as well as the 3D object pose. This gives detailed 3D meshes over time, from which we compute contact between the body and object. This is a unique dataset that goes well beyond existing ones for modeling and understanding how humans grasp and manipulate objects, how their full body is involved, and how interaction varies with the task. We illustrate the practical value of GRAB with an example application: we train GrabNet, a conditional generative network, to predict 3D hand grasps for unseen 3D object shapes. The dataset and code are available for research purposes at https://grab.is.tue.mpg.de.
Our goal is to understand the principles of Perception, Action and Learning in autonomous systems that successfully interact with complex environments and to use this understanding to design future systems.