We collect 3D scans of human hands (left) from multiple people and learn a statistical model of the human hand, called MANO, that captures pose and shape variation across the population. We then combine the MANO hand model with our SMPL body model to build a holistic model called SMPL+H. The figure (right) shows example 3D scans (white) from our 4D sequences and corresponding fits of SMPL+H to these scans (pink). SMPL+H captures natural motions even under challenging conditions, such as severe missing data due to fast motion, occlusion, finger webbing, or noise.
Hands are important to humans for signaling and communication, as well as for interacting with the physical world. Capturing the motion of hands is a very challenging computer vision problem that is also highly relevant for other areas like computer graphics, human-computer interfaces, and robotics.
We focus on building an accurate and realistic model of the human hand [ ] that captures the pose and shape variation across a human population. For this, we collect many examples of human hands with our 3D scanner, following a systematic grasp taxonomy [ ]. We then combine the hand model with our SMPL body model to build a seamless model of the body together with hands, called SMPL+H. This allows us to capture people performing expressive body and hand motion naturally with our 4D scanner.
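The MANO/SMPL family of models follows a common recipe: a template mesh is deformed by learned shape blend shapes and then posed with linear blend skinning. The sketch below illustrates that structure with placeholder arrays; the array contents, function name, and sizes (beyond MANO's published 778 vertices and 16 joints) are illustrative assumptions, not the released model's actual data or API.

```python
import numpy as np

# Hedged sketch of an SMPL/MANO-style formulation. All arrays here are
# random placeholders standing in for the learned model parameters.
V, J, B = 778, 16, 10          # MANO has 778 vertices, 16 joints; B shape betas assumed
rng = np.random.default_rng(0)

template = rng.standard_normal((V, 3))        # mean template mesh
shape_dirs = rng.standard_normal((V, 3, B))   # shape blend-shape basis
skin_weights = rng.random((V, J))             # per-vertex skinning weights
skin_weights /= skin_weights.sum(axis=1, keepdims=True)

def hand_model_sketch(betas, joint_transforms):
    """betas: (B,) shape coefficients; joint_transforms: (J, 4, 4) world transforms."""
    # 1) Shape-dependent rest mesh: template plus linear shape blend shapes.
    v_shaped = template + shape_dirs @ betas
    # 2) Linear blend skinning: each vertex blends its joints' transforms.
    v_h = np.concatenate([v_shaped, np.ones((V, 1))], axis=1)  # homogeneous coords
    per_vertex_T = np.einsum("vj,jab->vab", skin_weights, joint_transforms)
    return np.einsum("vab,vb->va", per_vertex_T, v_h)[:, :3]

# With zero betas and identity joint transforms, the output is the template.
verts = hand_model_sketch(np.zeros(B), np.tile(np.eye(4), (J, 1, 1)))
```

The real models additionally apply pose-dependent blend shapes before skinning and regress joint locations from the shaped mesh; this sketch omits both for brevity.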
A strong hand model can be used to regularize fitting to noisy input data to reconstruct hands and/or objects [ ]. We focus on hands that interact with other hands or known objects [ ], using either a single RGB-D camera or multiple synchronized RGB cameras. Interaction cues can also reveal information that helps to reconstruct unknown properties of the object, like the kinematic skeleton [ ].
Our current work focuses on estimating hands performing tasks from a single image or video. We use our hand model to generate synthetic training data of hand-object interaction and use deep learning to reconstruct hand-object configurations jointly from a single RGB image. By estimating the 3D hand and object shape together, we are able to reason about interactions such as proximity, contact, grasp stability, and forces while preventing interpenetration.
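Once hand and object shape are estimated jointly, interaction terms such as penetration and contact can be scored from signed distances between hand vertices and the object surface. The following is a minimal sketch of that idea, assuming a sphere as a stand-in object so the signed distance is analytic; the function names, thresholds, and loss forms are illustrative assumptions, not our actual training objectives.

```python
import numpy as np

def sphere_sdf(points, center, radius):
    """Signed distance to a sphere: negative inside, positive outside."""
    return np.linalg.norm(points - center, axis=-1) - radius

def interaction_terms(hand_verts, center, radius, contact_eps=0.005):
    d = sphere_sdf(hand_verts, center, radius)
    # Penetration: penalize hand vertices that fall inside the object (d < 0).
    penetration = np.sum(np.clip(-d, 0.0, None) ** 2)
    # Contact: for vertices near the surface, penalize distance beyond a
    # small tolerance, encouraging them to touch rather than hover.
    near = np.abs(d) < 0.02
    contact = np.sum(np.clip(np.abs(d) - contact_eps, 0.0, None)[near])
    return penetration, contact

# Toy check: one vertex inside the object, one on its surface, one far away.
verts = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [1.0, 0.0, 0.0]])
pen, con = interaction_terms(verts, center=np.zeros(3), radius=0.1)
```

In a learned system these terms would be differentiable losses over the predicted hand and object meshes; the sphere merely makes the geometry explicit.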
In collaboration with the Haptic Intelligence department, we are extending our hand capture and modeling to account for the soft-tissue deformation of the hand during contact and manipulation [ ]. This is critical for realistic physical reasoning about grasping.
ObMan (CVPR 2019) dataset (link)
Synthetic dataset with:
- RGB rendered images
- fully annotated ground truth (meshes, model parameters, etc.) for hand & object
SIGGRAPH-Asia 2017 (TOG) models/dataset (link)
Models, alignments, and scans for:
- hand only (MANO)
- body+hand (SMPL+H)
ECCVw 2016 dataset (link)
RGB-D dataset of an object under manipulation.
Also contains the input 3D template mesh for each object and the output articulated models.
IJCV 2016 dataset (link)
Annotated RGB-D + multi-camera RGB dataset of one or two hands
interacting with each other and/or with a rigid or articulated object
ICCV 2015 dataset (link)
RGB-D dataset of a hand rotating a rigid object for 3D scanning
GCPR 2014 dataset (link)
Annotated RGB-D dataset of one or two hands interacting with each other
GCPR 2013 dataset (link)
Synthetic dataset of two hands interacting with each other