Even though many challenges remain unsolved, computer graphics algorithms for rendering photo-realistic imagery have seen tremendous progress in recent years. An important prerequisite for high-quality renderings is the availability of good models of the scenes to be rendered, namely models of shape, motion and appearance. Unfortunately, the technology to create such models has not kept pace with the technology to render the imagery. In fact, we observe a content creation bottleneck, as it often takes person-months of tedious manual work by animation artists to craft models of moving virtual scenes.
To overcome this limitation, the research community has been developing techniques to capture models of dynamic scenes from real-world examples, for instance methods that rely on footage recorded with cameras or other sensors. One example is performance capture methods that measure detailed dynamic surface models, for example of actors or an actor's face, from multi-view video and without markers in the scene. Even though such 4D capture methods have made great strides, they are still at an early stage of their development. Their application is limited to scenes of moderate complexity in controlled environments, reconstructed detail is limited, and captured content cannot be easily modified, to name only a few restrictions.
In this talk, I will elaborate on some ideas on how to go beyond this limited scope of 4D reconstruction, and show some results from our recent work. For instance, I will show how we can capture more complex scenes with many objects or subjects in close interaction, as well as very challenging scenes at a smaller scale, such as hand motion. The talk will also show how we can capitalize on more sophisticated light transport models and inverse rendering to enable high-quality reconstruction in much less controlled scenes, eventually also outdoors, and with very few cameras. I will also demonstrate how to represent captured scenes such that they can be conveniently modified. If time allows, the talk will cover some of our recent ideas on how to perform advanced edits of videos (e.g. removing or modifying dynamic objects in scenes) by exploiting reconstructed 4D models, as well as robustly found inter- and intra-frame correspondences.
Biography: Christian Theobalt is a Professor of Computer Science and the head of the research group "Graphics, Vision, & Video" at the Max-Planck-Institute for Informatics, Saarbruecken, Germany. From 2007 to 2009 he was a Visiting Assistant Professor in the Department of Computer Science at Stanford University. He received his MSc degree in Artificial Intelligence from the University of Edinburgh, Scotland, and his Diplom (MS) degree in Computer Science from Saarland University, in 2000 and 2001 respectively. In 2005, he received his PhD (Dr.-Ing.) from Saarland University and the Max-Planck-Institute for Informatics.
Most of his research deals with algorithmic problems that lie on the boundary between the fields of Computer Vision and Computer Graphics, such as dynamic 3D scene reconstruction and marker-less motion capture, computer animation, appearance and reflectance modelling, machine learning for graphics and vision, new sensors for 3D acquisition, advanced video processing, as well as image- and physically-based rendering.
For his work, he has received several awards, including the Otto Hahn Medal of the Max Planck Society in 2007, the EUROGRAPHICS Young Researcher Award in 2009, and the German Pattern Recognition Award in 2012. Further, in 2013 he was awarded an ERC Starting Grant by the European Union. He is a Principal Investigator and a member of the Steering Committee of the Intel Visual Computing Institute in Saarbruecken. He is also a co-founder of a spin-off company from his group, www.thecaptury.com, which is commercializing a new generation of marker-less motion and performance capture solutions.