Department Talks

SDF-2-SDF: 3D Reconstruction of Rigid and Deformable Objects from RGB-D Videos

Talk
  • 19 October 2017 • 10:00 - 11:00
  • Slobodan Ilic and Mira Slavcheva
  • PS Seminar Room (N3.022)

In this talk we will address the problem of 3D reconstruction of rigid and deformable objects from a single depth video stream. Traditional 3D registration techniques, such as ICP and its variants, are widespread and effective, but sensitive to initialization and noise due to the underlying correspondence estimation procedure. Therefore, we have developed SDF-2-SDF, a dense, correspondence-free method which aligns a pair of implicit representations of scene geometry, e.g. signed distance fields, by minimizing their direct voxel-wise difference. In its rigid variant, we apply it to static object reconstruction via real-time frame-to-frame camera tracking and posterior multiview pose optimization, achieving higher accuracy and a wider convergence basin than ICP variants. Its extension to scene reconstruction, SDF-TAR, carries out the implicit-to-implicit registration over several limited-extent volumes anchored in the scene and runs simultaneous GPU tracking and CPU refinement, with a lower memory footprint than other SLAM systems. Finally, to handle non-rigidly moving objects, we incorporate the SDF-2-SDF energy in a variational framework, regularized by a damped approximately Killing vector field. The resulting system, KillingFusion, is able to reconstruct objects undergoing topological changes and fast inter-frame motion in near real time.
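
The idea of aligning two signed distance fields by their direct voxel-wise difference can be illustrated with a toy sketch. Everything below is an assumption made for illustration and not the speakers' implementation: the fields are 2D, generated analytically from a circle, only integer translations are considered, and the minimum is found by exhaustive search instead of a proper optimizer. The names `circle_sdf`, `sdf_difference` and `align` are hypothetical.

```python
import math

def circle_sdf(cx, cy, radius, size, trunc=3.0):
    """Truncated signed distance field of a circle on a size x size grid."""
    return [[max(-trunc, min(trunc, math.hypot(x - cx, y - cy) - radius))
             for x in range(size)] for y in range(size)]

def sdf_difference(ref, cx, cy, radius, size):
    """Half the sum of squared voxel-wise differences between two fields."""
    cur = circle_sdf(cx, cy, radius, size)
    return 0.5 * sum((r - c) ** 2
                     for row_r, row_c in zip(ref, cur)
                     for r, c in zip(row_r, row_c))

def align(ref, start, radius, size, search=6):
    """Try every integer shift in a window and keep the lowest energy."""
    best = None
    for dx in range(-search, search + 1):
        for dy in range(-search, search + 1):
            e = sdf_difference(ref, start[0] + dx, start[1] + dy, radius, size)
            if best is None or e < best[0]:
                best = (e, dx, dy)
    return best

SIZE, RADIUS = 32, 6.0
reference = circle_sdf(16, 16, RADIUS, SIZE)          # reference field
energy, dx, dy = align(reference, (12, 19), RADIUS, SIZE)  # misaligned start
print(dx, dy, energy)
```

No point correspondences are computed at any stage: the energy compares the two fields voxel by voxel, which is the property the abstract contrasts with ICP.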

Organizers: Fatma Güney

3D lidar mapping: an accurate and performant approach

Talk
  • 20 October 2017 • 11:30 - 12:30
  • Michiel Vlaminck
  • PS Seminar Room (N3.022)

In my talk I will present my work regarding 3D mapping using lidar scanners. I will give an overview of the SLAM problem and its main challenges: robustness, accuracy and processing speed. Regarding robustness and accuracy, we investigate a better point cloud representation based on resampling and surface reconstruction. Moreover, we demonstrate how it can be incorporated in an ICP-based scan matching technique. Finally, we elaborate on globally consistent mapping using loop closures. Regarding processing speed, we propose the integration of our scan matching in a multi-resolution scheme and a GPU-accelerated implementation using our programming language Quasar.

Organizers: Simon Donne

Multi-View Perception of Dynamic Scenes

IS Colloquium
  • 20 March 2014 • 11:15 - 12:30
  • Edmond Boyer
  • Max Planck House Lecture Hall

The INRIA MORPHEO research team is working on the perception of moving shapes using multiple camera systems. Such systems allow the recovery of dense information on shapes and their motions using visual cues. This opens avenues for research on how to model, understand and animate real dynamic shapes using several videos. In this talk I will focus in particular on the team's recent work on two fundamental components of the multi-view perception of dynamic scenes: (i) the recovery of time-consistent shape models, or shape tracking, and (ii) the segmentation of objects in multiple views and over time.

Organizers: Gerard Pons-Moll


  • Prof. Yoshinari Kameda
  • MRC seminar room (0.A.03)

This talk presents our 3D video production method, with which a user can watch a real game from any viewpoint. Players in the game are captured by 10 cameras and reproduced three-dimensionally in real time using a billboard-based representation. In producing the 3D video, we have also worked on a good user interface that enables people to move the camera intuitively. As the speaker also works on a wide variety of topics from computer vision to augmented reality, selected recent work will also be introduced briefly.

Dr. Yoshinari Kameda began his research with human pose estimation in his Ph.D. thesis and has since expanded his interests to computer vision, human interfaces, and augmented reality.
He is now an associate professor at the University of Tsukuba.
He is also a member of the Center for Computational Science of U-Tsukuba, where some outstanding supercomputers are in operation.
He served as an area chair of the International Symposium on Mixed and Augmented Reality for four years (2007-2010).


  • Christof Hoppe
  • MRC Seminar Room

3D reconstruction from 2D still images (Structure-from-Motion) has reached maturity, and together with new image acquisition devices such as Micro Aerial Vehicles (MAVs), interesting new application scenarios arise. However, acquiring an image set suited for a complete and accurate reconstruction is a non-trivial task even for expert users. To overcome this problem, we propose two different methods. In the first part of the talk, we will present an SfM method that performs sparse reconstruction of 10Mpx still images and surface extraction from sparse and noisy 3D point clouds in real time. To this end, we developed a novel, efficient image localisation method and a robust surface extraction that works in a fully incremental manner directly on sparse 3D points, without a densification step. The real-time feedback on the reconstruction quality enables the user to control the acquisition process interactively. In the second part, we will present ongoing work on a novel view planning method that is designed to deliver a set of images that can be processed by today's multi-view reconstruction pipelines.


  • Bernt Schiele
  • Max Planck House Lecture Hall

This talk will highlight recent progress on two fronts. First, we will talk about a novel image-conditioned person model that allows for effective articulated pose estimation in realistic scenarios. Second, we describe our work towards activity recognition and the ability to describe video content with natural language. 

Both efforts are part of a longer-term agenda towards visual scene understanding. While visual scene understanding has long been advocated as the "holy grail" of computer vision, we believe it is time to address this challenge again, based on the progress in recent years.


  • Pascal Fua
  • Max Planck House Lecture Hall

In this talk, I will show that, given probabilities of presence of people at various locations in individual time frames, finding the most likely set of trajectories amounts to solving a linear program that depends on very few parameters.
This can be done without requiring appearance information and in real-time, by using the K-Shortest Paths algorithm (KSP). However, this can result in unwarranted identity switches in complex scenes. In such cases, sparse image information can be used within the Linear Programming framework to keep track of people's identities, even when their paths come close to each other or intersect. By sparse, we mean that the appearance needs only be discriminative in a very limited number of frames, which makes our approach widely applicable.


  • Alessandra Tosi
  • Max Planck Haus Lecture Hall

Manifold learning techniques attempt to map a high-dimensional space onto a lower-dimensional one. From a mathematical point of view, a manifold is a topological Hausdorff space that is locally Euclidean. From a machine learning point of view, we can interpret this embedded manifold as the underlying support of the data distribution. When dealing with high-dimensional data sets, nonlinear dimensionality reduction methods can provide a more faithful data representation than linear ones. However, the local geometrical distortion induced by the nonlinear mapping leads to a loss of information and affects interpretability, with a negative impact on the model visualization results.
This talk will discuss an approach which involves probabilistic nonlinear dimensionality reduction through Gaussian Process Latent Variable Models. The main focus is on the intrinsic geometry of the model itself as a tool to improve the exploration of the latent space and to recover information lost due to dimensionality reduction. We aim to analytically quantify and visualize the distortion due to dimensionality reduction in order to improve the performance of the model and to interpret data in a more faithful way.

In collaboration with: N.D. Lawrence (University of Sheffield), A. Vellido (UPC)


Perceptual Grouping using Superpixels

Talk
  • 11 November 2013 • 02:00:00
  • Sven Dickinson
  • MPH Lecture Hall

Perceptual grouping played a prominent role in support of early object recognition systems, which typically took an input image and a database of shape models and identified which of the models was visible in the image.  When the database was large, local features were not sufficiently distinctive to prune down the space of models to a manageable number that could be verified.  However, when causally related shape features were grouped, using intermediate-level shape priors, e.g., cotermination, symmetry, and compactness, they formed effective shape indices and allowed databases to grow in size.  In recent years, the recognition (categorization) community has focused on the object detection problem, in which the input image is searched for a specific target object.  Since indexing is not required to select the target model, perceptual grouping is not required to construct a discriminative shape index; the existence of a much stronger object-level shape prior precludes the need for a weaker intermediate-level shape prior.  As a result, perceptual grouping activity at our major conferences has diminished. However, there are clear signs that the recognition community is moving from appearance back to shape, and from detection back to unexpected object recognition. Shape-based perceptual grouping will play a critical role in facilitating this transition.  But while causally related features must be grouped, they also need to be abstracted before they can be matched to categorical models.   In this talk, I will describe our recent progress on the use of intermediate shape priors in segmenting, grouping, and abstracting shape features. Specifically, I will describe the use of symmetry and non-accidental attachment to detect and group symmetric parts, the use of closure to separate figure from background, and the use of a vocabulary of simple shape models to group and abstract image contours.


  • Padmanabhan Anandan
  • MPH Lecture Hall

T.b.a.


Exploring and editing the appearance of outdoor scenes

Talk
  • 11 October 2013 • 09:30:00
  • Pierre-Yves Laffont
  • MRZ seminar

The appearance of outdoor scenes changes dramatically with lighting and weather conditions, time of day, and season. Specific conditions, such as the "golden hours" characterized by warm light, can be hard to capture because many scene properties are transient -- they change over time. Despite significant advances in image editing software, common image manipulation tasks such as lighting editing require significant expertise to achieve plausible results.
 
In this talk, we first explore the appearance of outdoor scenes with an approach based on crowdsourcing and machine learning. We relate visual changes to scene attributes, which are human-nameable concepts used for high-level description of scenes. We collect a dataset containing thousands of outdoor images, annotate them with transient attributes, and train classifiers to recognize these properties in new images. We develop new interfaces for browsing photo collections, based on these attributes.
 
We then focus on specifically extracting and manipulating the lighting in a photograph. Intrinsic image decomposition separates a photograph into independent layers: reflectance, which represents the color of the materials, and illumination, which encodes the effect of lighting at each pixel. We tackle this ill-posed problem by leveraging additional information provided by multiple photographs of the scene. The methods we describe enable advanced image manipulations such as lighting-aware editing, insertion of virtual objects, and image-based illumination transfer between photographs of a collection.
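
The layer model behind intrinsic image decomposition can be made concrete with a small sketch. It assumes a single-channel image in which each pixel is the product of reflectance and illumination, and it assumes the reflectance layer is already known. Estimating that layer is the ill-posed problem the talk addresses and is not attempted here. The function names and the 2x2 values are hypothetical.

```python
def multiply(a, b):
    """Per-pixel product of two equally sized single-channel layers."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def divide(a, b):
    """Per-pixel quotient; b is assumed to be strictly positive."""
    return [[x / y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

# Toy 2x2 scene: the right-hand column lies in shadow (illumination 0.5).
reflectance = [[0.5, 0.5], [0.25, 0.5]]
illumination = [[1.0, 0.5], [1.0, 0.5]]
image = multiply(reflectance, illumination)

# With reflectance known, the illumination layer follows by division.
recovered_illumination = divide(image, reflectance)

# Lighting-aware edit: repaint the materials, keep the original shading.
new_reflectance = [[0.75, 0.75], [0.75, 0.75]]
edited = multiply(new_reflectance, recovered_illumination)
print(edited)
```

Because the edit touches only the reflectance layer, the shadow in the right-hand column survives the repaint, which is what makes this kind of editing lighting-aware.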
 


Inference in highly-connected CRFs

Talk
  • 01 October 2013 • 08:00:00
  • Neill Campbell
  • MPH lecture hall

This talk presents recent work from CVPR that looks at inference for pairwise CRF models in the highly (or fully) connected case rather than simply a sparse set of neighbours used ubiquitously in many computer vision tasks. Recent work has shown that fully-connected CRFs, where each node is connected to every other node, can be solved very efficiently under the restriction that the pairwise term is a Gaussian kernel over a Euclidean feature space. The method presented generalises this model to allow arbitrary, non-parametric models (which can be learnt from training data and conditioned on test data) to be used for the pairwise potentials. This greatly increases the expressive power of such models whilst maintaining efficient inference.
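
For orientation, the restricted model mentioned above (a fully connected CRF whose pairwise term is a Gaussian kernel over feature vectors) can be sketched with a naive mean-field update. This is only an illustration under stated assumptions: it uses a Potts label compatibility, evaluates all pairs directly in quadratic time instead of using the efficient inference the abstract refers to, and does not implement the generalised non-parametric potentials presented in the talk. The function names, parameters and toy data are hypothetical.

```python
import math

def softmax(scores):
    """Normalise scores into probabilities, shifted for numerical stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gaussian_kernel(fi, fj, bandwidth):
    """Gaussian kernel on the squared Euclidean distance between features."""
    d2 = sum((a - b) ** 2 for a, b in zip(fi, fj))
    return math.exp(-d2 / (2.0 * bandwidth ** 2))

def mean_field(unary, features, weight=1.0, bandwidth=1.0, iterations=10):
    """Naive mean-field inference; unary[i][l] is the cost of label l at node i."""
    n, num_labels = len(unary), len(unary[0])
    kernel = [[0.0 if i == j else gaussian_kernel(features[i], features[j], bandwidth)
               for j in range(n)] for i in range(n)]
    q = [softmax([-u for u in row]) for row in unary]
    for _ in range(iterations):
        new_q = []
        for i in range(n):
            scores = []
            for l in range(num_labels):
                # Potts model: pay for the probability that node j disagrees.
                penalty = sum(kernel[i][j] * (1.0 - q[j][l]) for j in range(n))
                scores.append(-unary[i][l] - weight * penalty)
            new_q.append(softmax(scores))
        q = new_q
    return q

# Two clusters in a 1D feature space; nodes 1 and 4 carry noisy unary costs.
features = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
unary = [[0.0, 2.0], [0.6, 0.4], [0.0, 2.0],
         [2.0, 0.0], [0.4, 0.6], [2.0, 0.0]]
marginals = mean_field(unary, features)
labels = [row.index(max(row)) for row in marginals]
print(labels)
```

In this toy run the two nodes with weak, contradictory unary costs are pulled towards the label of their close neighbours in feature space, while the distant cluster has almost no influence because the kernel decays with distance.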