2015


Permutohedral Lattice CNNs

Kiefel, M., Jampani, V., Gehler, P. V.

In ICLR Workshop Track, May 2015 (inproceedings)

Abstract
This paper presents a convolutional layer that is able to process sparse input features. As an example, for image recognition problems this allows an efficient filtering of signals that do not lie on a dense grid (like pixel position), but of more general features (such as color values). The presented algorithm makes use of the permutohedral lattice data structure. The permutohedral lattice was introduced to efficiently implement a bilateral filter, a commonly used image processing operation. Its use allows for a generalization of the convolution type found in current (spatial) convolutional network architectures.

pdf link (url) [BibTex]
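
The abstract above names the bilateral filter as the operation the permutohedral lattice was introduced to accelerate. As a rough illustration of that operation only, here is a brute-force 1D bilateral filter. It is a sketch under stated assumptions, not the lattice algorithm and not the paper's layer: the function name, the Gaussian weights, and the parameters `sigma_space`, `sigma_value`, and `radius` are all illustrative choices.

```python
import math


def bilateral_filter_1d(signal, sigma_space=2.0, sigma_value=0.1, radius=5):
    """Brute-force bilateral filter on a 1D signal (illustrative only).

    Each output sample is a weighted average of its neighbours. The weight
    is the product of a Gaussian on the distance in position and a Gaussian
    on the difference in value, so samples across a strong edge barely mix.
    """
    out = []
    for i, center in enumerate(signal):
        weighted_sum = 0.0
        weight_total = 0.0
        lo = max(0, i - radius)
        hi = min(len(signal), i + radius + 1)
        for j in range(lo, hi):
            spatial = math.exp(-((i - j) ** 2) / (2 * sigma_space ** 2))
            tonal = math.exp(-((center - signal[j]) ** 2) / (2 * sigma_value ** 2))
            weight = spatial * tonal
            weighted_sum += weight * signal[j]
            weight_total += weight
        # weight_total is at least 1 because the sample always weights itself fully.
        out.append(weighted_sum / weight_total)
    return out


# A noisy step edge: values near 0, then values near 1.
noisy_step = [0.0, 0.05, -0.03, 0.02, 0.0, 1.0, 0.97, 1.04, 1.0, 0.98]
smoothed = bilateral_filter_1d(noisy_step)
```

The double loop costs time proportional to the signal length times the window size, which is the cost that lattice-based approximations aim to reduce.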

Consensus Message Passing for Layered Graphical Models

Jampani, V., Eslami, S. M. A., Tarlow, D., Kohli, P., Winn, J.

In Eighteenth International Conference on Artificial Intelligence and Statistics (AISTATS), 38, pages: 425-433, JMLR Workshop and Conference Proceedings, May 2015 (inproceedings)

Abstract
Generative models provide a powerful framework for probabilistic reasoning. However, in many domains their use has been hampered by the practical difficulties of inference. This is particularly the case in computer vision, where models of the imaging process tend to be large, loopy and layered. For this reason bottom-up conditional models have traditionally dominated in such domains. We find that widely-used, general-purpose message passing inference algorithms such as Expectation Propagation (EP) and Variational Message Passing (VMP) fail on the simplest of vision models. With these models in mind, we introduce a modification to message passing that learns to exploit their layered structure by passing 'consensus' messages that guide inference towards good solutions. Experiments on a variety of problems show that the proposed technique leads to significantly more accurate inference results, not only when compared to standard EP and VMP, but also when compared to competitive bottom-up conditional models.

online pdf supplementary link (url) [BibTex]

Shape Models of the Human Body for Distributed Inference

Zuffi, S.

Brown University, May 2015 (phdthesis)

Abstract
In this thesis we address the problem of building shape models of the human body, in 2D and 3D, which are realistic and efficient to use. We focus our efforts on the human body, which is highly articulated and has interesting shape variations, but the approaches we present here can be applied to generic deformable and articulated objects. To address efficiency, we constrain our models to be part-based and have a tree-structured representation with pairwise relationships between connected parts. This allows the application of methods for distributed inference based on message passing. To address realism, we exploit recent advances in computer graphics that represent the human body with statistical shape models learned from 3D scans. We introduce two articulated body models, a 2D model, named Deformable Structures (DS), which is a contour-based model parameterized for 2D pose and projected shape, and a 3D model, named Stitchable Puppet (SP), which is a mesh-based model parameterized for 3D pose, pose-dependent deformations and intrinsic body shape. We have successfully applied the models to interesting and challenging problems in computer vision and computer graphics, namely pose estimation from static images, pose estimation from video sequences, pose and shape estimation from 3D scan data. This advances the state of the art in human pose and shape estimation and suggests that carefully defined realistic models can be important for computer vision. More work at the intersection of vision and graphics is thus encouraged.

PDF [BibTex]


Multi-view and 3D Deformable Part Models

Pepik, B., Stark, M., Gehler, P., Schiele, B.

Pattern Analysis and Machine Intelligence, 37(11):14, IEEE, March 2015 (article)

Abstract
As objects are inherently 3-dimensional, they were modeled in 3D in the early days of computer vision. Due to the ambiguities arising from mapping 2D features to 3D models, 3D object representations have been neglected and 2D feature-based models are the predominant paradigm in object detection nowadays. While such models have achieved outstanding bounding box detection performance, they come with limited expressiveness, as they are clearly limited in their capability of reasoning about 3D shape or viewpoints. In this work, we bring the worlds of 3D and 2D object representations closer, by building an object detector which leverages the expressive power of 3D object representations while at the same time can be robustly matched to image evidence. To that end, we gradually extend the successful deformable part model [1] to include viewpoint information and part-level 3D geometry information, resulting in several different models with different levels of expressiveness. We end up with a 3D object model, consisting of multiple object parts represented in 3D and a continuous appearance model. We experimentally verify that our models, while providing richer object hypotheses than the 2D object models, provide consistently better joint object localization and viewpoint estimation than the state-of-the-art multi-view and 3D object detectors on various benchmarks (KITTI [2], 3D object classes [3], Pascal3D+ [4], Pascal VOC 2007 [5], EPFL multi-view cars [6]).

DOI Project Page [BibTex]

From Scans to Models: Registration of 3D Human Shapes Exploiting Texture Information

Bogo, F.

University of Padova, March 2015 (phdthesis)

Abstract
New scanning technologies are increasing the importance of 3D mesh data, and of algorithms that can reliably register meshes obtained from multiple scans. Surface registration is important e.g. for building full 3D models from partial scans, identifying and tracking objects in a 3D scene, creating statistical shape models. Human body registration is particularly important for many applications, ranging from biomedicine and robotics to the production of movies and video games; but obtaining accurate and reliable registrations is challenging, given the articulated, non-rigidly deformable structure of the human body. In this thesis, we tackle the problem of 3D human body registration. We start by analyzing the current state of the art, and find that: a) most registration techniques rely only on geometric information, which is ambiguous on flat surface areas; b) there is a lack of adequate datasets and benchmarks in the field. We address both issues. Our contribution is threefold. First, we present a model-based registration technique for human meshes that combines geometry and surface texture information to provide highly accurate mesh-to-mesh correspondences. Our approach estimates scene lighting and surface albedo, and uses the albedo to construct a high-resolution textured 3D body model that is brought into registration with multi-camera image data using a robust matching term. Second, by leveraging our technique, we present FAUST (Fine Alignment Using Scan Texture), a novel dataset collecting 300 high-resolution scans of 10 people in a wide range of poses. FAUST is the first dataset providing both real scans and automatically computed, reliable "ground-truth" correspondences between them. Third, we explore possible uses of our approach in dermatology. 
By combining our registration technique with a melanocytic lesion segmentation algorithm, we propose a system that automatically detects new or evolving lesions over almost the entire body surface, thus helping dermatologists identify potential melanomas. We conclude this thesis investigating the benefits of using texture information to establish frame-to-frame correspondences in dynamic monocular sequences captured with consumer depth cameras. We outline a novel approach to reconstruct realistic body shape and appearance models from dynamic human performances, and show preliminary results on challenging sequences captured with a Kinect.

[BibTex]


Active Learning for Abstract Models of Collectives

Schiendorfer, A., Lassner, C., Anders, G., Reif, W., Lienhart, R.

In 3rd Workshop on Self-optimisation in Organic and Autonomic Computing Systems (SAOS), March 2015 (inproceedings)

Abstract
Organizational structures such as hierarchies provide an effective means to deal with the increasing complexity found in large-scale energy systems. In hierarchical systems, the concrete functions describing the subsystems can be replaced by abstract piecewise linear functions to speed up the optimization process. However, if the data points are weakly informative the resulting abstracted optimization problem introduces severe errors and exhibits bad runtime performance. Furthermore, obtaining additional point labels amounts to solving computationally hard optimization problems. Therefore, we propose to apply methods from active learning to search for informative inputs. We present first results experimenting with Decision Forests and Gaussian Processes that motivate further research. Using points selected by Decision Forests, we could reduce the average mean-squared error of the abstract piecewise linear function by one third.

code (hosted on github) pdf [BibTex]

Long Range Motion Estimation and Applications

Sevilla-Lara, L.

University of Massachusetts Amherst, February 2015 (phdthesis)

Abstract
Finding correspondences between images underlies many computer vision problems, such as optical flow, tracking, stereovision and alignment. Finding these correspondences involves formulating a matching function and optimizing it. This optimization process is often gradient descent, which avoids exhaustive search, but relies on the assumption of being in the basin of attraction of the right local minimum. This is often the case when the displacement is small, and current methods obtain very accurate results for small motions. However, when the motion is large and the matching function is bumpy this assumption is less likely to be true. One traditional way of avoiding this abruptness is to smooth the matching function spatially by blurring the images. As the displacement becomes larger, the amount of blur required to smooth the matching function also becomes larger. This averaging of pixels leads to a loss of detail in the image. Therefore, there is a trade-off between the size of the objects that can be tracked and the displacement that can be captured. In this thesis we address the basic problem of increasing the size of the basin of attraction in a matching function. We use an image descriptor called distribution fields (DFs). By blurring the images in DF space instead of in pixel space, we increase the size of the basin of attraction with respect to traditional methods. We show competitive results using DFs both in object tracking and optical flow. Finally we demonstrate an application of capturing large motions for temporal video stitching.

[BibTex]

Spike train SIMilarity Space (SSIMS): A framework for single neuron and ensemble data analysis

Vargas-Irwin, C. E., Brandman, D. M., Zimmermann, J. B., Donoghue, J. P., Black, M. J.

Neural Computation, 27(1):1-31, MIT Press, January 2015 (article)

Abstract
We present a method to evaluate the relative similarity of neural spiking patterns by combining spike train distance metrics with dimensionality reduction. Spike train distance metrics provide an estimate of similarity between activity patterns at multiple temporal resolutions. Vectors of pair-wise distances are used to represent the intrinsic relationships between multiple activity patterns at the level of single units or neuronal ensembles. Dimensionality reduction is then used to project the data into concise representations suitable for clustering analysis as well as exploratory visualization. Algorithm performance and robustness are evaluated using multielectrode ensemble activity data recorded in behaving primates. We demonstrate how Spike train SIMilarity Space (SSIMS) analysis captures the relationship between goal directions for an 8-directional reaching task and successfully segregates grasp types in a 3D grasping task in the absence of kinematic information. The algorithm enables exploration of virtually any type of neural spiking (time series) data, providing similarity-based clustering of neural activity states with minimal assumptions about potential information encoding models.

pdf: publisher site pdf: author's proof DOI Project Page [BibTex]
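
The first stage described in the abstract above, turning spike trains into vectors of pairwise distances, can be sketched as follows. The abstract does not name a specific metric, so this sketch assumes a Victor-Purpura-style edit distance; the function names, the `cost_per_second` parameter, and the toy spike times are hypothetical. The dimensionality-reduction stage is omitted.

```python
def victor_purpura_distance(train_a, train_b, cost_per_second):
    """Edit distance between two sorted lists of spike times (in seconds).

    Deleting or inserting a spike costs 1; shifting a spike by dt seconds
    costs cost_per_second * |dt|. Computed by dynamic programming.
    """
    n, m = len(train_a), len(train_b)
    table = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        table[i][0] = float(i)  # delete every spike of train_a
    for j in range(1, m + 1):
        table[0][j] = float(j)  # insert every spike of train_b
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            shift = cost_per_second * abs(train_a[i - 1] - train_b[j - 1])
            table[i][j] = min(
                table[i - 1][j] + 1.0,        # delete a spike
                table[i][j - 1] + 1.0,        # insert a spike
                table[i - 1][j - 1] + shift,  # move a spike in time
            )
    return table[n][m]


def pairwise_distance_vectors(trains, cost_per_second):
    """Row i holds the distances from spike train i to every train."""
    return [
        [victor_purpura_distance(a, b, cost_per_second) for b in trains]
        for a in trains
    ]


# Toy data: the first two trains are similar, the third is different.
trains = [[0.10, 0.50, 0.90], [0.12, 0.48, 0.95], [0.30, 0.31]]
distance_vectors = pairwise_distance_vectors(trains, cost_per_second=10.0)
```

Varying `cost_per_second` changes the temporal resolution of the comparison: a value near zero compares spike counts only, while a large value treats even slightly shifted spikes as different. Each row of `distance_vectors` could then be passed to a dimensionality-reduction method of the reader's choice.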

Efficient Facade Segmentation using Auto-Context

Jampani, V., Gadde, R., Gehler, P. V.

In Applications of Computer Vision (WACV), 2015 IEEE Winter Conference on, pages: 1038-1045, IEEE, January 2015 (inproceedings)

Abstract
In this paper we propose a system for the problem of facade segmentation. Building facades are highly structured images and consequently most methods that have been proposed for this problem aim to make use of this strong prior information. We describe a system that is almost domain independent and consists of standard segmentation methods. A sequence of boosted decision trees is stacked using auto-context features and learned using the stacked generalization technique. We find that this technique, albeit standard, matches or outperforms all previously published empirical results on all available facade benchmark datasets. The proposed method is simple to implement, easy to extend, and very efficient at test time inference.

website pdf supplementary IEEE page link (url) DOI Project Page [BibTex]

Norm-induced entropies for decision forests

Lassner, C., Lienhart, R.

IEEE Winter Conference on Applications of Computer Vision (WACV), January 2015 (conference)

Abstract
The entropy measurement function is a central element of decision forest induction. The Shannon entropy and other generalized entropies such as the Rényi and Tsallis entropy are designed to fulfill the Khinchin-Shannon axioms. Whereas these axioms are appropriate for physical systems, they do not necessarily model well the artificial system of decision forest induction. In this paper, we show that when omitting two of the four axioms, every norm induces an entropy function. The remaining two axioms are sufficient to describe the requirements for an entropy function in the decision forest context. Furthermore, we introduce and analyze the p-norm-induced entropy, show relations to existing entropies and the relation to various heuristics that are commonly used for decision forest training. In experiments with classification, regression and the recently introduced Hough forests, we show how the discrete and differential form of the new entropy can be used for forest induction and how the functions can simply be fine-tuned. The experiments indicate that the impact of the entropy function is limited, but that tuning it can be a simple and useful post-processing step for optimizing decision forests for high performance applications.

pdf code [BibTex]
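
For readers unfamiliar with the existing entropies the abstract above compares against, here is a minimal sketch of the Shannon, Rényi, and Tsallis entropies applied to the class distribution at a tree node. It uses their standard textbook definitions with natural logarithms. The p-norm-induced entropy itself is not reproduced, since the abstract does not state its formula; the function names and the example labels are illustrative.

```python
import math


def shannon_entropy(probabilities):
    """H(p) = -sum_i p_i * ln(p_i); zero-probability classes contribute nothing."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


def renyi_entropy(probabilities, alpha):
    """H_alpha(p) = ln(sum_i p_i^alpha) / (1 - alpha), for alpha > 0, alpha != 1."""
    if alpha <= 0.0 or alpha == 1.0:
        raise ValueError("alpha must be positive and different from 1")
    return math.log(sum(p ** alpha for p in probabilities if p > 0.0)) / (1.0 - alpha)


def tsallis_entropy(probabilities, q):
    """S_q(p) = (1 - sum_i p_i^q) / (q - 1), for q != 1."""
    if q == 1.0:
        raise ValueError("q must be different from 1")
    return (1.0 - sum(p ** q for p in probabilities if p > 0.0)) / (q - 1.0)


def class_distribution(labels):
    """Empirical class probabilities of the samples that reach a node."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return [count / len(labels) for count in counts.values()]


balanced = class_distribution(["a", "b", "a", "b"])  # maximally mixed node
pure = class_distribution(["a", "a", "a", "a"])      # perfectly pure node
```

All three measures are zero for the pure node and positive for the balanced one, which is the property a split criterion relies on when it scores candidate splits by the reduction in entropy.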

Dataset Suite for Benchmarking Perception in Robotics

Ahmad, A., Lima, P.

In International Conference on Intelligent Robots and Systems (IROS) 2015, 2015 (inproceedings)

[BibTex]

FlowCap: 2D Human Pose from Optical Flow

Romero, J., Loper, M., Black, M. J.

In Pattern Recognition, Proc. 37th German Conference on Pattern Recognition (GCPR), LNCS 9358, pages: 412-423, Springer, 2015 (inproceedings)

Abstract
We estimate 2D human pose from video using only optical flow. The key insight is that dense optical flow can provide information about 2D body pose. Like range data, flow is largely invariant to appearance but unlike depth it can be directly computed from monocular video. We demonstrate that body parts can be detected from dense flow using the same random forest approach used by the Microsoft Kinect. Unlike range data, however, when people stop moving, there is no optical flow and they effectively disappear. To address this, our FlowCap method uses a Kalman filter to propagate body part positions and velocities over time and a regression method to predict 2D body pose from part centers. No range sensor is required and FlowCap estimates 2D human pose from monocular video sources containing human motion. Such sources include hand-held phone cameras and archival television video. We demonstrate 2D body pose estimation in a range of scenarios and show that the method works with real-time optical flow. The results suggest that optical flow shares invariances with range data that, when complemented with tracking, make it valuable for pose estimation.

video pdf preprint Project Page Project Page [BibTex]
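
The role of the Kalman filter mentioned in the abstract above can be illustrated with a generic constant-velocity filter on a single coordinate. This is a sketch under stated assumptions and not the FlowCap implementation: the class name, the noise values, and the diagonal process noise are hypothetical simplifications. When no detection is available (a measurement of `None`), only the prediction step runs, so the tracked position keeps moving with the last estimated velocity.

```python
class ConstantVelocityKalman1D:
    """Kalman filter with state (position, velocity) and position-only measurements."""

    def __init__(self, position, velocity=0.0, process_noise=1e-3, measurement_noise=1e-2):
        self.position = position
        self.velocity = velocity
        self.cov = [[1.0, 0.0], [0.0, 1.0]]  # state covariance P
        self.q = process_noise                # added to the diagonal of P (simplification)
        self.r = measurement_noise            # variance of a position measurement

    def predict(self, dt=1.0):
        """Propagate the state with x = F x and P = F P F^T + Q, F = [[1, dt], [0, 1]]."""
        (a, b), (c, d) = self.cov
        self.position += dt * self.velocity
        self.cov = [
            [a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
            [c + dt * d, d + self.q],
        ]

    def update(self, measured_position):
        """Correct the state with a position measurement (H = [1, 0])."""
        (a, b), (c, d) = self.cov
        innovation_variance = a + self.r
        gain_position = a / innovation_variance
        gain_velocity = c / innovation_variance
        residual = measured_position - self.position
        self.position += gain_position * residual
        self.velocity += gain_velocity * residual
        self.cov = [
            [(1.0 - gain_position) * a, (1.0 - gain_position) * b],
            [c - gain_velocity * a, d - gain_velocity * b],
        ]


def track(measurements, dt=1.0):
    """Filter a sequence of position measurements; None means 'no detection'."""
    kalman = ConstantVelocityKalman1D(position=0.0)
    positions = []
    for measurement in measurements:
        kalman.predict(dt)
        if measurement is not None:
            kalman.update(measurement)
        positions.append(kalman.position)
    return positions, kalman


# Four detections of a part moving at about one unit per frame, then two missing frames.
positions, kalman = track([1.0, 2.0, 3.0, 4.0, None, None])
```

During the two missing frames the estimate advances by the filtered velocity each step, which is the kind of propagation that keeps a part from vanishing when the flow signal drops out.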

Towards Optimal Robot Navigation in Urban Homes

Ventura, R., Ahmad, A.

In RoboCup 2014: Robot World Cup XVIII, pages: 318-331, Lecture Notes in Computer Science ; 8992, Springer, Cham, Switzerland, 2015 (inproceedings)

Abstract
The work presented in this paper is motivated by the goal of dependable autonomous navigation of mobile robots. This goal is a fundamental requirement for having autonomous robots in spaces such as domestic spaces and public establishments, left unattended by technical staff. In this paper we tackle this problem by taking an optimization approach: on one hand, we use a Fast Marching Approach for path planning, resulting in optimal paths in the absence of unmapped obstacles, and on the other hand we use a Dynamic Window Approach for guidance. To the best of our knowledge, the combination of these two methods is novel. We evaluate the approach on a real mobile robot, capable of moving at high speed. The evaluation makes use of an external ground truth system. We report controlled experiments that we performed, including the presence of people moving randomly near the robot. In our long term experiments we report a total distance of 18 km traveled during 11 hours of movement time.

DOI [BibTex]

Metric Regression Forests for Correspondence Estimation

Pons-Moll, G., Taylor, J., Shotton, J., Hertzmann, A., Fitzgibbon, A.

International Journal of Computer Vision, pages: 1-13, 2015 (article)

springer PDF Project Page [BibTex]

Joint 3D Object and Layout Inference from a single RGB-D Image

(Best Paper Award)

Geiger, A., Wang, C.

In German Conference on Pattern Recognition (GCPR), 9358, pages: 183-195, Lecture Notes in Computer Science, Springer International Publishing, 2015 (inproceedings)

Abstract
Inferring 3D objects and the layout of indoor scenes from a single RGB-D image captured with a Kinect camera is a challenging task. Towards this goal, we propose a high-order graphical model and jointly reason about the layout, objects and superpixels in the image. In contrast to existing holistic approaches, our model leverages detailed 3D geometry using inverse graphics and explicitly enforces occlusion and visibility constraints for respecting scene properties and projective geometry. We cast the task as MAP inference in a factor graph and solve it efficiently using message passing. We evaluate our method with respect to several baselines on the challenging NYUv2 indoor dataset using 21 object categories. Our experiments demonstrate that the proposed method is able to infer scenes with a large degree of clutter and occlusions.

pdf suppmat video project DOI [BibTex]

3D Object Class Detection in the Wild

Pepik, B., Stark, M., Gehler, P., Ritschel, T., Schiele, B.

In Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), IEEE, 2015 (inproceedings)

Project Page [BibTex]

Discrete Optimization for Optical Flow

Menze, M., Heipke, C., Geiger, A.

In German Conference on Pattern Recognition (GCPR), 9358, pages: 16-28, Springer International Publishing, 2015 (inproceedings)

Abstract
We propose to look at large-displacement optical flow from a discrete point of view. Motivated by the observation that sub-pixel accuracy is easily obtained given pixel-accurate optical flow, we conjecture that computing the integral part is the hardest piece of the problem. Consequently, we formulate optical flow estimation as a discrete inference problem in a conditional random field, followed by sub-pixel refinement. Naive discretization of the 2D flow space, however, is intractable due to the resulting size of the label set. In this paper, we therefore investigate three different strategies, each able to reduce computation and memory demands by several orders of magnitude. Their combination allows us to estimate large-displacement optical flow both accurately and efficiently and demonstrates the potential of discrete optimization for optical flow. We obtain state-of-the-art performance on MPI Sintel and KITTI.

pdf suppmat project DOI [BibTex]

Joint 3D Estimation of Vehicles and Scene Flow

Menze, M., Heipke, C., Geiger, A.

In Proc. of the ISPRS Workshop on Image Sequence Analysis (ISA), 2015 (inproceedings)

Abstract
Three-dimensional reconstruction of dynamic scenes is an important prerequisite for applications like mobile robotics or autonomous driving. While much progress has been made in recent years, imaging conditions in natural outdoor environments are still very challenging for current reconstruction and recognition methods. In this paper, we propose a novel unified approach which reasons jointly about 3D scene flow as well as the pose, shape and motion of vehicles in the scene. Towards this goal, we incorporate a deformable CAD model into a slanted-plane conditional random field for scene flow estimation and enforce shape consistency between the rendered 3D models and the parameters of all superpixels in the image. The association of superpixels to objects is established by an index variable which implicitly enables model selection. We evaluate our approach on the challenging KITTI scene flow dataset in terms of object and scene flow estimation. Our results provide a proof of concept and demonstrate the usefulness of our method.

PDF [BibTex]

A Setup for multi-UAV hardware-in-the-loop simulations

Odelga, M., Stegagno, P., Bülthoff, H., Ahmad, A.

In pages: 204-210, IEEE, 2015 (inproceedings)

Abstract
In this paper, we present a hardware in the loop simulation setup for multi-UAV systems. With our setup, we are able to command the robots simulated in Gazebo, a popular open source ROS-enabled physical simulator, using the computational units that are embedded on our quadrotor UAVs. Hence, we can test in simulation not only the correct execution of algorithms, but also the computational feasibility directly on the robot hardware. In addition, since our setup is inherently multi-robot, we can also test the communication flow among the robots. We provide two use cases to show the characteristics of our setup.

link (url) DOI [BibTex]

Smooth Loops from Unconstrained Video

Sevilla-Lara, L., Wulff, J., Sunkavalli, K., Shechtman, E.

In Computer Graphics Forum (Proceedings of EGSR), 34(4):99-107, 2015 (inproceedings)

Abstract
Converting unconstrained video sequences into videos that loop seamlessly is an extremely challenging problem. In this work, we take the first steps towards automating this process by focusing on an important subclass of videos containing a single dominant foreground object. Our technique makes two novel contributions over previous work: first, we propose a correspondence-based similarity metric to automatically identify a good transition point in the video where the appearance and dynamics of the foreground are most consistent. Second, we develop a technique that aligns both the foreground and background about this transition point using a combination of global camera path planning and patch-based video morphing. We demonstrate that this allows us to create natural, compelling, loopy videos from a wide range of videos collected from the internet.

pdf link (url) DOI Project Page [BibTex]

Formation control driven by cooperative object tracking

Lima, P., Ahmad, A., Dias, A., Conceição, A., Moreira, A., Silva, E., Almeida, L., Oliveira, L., Nascimento, T.

Robotics and Autonomous Systems, 63(1):68-79, 2015 (article)

Abstract
In this paper we introduce a formation control loop that maximizes the performance of the cooperative perception of a tracked target by a team of mobile robots, while maintaining the team in formation, with a dynamically adjustable geometry which is a function of the quality of the target perception by the team. In the formation control loop, the controller module is a distributed non-linear model predictive controller and the estimator module fuses local estimates of the target state, obtained by a particle filter at each robot. The two modules and their integration are described in detail, including a real-time database associated to a wireless communication protocol that facilitates the exchange of state data while reducing collisions among team members. Simulation and real robot results for indoor and outdoor teams of different robots are presented. The results highlight how our method successfully enables a team of homogeneous robots to minimize the total uncertainty of the tracked target cooperative estimate while complying with performance criteria such as keeping a pre-set distance between the teammates and the target, avoiding collisions with teammates and/or surrounding obstacles.

DOI [BibTex]

Onboard robust person detection and tracking for domestic service robots

Sanz, D., Ahmad, A., Lima, P.

In Robot 2015: Second Iberian Robotics Conference, pages: 547-559, Advances in Intelligent Systems and Computing ; 418, Springer, Cham, Switzerland, 2015 (inproceedings)

Abstract
Domestic assistance for elderly and impaired people is one of the biggest upcoming challenges of our society. Consequently, in-home care through domestic service robots is identified as one of the most important application areas of robotics research. Assistive tasks may range from visitor reception at the door to catering for the owner's small daily necessities within a house. Since most of these tasks require the robot to interact directly with humans, a predominant robot functionality is to detect and track humans in real time: either the owner of the robot or visitors at home or both. In this article we present a robust method for such a functionality that combines depth-based segmentation and visual detection. The robustness of our method lies in its capability to not only identify partially occluded humans (e.g., with only torso visible) but also to do so in varying lighting conditions. We thoroughly validate our method through extensive experiments on real robot datasets and comparisons with the ground truth. The datasets were collected in a home-like environment set up within the context of RoboCup@Home and RoCKIn@Home competitions.

DOI [BibTex]


2010


Visibility Maps for Improving Seam Carving

Mansfield, A., Gehler, P., Van Gool, L., Rother, C.

In Media Retargeting Workshop, European Conference on Computer Vision (ECCV), September 2010 (inproceedings)

webpage pdf slides supplementary code [BibTex]

A 2D human body model dressed in eigen clothing

Guan, P., Freifeld, O., Black, M. J.

In European Conf. on Computer Vision, (ECCV), pages: 285-298, Springer-Verlag, September 2010 (inproceedings)

Abstract
Detection, tracking, segmentation and pose estimation of people in monocular images are widely studied. Two-dimensional models of the human body are extensively used, however, they are typically fairly crude, representing the body either as a rough outline or in terms of articulated geometric primitives. We describe a new 2D model of the human body contour that combines an underlying naked body with a low-dimensional clothing model. The naked body is represented as a Contour Person that can take on a wide variety of poses and body shapes. Clothing is represented as a deformation from the underlying body contour. This deformation is learned from training examples using principal component analysis to produce eigen clothing. We find that the statistics of clothing deformations are skewed and we model the a priori probability of these deformations using a Beta distribution. The resulting generative model captures realistic human forms in monocular images and is used to infer 2D body shape and pose under clothing. We also use the coefficients of the eigen clothing to recognize different categories of clothing on dressed people. The method is evaluated quantitatively on synthetic and real images and achieves better accuracy than previous methods for estimating body shape under clothing.

pdf data poster Project Page [BibTex]

Analyzing and Evaluating Markerless Motion Tracking Using Inertial Sensors

Baak, A., Helten, T., Müller, M., Pons-Moll, G., Rosenhahn, B., Seidel, H.

In European Conference on Computer Vision (ECCV Workshops), September 2010 (inproceedings)

pdf [BibTex]

Trainable, Vision-Based Automated Home Cage Behavioral Phenotyping

Jhuang, H., Garrote, E., Edelman, N., Poggio, T., Steele, A., Serre, T.

In Measuring Behavior, August 2010 (inproceedings)

pdf [BibTex]

Decoding complete reach and grasp actions from local primary motor cortex populations

(Featured in Nature’s Research Highlights (Nature, Vol 466, 29 July 2010))

Vargas-Irwin, C. E., Shakhnarovich, G., Yadollahpour, P., Mislow, J., Black, M. J., Donoghue, J. P.

J. of Neuroscience, 30(29):9659-9669, July 2010 (article)

pdf pdf from publisher Movie 1 Movie 2 Project Page [BibTex]

Multisensor-Fusion for 3D Full-Body Human Motion Capture

Pons-Moll, G., Baak, A., Helten, T., Müller, M., Seidel, H., Rosenhahn, B.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2010 (inproceedings)

project page pdf [BibTex]

Coded exposure imaging for projective motion deblurring

Tai, Y., Kong, N., Lin, S., Shin, S. Y.

In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages: 2408-2415, June 2010 (inproceedings)

Abstract
We propose a method for deblurring of spatially variant object motion. A principal challenge of this problem is how to estimate the point spread function (PSF) of the spatially variant blur. Based on the projective motion blur model, we present a blur estimation technique that jointly utilizes a coded exposure camera and simple user interactions to recover the PSF. With this spatially variant PSF, objects that exhibit projective motion can be effectively deblurred. We validate this method with several challenging image examples.

Publisher site [BibTex]


Tracking people interacting with objects

Kjellstrom, H., Kragic, D., Black, M. J.

In IEEE Conf. on Computer Vision and Pattern Recognition, CVPR, pages: 747-754, June 2010 (inproceedings)

pdf Video [BibTex]


Contour people: A parameterized model of 2D articulated human shape

Freifeld, O., Weiss, A., Zuffi, S., Black, M. J.

In IEEE Conf. on Computer Vision and Pattern Recognition, (CVPR), pages: 639-646, IEEE, June 2010 (inproceedings)

pdf slides video of CVPR talk Project Page [BibTex]


Secrets of optical flow estimation and their principles

Sun, D., Roth, S., Black, M. J.

In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 2432-2439, IEEE, June 2010 (inproceedings)

pdf Matlab code code copyright notice [BibTex]


Guest editorial: State of the art in image- and video-based human pose and motion estimation

Sigal, L., Black, M. J.

International Journal of Computer Vision, 87(1):1-3, March 2010 (article)

pdf from publisher [BibTex]


HumanEva: Synchronized video and motion capture dataset and baseline algorithm for evaluation of articulated human motion

Sigal, L., Balan, A., Black, M. J.

International Journal of Computer Vision, 87(1):4-27, Springer Netherlands, March 2010 (article)

Abstract
While research on articulated human motion and pose estimation has progressed rapidly in the last few years, there has been no systematic quantitative evaluation of competing methods to establish the current state of the art. We present data obtained using a hardware system that is able to capture synchronized video and ground-truth 3D motion. The resulting HumanEva datasets contain multiple subjects performing a set of predefined actions with a number of repetitions. On the order of 40,000 frames of synchronized motion capture and multi-view video (resulting in over one quarter million image frames in total) were collected at 60 Hz with an additional 37,000 time instants of pure motion capture data. A standard set of error measures is defined for evaluating both 2D and 3D pose estimation and tracking algorithms. We also describe a baseline algorithm for 3D articulated tracking that uses a relatively standard Bayesian framework with optimization in the form of Sequential Importance Resampling and Annealed Particle Filtering. In the context of this baseline algorithm we explore a variety of likelihood functions, prior models of human motion and the effects of algorithm parameters. Our experiments suggest that image observation models and motion priors play important roles in performance, and that in a multi-view laboratory environment, where initialization is available, Bayesian filtering tends to perform well. The datasets and the software are made available to the research community. This infrastructure will support the development of new articulated motion and pose estimation algorithms, will provide a baseline for the evaluation and comparison of new methods, and will help establish the current state of the art in human pose estimation and tracking.

pdf pdf from publisher [BibTex]


Modellbasierte Echtzeit-Bewegungsschätzung in der Fluoreszenzendoskopie

Stehle, T., Wulff, J., Behrens, A., Gross, S., Aach, T.

In Bildverarbeitung für die Medizin, 574, pages: 435-439, CEUR Workshop Proceedings, 2010 (inproceedings)

pdf [BibTex]


Robust one-shot 3D scanning using loopy belief propagation

Ulusoy, A., Calakli, F., Taubin, G.

In Computer Vision and Pattern Recognition Workshops (CVPRW), 2010 IEEE Computer Society Conference on, pages: 15-22, IEEE, 2010 (inproceedings)

Abstract
A structured-light technique can greatly simplify the problem of shape recovery from images. There are currently two main research challenges in the design of such techniques. One is handling complicated scenes involving texture, occlusions, shadows, sharp discontinuities, and in some cases even dynamic change; the other is speeding up the acquisition process by requiring a small number of images and computationally less demanding algorithms. This paper presents a “one-shot” variant of such techniques to tackle the aforementioned challenges. It works by projecting a static grid pattern onto the scene and identifying the correspondence between grid stripes and the camera image. The correspondence problem is formulated using a novel graphical model and solved efficiently using loopy belief propagation. Unlike prior approaches, the proposed approach uses non-deterministic geometric constraints and can thereby handle spurious connections of stripe images. The effectiveness of the proposed approach is verified on a variety of complicated real scenes.

pdf link (url) DOI [BibTex]


Scene Carving: Scene Consistent Image Retargeting

Mansfield, A., Gehler, P., Van Gool, L., Rother, C.

In European Conference on Computer Vision (ECCV), 2010 (inproceedings)

webpage+code pdf supplementary poster [BibTex]


Epione: An Innovative Pain Management System Using Facial Expression Analysis, Biofeedback and Augmented Reality-Based Distraction

Georgoulis, S., Eleftheriadis, S., Tzionas, D., Vrenas, K., Petrantonakis, P., Hadjileontiadis, L. J.

In Proceedings of the 2010 International Conference on Intelligent Networking and Collaborative Systems, pages: 259-266, INCOS ’10, IEEE Computer Society, Washington, DC, USA, 2010 (inproceedings)

Abstract
An innovative pain management system, namely Epione, is presented here. Epione deals with three main types of pain, i.e., acute pain, chronic pain, and phantom limb pain. In particular, by using facial expression analysis, Epione forms a dynamic pain meter, which then triggers biofeedback and augmented reality-based distraction scenarios, in an effort to maximize the patient's pain relief. This unique combination makes Epione not only a novel pain management approach, but also a means of understanding and integrating the needs of the whole community involved, i.e., patients and physicians, in a joint attempt to ease their suffering, provide efficient monitoring and contribute to a better quality of life.

Paper Project Page DOI [BibTex]


Phantom Limb Pain Management Using Facial Expression Analysis, Biofeedback and Augmented Reality Interfacing

Tzionas, D., Vrenas, K., Eleftheriadis, S., Georgoulis, S., Petrantonakis, P. C., Hadjileontiadis, L. J.

In Proceedings of the 3rd International Conference on Software Development for Enhancing Accessibility and Fighting Info-Exclusion, pages: 23-30, DSAI ’10, UTAD - Universidade de Trás-os-Montes e Alto Douro, 2010 (inproceedings)

Abstract
Post-amputation sensation often translates to the feeling of severe pain in the missing limb, referred to as phantom limb pain (PLP). A clear and rational treatment regimen is difficult to establish, as long as the underlying pathophysiology is not fully known. In this work, an innovative PLP management system is presented, as a module of a holistic computer-mediated pain management environment, namely Epione. The proposed Epione-PLP scheme is structured upon advanced facial expression analysis, used to form a dynamic pain meter, which, in turn, is used to trigger biofeedback and augmented reality-based PLP distraction scenarios. The latter incorporate a model of the missing limb for its visualization, in an effort to give the amputee the feeling of its existence and control, and, thus, maximize his/her PLP relief. The novel Epione-PLP management approach integrates edge-technology within the context of personalized health, and it could be used to ease PLP patients' suffering, provide efficient progress monitoring and contribute to an increase in their quality of life.

Paper Project Page link (url) [BibTex]


Automated Home-Cage Behavioral Phenotyping of Mice

Jhuang, H., Garrote, E., Mutch, J., Poggio, T., Steele, A., Serre, T.

Nature Communications, 2010 (article)

software, demo pdf [BibTex]


An automated action initiation system reveals behavioral deficits in MyosinVa deficient mice

Pandian, S., Edelman, N., Jhuang, H., Serre, T., Poggio, T., Constantine-Paton, M.

Society for Neuroscience, 2010 (conference)

pdf [BibTex]


Dense Marker-less Three Dimensional Motion Capture

Soren Hauberg, Bente Rona Jensen, Morten Engell-Norregaard, Kenny Erleben, Kim S. Pedersen

In Virtual Vistas; Eleventh International Symposium on the 3D Analysis of Human Movement, 2010 (inproceedings)

Conference site [BibTex]


ImageFlow: Streaming Image Search

Jampani, V., Ramos, G., Drucker, S.

MSR-TR-2010-148, Microsoft Research, Redmond, 2010 (techreport)

Abstract
Traditional grid and list representations of image search results are the dominant interaction paradigms that users face on a daily basis, yet it is unclear that such paradigms are well-suited for experiences where the user's task is to browse images for leisure, to discover new information or to seek particular images to represent ideas. We introduce ImageFlow, a novel image search user interface that explores a different alternative to the traditional presentation of image search results. ImageFlow presents image results on a canvas where we map semantic features (e.g., relevance, related queries) to the canvas' spatial dimensions (e.g., x, y, z) in a way that allows for several levels of engagement – from passively viewing a stream of images, to seamlessly navigating through the semantic space and actively collecting images for sharing and reuse. We have implemented our system as a fully functioning prototype, and we report on promising, preliminary usage results.

url pdf link (url) [BibTex]


Stick It! Articulated Tracking using Spatial Rigid Object Priors

Soren Hauberg, Kim S. Pedersen

In Computer Vision – ACCV 2010, 6494, pages: 758-769, Lecture Notes in Computer Science, (Editors: Kimmel, Ron and Klette, Reinhard and Sugimoto, Akihiro), Springer Berlin Heidelberg, 2010 (inproceedings)

Publishers site Paper site Code PDF [BibTex]


Gaussian-like Spatial Priors for Articulated Tracking

Soren Hauberg, Stefan Sommer, Kim S. Pedersen

In Computer Vision – ECCV 2010, 6311, pages: 425-437, Lecture Notes in Computer Science, (Editors: Daniilidis, Kostas and Maragos, Petros and Paragios, Nikos), Springer Berlin Heidelberg, 2010 (inproceedings)

Publishers site Paper site Code PDF [BibTex]


Reach to grasp actions in rhesus macaques: Dimensionality reduction of hand, wrist, and upper arm motor subspaces using principal component analysis

Vargas-Irwin, C., Franquemont, L., Shakhnarovich, G., Yadollahpour, P., Black, M., Donoghue, J.

2010 Abstract Viewer and Itinerary Planner, Society for Neuroscience, 2010, Online (conference)

[BibTex]


Layered image motion with explicit occlusions, temporal consistency, and depth ordering

Sun, D., Sudderth, E., Black, M. J.

In Advances in Neural Information Processing Systems 23 (NIPS), pages: 2226-2234, MIT Press, 2010 (inproceedings)

Abstract
Layered models are a powerful way of describing natural scenes containing smooth surfaces that may overlap and occlude each other. For image motion estimation, such models have a long history but have not achieved the wide use or accuracy of non-layered methods. We present a new probabilistic model of optical flow in layers that addresses many of the shortcomings of previous approaches. In particular, we define a probabilistic graphical model that explicitly captures: 1) occlusions and disocclusions; 2) depth ordering of the layers; 3) temporal consistency of the layer segmentation. Additionally, the optical flow in each layer is modeled by a combination of a parametric model and a smooth deviation based on an MRF with a robust spatial prior; the resulting model allows roughness in layers. Finally, a key contribution is the formulation of the layers using an image-dependent hidden field prior based on recent models for static scene segmentation. The method achieves state-of-the-art results on the Middlebury benchmark and produces meaningful scene segmentations as well as detected occlusion regions.

main paper supplemental material paper and supplemental material in one pdf file Project Page [BibTex]


Manifold Valued Statistics, Exact Principal Geodesic Analysis and the Effect of Linear Approximations

Stefan Sommer, Francois Lauze, Soren Hauberg, Mads Nielsen

In Computer Vision – ECCV 2010, 6316, pages: 43-56, (Editors: Daniilidis, Kostas and Maragos, Petros and Paragios, Nikos), Springer Berlin Heidelberg, 2010 (inproceedings)

Publishers site PDF [BibTex]
