Research Activities


Stereo Camera tracking on mobile phones (Project MOOV3D)

The objective of MOOV3D is to build a prototype mobile platform equipped with a stereoscopic camera and several different types of 3D display output: an auto-stereoscopic screen on the mobile device, 3D glasses, or a 3D high-definition television connected to the mobile. In the context of this project we developed a real-time stereo tracker that enables augmented reality applications on the mobile prototype, possibly with stereo output, so that it can be seen on an auto-stereoscopic screen or with 3D glasses. The tracker is based on the stereo images provided by the two cameras on board the mobile platform. It reconstructs matching points from the stereo pairs and then tracks them over the next frames. At each frame the 2D-3D correspondences are used to estimate the position of the camera. The tracker is implemented on Android 2.3, using OpenCV and JNI.
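The geometric core of such a tracker fits in a few OpenCV calls. The sketch below is illustrative only (the actual implementation runs on Android via JNI, not in Python): it assumes calibrated, rectified stereo with known 3x4 projection matrices P1 and P2 and intrinsics K, and that the 2D points are tracked across frames, e.g. with cv2.calcOpticalFlowPyrLK.

<code python>
import cv2
import numpy as np

def init_map(img_left, img_right, P1, P2):
    """Match features across a stereo pair and triangulate 3D points."""
    orb = cv2.ORB_create(1000)
    k1, d1 = orb.detectAndCompute(img_left, None)
    k2, d2 = orb.detectAndCompute(img_right, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    pts1 = np.float32([k1[m.queryIdx].pt for m in matches]).T  # 2xN
    pts2 = np.float32([k2[m.trainIdx].pt for m in matches]).T  # 2xN
    pts4d = cv2.triangulatePoints(P1, P2, pts1, pts2)   # 4xN homogeneous
    pts3d = (pts4d[:3] / pts4d[3]).T                    # Nx3 Euclidean
    return pts3d, pts1.T

def track_pose(pts3d, pts2d_tracked, K):
    """Estimate the camera pose from 2D-3D correspondences (PnP + RANSAC)."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.float32(pts3d), np.float32(pts2d_tracked), K, None)
    return cv2.Rodrigues(rvec)[0], tvec  # rotation matrix, translation
</code>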

Some more information and work-in-progress results can be found here

Code available: see the forge page here

Related publications:

http://ubee.enseeiht.fr/dokuwiki/lib/exe/fetch.php?media=vortex:stereojme.ogv|480x360


Real-time Camera tracking for visual effects (ROM Project)

The aim of the project ROM (“Real-time On set Matchmoving”) is to bridge the gap between the production and post-production processes in filmmaking. The main idea is to develop a system that allows the director of the movie to have at least a rough preview of the digital effects (3D rendering) that will later be added during post-production. This requires the development of real-time camera tracking tools able to recover the position of the camera using natural features, artificial reference markers, or both. The novelty of the research lies in three main aspects.
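The marker-based half of such a pipeline can be sketched with OpenCV's ArUco module standing in for the project's own reference markers; everything below (marker size, dictionary, function names) is a hypothetical illustration, not the ROM code.

<code python>
import cv2
import numpy as np

MARKER_SIZE = 0.10  # marker side length in meters (assumed)
# 3D corners of a square marker centered at the origin, in ArUco corner
# order: top-left, top-right, bottom-right, bottom-left.
OBJ_PTS = np.float32([[-1, 1, 0], [1, 1, 0],
                      [1, -1, 0], [-1, -1, 0]]) * MARKER_SIZE / 2

def camera_pose_from_marker(frame, K, dist):
    """Return (R, t) from the first detected marker, or None."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    corners, ids, _ = cv2.aruco.detectMarkers(gray, dictionary)
    if ids is None:
        return None  # no marker visible: fall back to natural features
    ok, rvec, tvec = cv2.solvePnP(OBJ_PTS, corners[0].reshape(4, 2), K, dist)
    return cv2.Rodrigues(rvec)[0], tvec
</code>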

Videos: https://www.youtube.com/watch?v=xp2hd-2cmJU https://www.youtube.com/watch?v=unybb0MFCOo

Code available: coming soon!


3D Visual Tracking

We introduced the use of point-based 3D models as a shape prior for real-time 3D tracking with a monocular camera. The joint use of point-based 3D models and the GPU makes it possible to adapt and simplify an existing tracking algorithm originally designed for triangular meshes. Point-based models are of particular interest in this context because they are the direct output of most laser scanners. We show that state-of-the-art techniques developed for point-based rendering can be used to compute, in real time, intermediate values required for visual tracking. In particular, apparent motion predictors at each pixel are computed in parallel, and novel views of the tracked object are generated online to help wide-baseline matching. Both computations derive from the same general surface splatting technique, which we implement, along with other low-level vision tasks, on the GPU, leading to a real-time tracking algorithm.
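As an illustration of the splatting primitive involved, the CPU sketch below projects a point cloud and depth-tests small square splats into an image and a z-buffer. The real technique uses elliptical (EWA) splat kernels and runs on the GPU; the names and the square-splat simplification here are assumptions.

<code python>
import numpy as np

def splat(points, colors, K, R, t, width, height, radius_px=2):
    """Render an Nx3 point cloud as square splats with a z-buffer test."""
    img = np.zeros((height, width, 3), np.float32)
    zbuf = np.full((height, width), np.inf, np.float32)
    cam = points @ R.T + t            # world -> camera coordinates
    proj = cam @ K.T                  # pinhole projection
    uv = proj[:, :2] / proj[:, 2:3]   # perspective divide
    for (u, v), z, c in zip(uv, cam[:, 2], colors):
        if z <= 0:
            continue                  # point behind the camera
        x0, y0 = int(round(u)), int(round(v))
        for y in range(y0 - radius_px, y0 + radius_px + 1):
            for x in range(x0 - radius_px, x0 + radius_px + 1):
                if 0 <= x < width and 0 <= y < height and z < zbuf[y, x]:
                    zbuf[y, x] = z    # depth test: keep the nearest splat
                    img[y, x] = c
    return img, zbuf
</code>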

http://ubee.enseeiht.fr/dokuwiki/lib/exe/fetch.php?media=vortex:leopard_occlusions.mp4|320x240

http://ubee.enseeiht.fr/dokuwiki/lib/exe/fetch.php?media=vortex:leopard_AR.mp4|320x240

Related publications:


Visual Markers

We designed C2Tag, a marker composed of a set of concentric circles of different radii. Thanks to their photometric and geometric properties, the markers are easily detectable and very robust to occlusions. A C2Tag can encode a small amount of information in the ratios between the different radii, making it suitable for augmented reality applications. In computer vision, such markers can be used for camera calibration and camera tracking, as the circles intrinsically encode projective properties that allow the pose of the camera to be computed.
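A rough idea of the detection principle can be given in a few lines: fit ellipses to edge contours, group those sharing a center, and read off the radius ratios. This is only an illustrative sketch under simplifying assumptions (a near fronto-parallel view, clean edges), not the actual C2Tag detector.

<code python>
import cv2
import numpy as np

def concentric_ratios(gray, center_tol=5.0):
    """Group ellipses by center and return radius ratios per candidate tag."""
    edges = cv2.Canny(gray, 50, 150)
    contours, _ = cv2.findContours(edges, cv2.RETR_LIST,
                                   cv2.CHAIN_APPROX_NONE)
    ellipses = [cv2.fitEllipse(c) for c in contours if len(c) >= 20]
    groups = {}
    for (cx, cy), (a, b), angle in ellipses:
        key = (round(cx / center_tol), round(cy / center_tol))
        groups.setdefault(key, []).append((a + b) / 4)   # mean radius
    tags = []
    for radii in groups.values():
        if len(radii) >= 2:                  # at least two concentric rings
            radii = sorted(radii)
            tags.append([r / radii[-1] for r in radii])  # ratios to outer
    return tags
</code>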

Robust marker detection for Augmented Ads

Shared patent with

Marker detection and camera pose estimation for augmented reality applications http://ubee.enseeiht.fr/dokuwiki/lib/exe/fetch.php?media=vortex:cones.mp4

Related publications:


Structure From Motion

We introduced a new unified Structure-from-Motion (SfM) paradigm in which images of circular point-pairs can be combined with images of natural points. An imaged circular point-pair encodes the 2D Euclidean structure of a world plane and can easily be derived from the image of a planar shape, especially one including circles. A classical SfM method generally runs in two steps: first a projective factorization of all matched image points (into projective cameras and points), and second a camera self-calibration that upgrades the obtained world from projective to Euclidean. This work shows how to introduce images of circular points in these two SfM steps; its key contribution is to provide the theoretical foundations for combining “classical” linear self-calibration constraints with additional ones derived from such images.
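For context, the textbook form of such a constraint (standard multi-view geometry, not necessarily the paper's exact formulation) is that the imaged circular points of any world plane lie on the image of the absolute conic:

<code latex>
% \omega = (K K^\top)^{-1} is the image of the absolute conic;
% \tilde{I}, \tilde{J} = \bar{\tilde{I}} are the imaged circular points.
\tilde{\mathbf{I}}^{\top} \,\omega\, \tilde{\mathbf{I}} = 0
% Splitting this complex equation into real and imaginary parts gives two
% linear constraints on the entries of \omega per imaged circular
% point-pair; these stack with the classical linear constraints (zero skew,
% unit aspect ratio, ...), and K is recovered from \omega by Cholesky
% factorization.
</code>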

Related publications:


Augmented reality and relighting

Photorealism is a key feature for any augmented reality application. While much work has aimed at improving the robustness of camera tracking in order to achieve a smooth blending of the virtual object into the scene, the relighting problem is still an open issue. In particular, the estimation of the ambient light is fundamental to providing a photorealistic rendering of the virtual object that is coherent with the global illumination. In collaboration with our industrial partner FittingBox, we are studying the relighting problem in the context of their virtual eyewear try-on solution. Estimating the direction and the intensity of the ambient light allows creating photorealistic effects on the virtual glasses while the user moves her head in front of the camera.
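Under a Lambertian assumption, a directional light can be estimated by least squares from per-pixel normals (available from the tracked 3D face and glasses model) and observed intensities. The snippet below is a hypothetical sketch of this idea, not FittingBox's method; it ignores shadowed pixels, which should be masked out in practice.

<code python>
import numpy as np

def estimate_light(normals, intensities):
    """normals: Nx3 unit vectors; intensities: N observed gray values.
    Solves normals @ s = intensities for s = albedo * light_direction,
    assuming Lambertian shading i = albedo * max(n . l, 0) on lit pixels."""
    s, *_ = np.linalg.lstsq(normals, intensities, rcond=None)
    strength = np.linalg.norm(s)
    return s / strength, strength   # unit light direction, intensity scale
</code>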

http://ubee.enseeiht.fr/dokuwiki/lib/exe/fetch.php?media=vortex:videoeclairage1.ogv|480x360 The direction of the light in the scene is estimated in real time in order to produce a more photo-realistic rendering.

http://ubee.enseeiht.fr/dokuwiki/lib/exe/fetch.php?media=vortex:relighting.ogv|480x360

Relighting applied to an augmented reality application that allows users to try glasses on.

This work has been done in collaboration with http://www.fittingbox.com/.


Photometric Stereo (Multiple lightings)

Estimating the shape and appearance of an object, given one or several images, is a still open and challenging research problem called 3D reconstruction. Among the different techniques available, photometric stereo produces highly accurate results when the lighting conditions have been identified. When these conditions are unknown, the problem becomes the so-called uncalibrated photometric stereo problem, which is ill-posed. We showed how total variation (TV) can be used to reduce the ambiguities of uncalibrated photometric stereo, and we studied two methods for estimating the parameters of the generalized bas-relief ambiguity.
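For reference, the calibrated Lambertian baseline that the uncalibrated problem generalizes can be written in a few lines (a sketch under standard assumptions, not our TV-based method). With m known light directions stacked in an m x 3 matrix L, each pixel satisfies I = L (albedo * n); in the uncalibrated case L is unknown, and the pseudo-normals are recovered only up to an invertible 3x3 transform, which the integrability constraint reduces to the three-parameter generalized bas-relief family.

<code python>
import numpy as np

def photometric_stereo(images, L):
    """images: list of m HxW arrays; L: m x 3 known light directions."""
    I = np.stack([im.ravel() for im in images])      # m x P observations
    B, *_ = np.linalg.lstsq(L, I, rcond=None)        # 3 x P, B = albedo * n
    albedo = np.linalg.norm(B, axis=0)
    normals = B / np.maximum(albedo, 1e-8)           # unit surface normals
    h, w = images[0].shape
    return normals.T.reshape(h, w, 3), albedo.reshape(h, w)
</code>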

http://ubee.enseeiht.fr/dokuwiki/lib/exe/fetch.php?media=vortex:Toulouse+Tech+Transfer-HD.ogv|480x360 The 3D reconstruction pipeline in our experiment room

This research work is the subject of a technology transfer project supported by http://www.toulouse-tech-transfer.com/

Related publications:


TOURA project - Mobile Augmented Reality for Cultural heritage

The spread of smartphones in our society makes it possible to imagine new services for users who will take advantage of this technology. Many smartphone apps nowadays offer services based on geolocation, and some of them integrate augmented reality by displaying points of interest (restaurants, monuments, etc.) superimposed on the image captured by the camera. These apps use the GPS and heading sensors integrated in the smartphone to determine the elements to be displayed.

Web site of the project: Project TOURA


SICASSE project - Mobile AR for Environment monitoring


Adaptive 3D streaming

Online galleries of 3D models typically provide two ways to preview a model before it is downloaded and viewed by the user: (i) by showing a set of thumbnail images of the 3D model taken from representative views (or keyviews); (ii) by showing a video of the 3D model as viewed from a moving virtual camera along a path determined by the content provider. We propose a third approach, called preview streaming, for mesh-based 3D objects: streaming and showing the parts of the mesh surface visible along the virtual camera path. This work focuses on the preview streaming architecture and framework, and presents our investigation into how such a system can best handle network congestion. We introduced three basic methods:

  1. stop-and-wait, where the camera pauses until sufficient data is buffered;
  2. reduce-speed, where the camera slows down in proportion to the reduced network bandwidth;
  3. reduce-quality, where the camera continues to move at the same speed but fewer vertices are sent and displayed, leading to lower mesh quality.

We further propose two advanced methods:

  1. keyview-aware, which trades off mesh quality and camera speed appropriately depending on how close the current view is to the keyviews (sketched after this list), and
  2. adaptive-zoom, which improves visual quality by moving the virtual camera away from the original path.

A user study reveals that our keyview-aware method is preferred over the basic methods. Moreover, the adaptive-zoom scheme compares favorably to the keyview-aware method, showing that path adaptation is a viable approach to handling bandwidth variation.
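The sketch below illustrates the keyview-aware trade-off with hypothetical thresholds (the actual algorithm and its parameters are described in the publications): when the buffer runs low near a keyview, the camera slows down to preserve quality; away from keyviews, motion stays smooth and mesh quality drops instead.

<code python>
def adapt(buffer_level, dist_to_keyview, full_speed=1.0, full_quality=1.0):
    """buffer_level and dist_to_keyview in [0, 1]; returns (speed, quality).
    Illustrative thresholds only, not the values used in the paper."""
    if buffer_level > 0.5:                 # enough data buffered: no trade-off
        return full_speed, full_quality
    shortage = 1.0 - 2.0 * buffer_level    # 0 (fine) .. 1 (starved)
    if dist_to_keyview < 0.2:              # near a keyview: preserve quality
        return full_speed * (1.0 - shortage), full_quality
    return full_speed, full_quality * (1.0 - shortage)  # else preserve speed
</code>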

Related publications:


Plant Modeling from Images based on Analysis-by-Synthesis

Plants are essential elements of virtual worlds for obtaining pleasant and realistic 3D environments. Even though mature computer vision techniques allow the reconstruction of challenging 3D objects from images, the high complexity of plant topology means that dedicated methods for generating 3D plant models must be devised. We propose an analysis-by-synthesis method which generates 3D models of a plant from both images and a priori knowledge of the plant species. Our method is based on a skeletonisation algorithm which generates a possible skeleton from a foliage segmentation. Then, a 3D generative model, based on a parametric model of branching systems that takes botanical knowledge into account, is built. This method extends previous work by constraining the resulting skeleton to follow the hierarchical organisation of natural branching structures. A first instance of a 3D model is generated, and its reprojection is compared with the original image. We then show that selecting the model from multiple proposals for the main branching structure of the plant and for the foliage improves the quality of the generated 3D model. By varying the parameter values of the generative model, we produce a series of candidate models, and a criterion based on comparing the reprojection of each 3D virtual plant with the original image selects the best model.
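The selection step can be summarized as scoring each candidate's reprojected silhouette against the segmented input. The sketch below uses intersection-over-union as the comparison criterion and an assumed render() callback; both are illustrative stand-ins rather than the method's actual criterion.

<code python>
import numpy as np

def select_best(candidates, target_mask, render):
    """Pick the candidate whose rendered silhouette best matches the image.
    render(candidate) is assumed to return a binary silhouette mask."""
    def iou(a, b):
        inter = np.logical_and(a, b).sum()
        union = np.logical_or(a, b).sum()
        return inter / union if union else 0.0
    scores = [iou(render(c), target_mask) for c in candidates]
    return candidates[int(np.argmax(scores))], max(scores)
</code>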

Related publications: