We introduced the use of point-based 3D models as a shape prior for real-time 3D tracking with a monocular camera. Combining point-based 3D models with the GPU makes it possible to adapt and simplify an existing tracking algorithm originally designed for triangular meshes. Point-based models are of particular interest in this context because they are the direct output of most laser scanners. We show that state-of-the-art techniques developed for point-based rendering can be used to compute, in real time, the intermediate values required for visual tracking. In particular, apparent motion predictors at each pixel are computed in parallel, and novel views of the tracked object are generated online to help wide-baseline matching. Both computations derive from the same general surface splatting technique, which we implement, along with other low-level vision tasks, on the GPU, leading to a real-time tracking algorithm.
Plants are essential elements of virtual worlds for creating pleasant and realistic 3D environments. Although mature computer vision techniques allow the reconstruction of challenging 3D objects from images, the high complexity of plant topology calls for dedicated methods for generating 3D plant models. We propose an analysis-by-synthesis method which generates 3D models of a plant from both images and a priori knowledge of the plant species.
Our method is based on a skeletonisation algorithm which generates a possible skeleton from a foliage segmentation. Then, a 3D generative model is built, based on a parametric model of branching systems that takes botanical knowledge into account. This method extends previous work by constraining the resulting skeleton to follow the hierarchical organisation of natural branching structures. A first instance of a 3D model is generated, and a reprojection of this model is compared with the original image. Then, we show that selecting the model from multiple proposals for the main branching structure of the plant and for the foliage improves the quality of the generated 3D model. By varying the parameter values of the generative model, we produce a series of candidate models. A criterion based on comparing the reprojection of the 3D virtual plant with the original image selects the best model.
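The selection step described above can be illustrated with a minimal sketch. It assumes that each candidate is rendered to a binary silhouette mask and that the comparison criterion is the intersection-over-union (IoU) with the segmented image; the actual criterion used in the work is not specified here, and the names `iou`, `select_best_candidate` and `render_bar` are hypothetical.

```python
def iou(mask_a, mask_b):
    """Intersection-over-union of two binary masks given as lists of rows."""
    inter = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                inter += 1
            if a or b:
                union += 1
    # Two empty masks are considered identical.
    return inter / union if union else 1.0


def select_best_candidate(candidates, observed_mask, render):
    """Return the candidate whose reprojection best matches the observed mask.

    `render` is a hypothetical function mapping a parameter set to a mask.
    """
    return max(candidates, key=lambda params: iou(render(params), observed_mask))


def render_bar(width, size=4):
    """Toy stand-in for the reprojection: fills the first `width` columns."""
    return [[1 if col < width else 0 for col in range(size)] for _ in range(size)]


observed = render_bar(2)
best = select_best_candidate([1, 2, 3], observed, render_bar)
print(best)
```

In this toy setting the parameter is a single width; in practice the candidates would be full parameter sets of the generative plant model and `render` would reproject the 3D model into the camera view.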
We introduced a new unified Structure-from-Motion (SfM) paradigm in which images of circular point-pairs can be combined with images of natural points. An imaged circular point-pair encodes the 2D Euclidean structure of a world plane and can easily be derived from the image of a planar shape, especially those including circles. A classical SfM method generally runs two steps: first a projective factorization of all matched image points (into projective cameras and points) and second a camera self-calibration that updates the obtained world from projective to Euclidean. This work shows how to introduce images of circular points in these two SfM steps, while its key contribution is to provide the theoretical foundations for combining “classical” linear self-calibration constraints with additional ones derived from such images.
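As background, the standard single-view form of the constraint that an imaged circular point places on the camera calibration can be written as follows. This is the textbook relation, given here only as an illustration; the symbols $K$, $\omega$, $\mathbf{a}$ and $\mathbf{b}$ are notation introduced for this sketch, and the formulation used in the work itself may differ.

```latex
% K: camera intrinsic matrix (assumed notation)
% \omega: image of the absolute conic
% \mathbf{i}: imaged circular point of a world plane, with real part a and imaginary part b
\[
\omega = K^{-\top} K^{-1}, \qquad
\mathbf{i}^{\top} \omega \, \mathbf{i} = 0, \qquad
\mathbf{i} = \mathbf{a} + \sqrt{-1}\,\mathbf{b}
\;\Longrightarrow\;
\begin{cases}
\mathbf{a}^{\top} \omega \, \mathbf{a} - \mathbf{b}^{\top} \omega \, \mathbf{b} = 0, \\
\mathbf{a}^{\top} \omega \, \mathbf{b} = 0.
\end{cases}
\]
```

Both equations are linear in the entries of $\omega$, which is why such constraints can be stacked with other linear self-calibration constraints.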
Estimating the shape and appearance of an object, given one or several images, is still an open and challenging research problem called 3D-reconstruction. Among the different techniques available, photometric stereo produces highly accurate results when the lighting conditions have been identified. When these conditions are unknown, the problem becomes the so-called uncalibrated photometric stereo problem, which is ill-posed. We showed how total variation (TV) can be used to reduce the ambiguities of uncalibrated photometric stereo, and we will study two methods for estimating the parameters of the generalized bas-relief ambiguity.
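For reference, the generalized bas-relief (GBR) ambiguity mentioned above is usually written as a three-parameter linear transformation. The notation below ($G$, $\mu$, $\nu$, $\lambda$, $\mathbf{b}$, $\mathbf{s}$, $z$) is the conventional one and is an assumption of this sketch, not taken from the text.

```latex
% G: GBR transformation with parameters \mu, \nu and \lambda \neq 0
% \mathbf{b}: albedo-scaled surface normal, \mathbf{s}: light vector, z(x, y): depth map
\[
G = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ \mu & \nu & \lambda \end{pmatrix},
\qquad
\bar{\mathbf{b}} = G^{-\top} \mathbf{b},
\qquad
\bar{\mathbf{s}} = G \, \mathbf{s},
\qquad
\bar{z}(x, y) = \lambda \, z(x, y) + \mu x + \nu y .
\]
% The Lambertian image is unchanged by the transformation:
\[
\bar{\mathbf{b}}^{\top} \bar{\mathbf{s}}
= \mathbf{b}^{\top} G^{-1} G \, \mathbf{s}
= \mathbf{b}^{\top} \mathbf{s} .
\]
```

Since the images are unchanged for any $(\mu, \nu, \lambda)$ with $\lambda \neq 0$, an additional prior such as total variation is needed to pick one solution.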
Online galleries of 3D models typically provide two ways to preview a model before the model is downloaded and viewed by the user: (i) by showing a set of thumbnail images of the 3D model taken from representative views (or keyviews); (ii) by showing a video of the 3D model as viewed from a moving virtual camera along a path determined by the content provider. We propose a third approach, called preview streaming, for mesh-based 3D objects: streaming and showing the parts of the mesh surfaces that are visible along the virtual camera path. This work focuses on the preview streaming architecture and framework and presents our investigation into how such a system can best handle network congestion. We introduced three basic methods: stop-and-wait, where the camera pauses until sufficient data is buffered; reduce-speed, where the camera slows down in accordance with the reduced network bandwidth; reduce-quality, where the camera continues to move at the same speed but fewer vertices are sent and displayed, leading to lower mesh quality.
We further propose two advanced methods: keyview-aware, which trades off mesh quality and camera speed appropriately depending on how close the current view is to the keyviews, and adaptive-zoom, which improves visual quality by moving the virtual camera away from the original path. A user study reveals that our keyview-aware method is preferred over the basic methods. Moreover, the adaptive-zoom scheme compares favorably to the keyview-aware method, showing that path adaptation is a viable approach to handling bandwidth variation.
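The three basic methods can be sketched as simple rate-control rules. This is a minimal illustration under assumed inputs (an available bandwidth, the data rate required at nominal camera speed, and a buffer level); the function names, parameters and proportional scaling rule are hypothetical and do not reproduce the actual system.

```python
def stop_and_wait(buffered_bytes, needed_bytes, nominal_speed):
    """Camera pauses (speed 0) until enough data is buffered."""
    return nominal_speed if buffered_bytes >= needed_bytes else 0.0


def reduce_speed(bandwidth, required_rate, nominal_speed):
    """Camera slows down in proportion to the available bandwidth.

    Assumes required_rate > 0 (bytes per second needed at nominal speed).
    """
    scale = min(1.0, bandwidth / required_rate)
    return nominal_speed * scale


def reduce_quality(bandwidth, required_rate, vertices_in_view):
    """Camera keeps its speed; fewer vertices are sent for the current view."""
    scale = min(1.0, bandwidth / required_rate)
    return int(vertices_in_view * scale)


# Example: bandwidth drops to half of what the nominal path requires.
print(stop_and_wait(500, 1000, 2.0))
print(reduce_speed(50_000, 100_000, 2.0))
print(reduce_quality(50_000, 100_000, 8000))
```

A keyview-aware policy could be seen as blending the last two rules, favouring lower speed near keyviews and lower quality elsewhere; the exact trade-off used in the work is not given here.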
The aim of the project ROM (“Real-time On set Matchmoving”) is to bridge the gap between the production and the post-production processes in filmmaking. The main idea is to develop a system that gives the director of the movie at least a rough preview of the digital effects (3D rendering) that will later be added during post-production. This requires the development of tools for real-time camera tracking that are able to recover the position of the camera using natural features, artificial reference markers, or both.
The novelty of the research relies on three main aspects:
The spread of smartphones in our society makes it possible to imagine new services for users that take advantage of this technology. Many smartphone apps now offer services based on geolocalization, and some of them integrate augmented reality through the display of points of interest (restaurants, monuments, etc.) superimposed on the image captured by the camera. These apps use the GPS and heading sensors integrated in the smartphone to determine the elements to be displayed.
Web site of the project: Project TOURA
We designed the C2Tag, a marker composed of a set of concentric circles of different radii. Thanks to their photometric and geometric properties, the markers are easily detectable and very robust to occlusions. A C2Tag can encode a small amount of information using the ratios between the different radii, which makes it suitable for augmented reality applications.
In computer vision, these markers can be used for camera calibration and for camera tracking, as the circles intrinsically encode some projective properties that allow the pose of the camera to be computed.
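The idea of coding information in radius ratios can be sketched as follows. This is a simplified illustration, not the actual C2Tag scheme: it assumes the radii have already been measured in a rectified (fronto-parallel) view, and the signature bank `BANK` and the functions `radius_ratios` and `decode_marker` are hypothetical.

```python
def radius_ratios(radii):
    """Ratios of the inner radii to the outermost radius (scale invariant)."""
    ordered = sorted(radii, reverse=True)
    outer = ordered[0]
    return [r / outer for r in ordered[1:]]


def decode_marker(radii, bank):
    """Return the ID in `bank` whose ratio signature is closest to the measurement."""
    signature = radius_ratios(radii)

    def squared_distance(marker_id):
        return sum((a - b) ** 2 for a, b in zip(signature, bank[marker_id]))

    return min(bank, key=squared_distance)


# Hypothetical bank: marker ID -> expected ratios of inner radii to outer radius.
BANK = {
    0: [0.8, 0.5],
    1: [0.7, 0.4],
}

print(decode_marker([10.0, 8.0, 5.0], BANK))    # exact match for ID 0
print(decode_marker([30.0, 21.3, 11.8], BANK))  # noisy measurement, closest to ID 1
```

Because only ratios are used, the decoded ID does not depend on the apparent size of the marker; handling perspective distortion would require rectifying the imaged ellipses first.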
Marker detection and camera pose estimation for Augmented Reality Applications