ESR5: Learning algorithms for inverse problems with new imaging modalities
Context: The task of acquiring high-quality immersive content (light fields, omni-directional videos) with a sufficiently high angular and spatial resolution remains technologically challenging, due to the complexity and size of the optics, the limitations of photosensors and, ultimately, the bottleneck of data storage. Capturing devices indeed often exhibit trade-offs between the sampling resolutions in the spatial, angular and temporal dimensions, and produce data contaminated by sensor noise or optical aberrations. Reconstructing the imaged scene with sufficient resolution and quality, in a way that allows it to be observed from almost continuously varying positions or angles in space, is an important challenge for wide adoption in applications. In addition, the ability to analyze scenes with reflections, gloss or transparency and to separate these different components lends itself to a variety of applications such as appearance transfer and augmented reality, and can also facilitate compression, depth estimation and view synthesis. Although reconstructing a scene at higher quality, on the one hand, and separating its specular, diffuse and transparent components, on the other hand, may look like two different problems, both are inverse problems that can be posed and solved in very similar ways. Our motivation here is therefore to develop algorithms in a deep learning framework for solving such inverse problems with emerging imaging modalities.
Objectives: The overall goal will be to develop learning algorithms for solving inverse problems in computer vision with novel imaging modalities. Inverse problems refer to a broad class of problems encountered in many scientific disciplines, from natural and life sciences to engineering. The inverse problems that we will target are denoising, deblurring, super-resolution, view synthesis, and blind source separation (e.g. separation of the specular and diffuse components of a scene).
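As a point of reference, and without committing to a specific algorithm, these problems share a common variational formulation in which an unknown scene x is recovered from degraded observations y:

```latex
% Generic formulation shared by the targeted inverse problems
% (denoising: A = identity; deblurring: A = blur; super-resolution:
%  A = blur followed by subsampling; source separation: A = mixing
%  of the diffuse and specular components).
\[
  y = A x + n, \qquad
  \hat{x} = \arg\min_{x} \tfrac{1}{2}\,\| A x - y \|_2^2 + \lambda\, R(x)
\]
```

Here A is the degradation operator, n the sensor noise, R(x) a regularizer encoding the prior on the scene, and λ a weight balancing data fidelity against the prior; the research questions below concern precisely how R can be learned rather than handcrafted.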
Research questions: The proposed research is at the intersection between computer vision, inverse problems and machine learning. The research questions that will be addressed are the following:
Designing deep learning architectures for learning priors from the test data (in particular for light fields and omni-directional images, taking into account the specific structures of these imaging modalities) in a fully unsupervised manner, to limit the amount of external training data usually required. We will also investigate how handcrafted priors such as sparse or low-rank models can help reduce the need for large amounts of training data.
Developing optimization algorithms, using these deep priors as regularizers, for solving inverse problems (super-resolution, denoising, specular/diffuse component separation) with novel imaging modalities.
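Purely as an illustration of how a learned prior can act as a regularizer in such an optimization, the following minimal sketch follows a generic plug-and-play proximal gradient scheme; the operator and denoiser names are hypothetical placeholders, not the method to be developed in the project.

```python
import torch

def pnp_proximal_gradient(y, A, A_t, denoiser, step=1.0, n_iters=50):
    """Plug-and-play proximal gradient sketch for observations y = A(x) + n.

    y        : observed data (e.g. a degraded light field), torch.Tensor
    A, A_t   : forward degradation operator and its adjoint (callables)
    denoiser : learned denoising network acting as an implicit prior
    """
    x = A_t(y)  # simple initialization by back-projection
    for _ in range(n_iters):
        # gradient step on the data-fidelity term 0.5 * ||A(x) - y||^2
        grad = A_t(A(x) - y)
        z = x - step * grad
        # the learned denoiser stands in for the proximal operator of the prior
        with torch.no_grad():
            x = denoiser(z)
    return x
```

Variants of this scheme differ mainly in how the prior is obtained (trained offline, or fitted to the test data itself as targeted above) and in the optimization algorithm wrapped around it.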
Planned secondments: Secondments to other network partners for up to 11 months have been planned. The secondments include
investigating different optimization algorithms when employing deep light field priors, and evaluating the developed algorithms in visual effects production at Foundry.
Double degree with Mittuniversitetet (Mid Sweden University).
ESR6: Sensing and reconstruction of plenoptic point clouds
Context: 3D point clouds have become the representation of choice for 3D computer graphics objects. This is largely due to their flexibility in representing both manifold and non-manifold geometry, and to their potential for real-time processing. The concept of surface light field (SLF) has in parallel been introduced as a function that assigns a color to each ray originating on a surface, in order to construct virtual images of shiny objects under complex lighting conditions. This concept aims at combining the best of light fields and computer graphics modeling, for photo-realistic rendering from arbitrary points of view. A related concept is the plenoptic point cloud (PPC), which is essentially a point cloud representation of the plenoptic function. More precisely, if the (x, y, z) points are defined directly on the surface of a 3D object, the plenoptic point cloud is equivalent to an SLF. In this case, the PPC can be seen as a natural extension of a 3D point cloud to an SLF, and point cloud codecs can be used to compress SLFs. The PPC and SLF representations combine the versatility and convenience of a point cloud with the richness of the visual information provided by a light field.
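To make the representation concrete, a minimal sketch of a plenoptic point is given below; the attribute layout (low-order spherical harmonics per point) is only an illustrative assumption, not a committed design.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class PlenopticPoint:
    """One sample of a plenoptic point cloud: a surface position whose color
    depends on the viewing direction, encoded here with 2nd-order real
    spherical harmonics (9 coefficients per RGB channel, normalization
    constants folded into the coefficients)."""
    position: np.ndarray   # (3,) point (x, y, z) on the object surface
    sh_coeffs: np.ndarray  # (3, 9) SH coefficients per color channel

    def color(self, view_dir: np.ndarray) -> np.ndarray:
        """Evaluate the RGB color seen from a (unit) viewing direction."""
        x, y, z = view_dir / np.linalg.norm(view_dir)
        basis = np.array([
            1.0,                      # l = 0
            y, z, x,                  # l = 1
            x * y, y * z,             # l = 2
            3.0 * z * z - 1.0,
            x * z, x * x - y * y,
        ])
        return self.sh_coeffs @ basis  # (3,) RGB value
```

Under such a layout, compressing an SLF with a point cloud codec amounts to coding the point positions together with their per-point view-dependence coefficients as attributes.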
Objectives: The overall goal will be to develop methods for plenoptic point cloud sensing. More precisely, the project will target the following objectives:
Design of scene analysis methods from light fields that allow separating specular from diffuse surfaces (see the reflection model sketched after this list).
Design of depth estimation methods suitable for both specular and diffuse scenes.
Design of methods for constructing plenoptic point clouds from light fields using the specular/diffuse separation and corresponding depth estimation methods.
Study surface light field representations based on explicit geometry for real-world scenes.
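As a point of reference for the separation objective (see the first item above), a standard dichromatic-style decomposition, used here only to illustrate the underlying assumption rather than as the model to be developed, splits the radiance observed at a surface point into a view-independent diffuse term and a view-dependent specular term:

```latex
% L_d depends only on the surface point p (diffuse, view-independent),
% while L_s also depends on the viewing direction \omega (specular).
\[
  L(p, \omega) = L_d(p) + L_s(p, \omega)
\]
```

In a light field, L_d is constant across the views observing the same surface point while L_s varies with the viewpoint; this view-(in)dependence is the cue that both the separation and the depth estimation methods can exploit.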
The proposed research is at the intersection between computer vision, signal processing and computer graphics. The project will address the following questions:
How to exploit full light field data to separate diffuse and specular scene components?
How to estimate depth in scenes with diverse surface materials and complex lighting conditions?
How to construct plenoptic point cloud and surface light field representations from real captures, in which the color of each point is a function of the viewing direction?
Planned secondments: Secondments to other network partners for up to 11 months have been planned. The secondments include
studying surface light field representations based on explicit geometry for real-world scenes, and testing the developed algorithms on a heavy mining equipment use case at Sandvik.
Double degree with Tampere University.
ESR7: Spherical light field representation and reconstruction from omni-directional imagery
Context: The concept of 6DOF video content has recently emerged with the goal of enabling an immersive experience in terms of free roaming, i.e. allowing the scene to be viewed from any viewpoint and direction in space. However, no real-life full 6DOF light field capturing solution exists so far. Omnidirectional cameras allow capturing a panoramic scene with a 360° field of view but do not record information on the orientation of the light rays emitted by the scene. Light field cameras, on the contrary, allow recording the orientations of light rays, hence sampling the plenoptic function in all directions, but with a limited field of view. The motivation here is to be able to capture or reconstruct light fields with a large field of view (a 360° angle of view) from one or several omni-directional images. Reconstructing the light field implies recovering light ray orientations, which are related to the depth of the 3D emitting point in the scene.
Objectives: The objective will be to create parallax in all 360°, by developing depth estimation algorithms from a monocular or from multiple omni-directional video captures, in order to reconstruct a light field with a 360° angle of view. This will eventually lead to a spherical light field, i.e. a representation of the plenoptic function as light rays on a sphere.
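As a minimal geometric sketch, assuming an ideal equirectangular projection (the camera model actually studied may differ), the following function illustrates how a 360° depth map allows reprojecting a pixel to a translated viewpoint, which is the elementary operation behind creating parallax on the sphere:

```python
import numpy as np

def reproject_equirect_pixel(u, v, depth, width, height, t):
    """Reproject one equirectangular pixel to a viewpoint translated by t.

    (u, v) : pixel coordinates in the source 360-degree image
    depth  : scene depth along the ray through (u, v)
    t      : (3,) translation of the virtual viewpoint
    Returns the (u', v') pixel coordinates in the translated view.
    """
    # pixel -> spherical angles (longitude theta, latitude phi)
    theta = (u / width) * 2.0 * np.pi - np.pi
    phi = np.pi / 2.0 - (v / height) * np.pi
    # unit ray direction and 3D point in the source camera frame
    ray = np.array([np.cos(phi) * np.sin(theta),
                    np.sin(phi),
                    np.cos(phi) * np.cos(theta)])
    point = depth * ray
    # express the point relative to the translated viewpoint and re-project
    p = point - np.asarray(t, dtype=float)
    r = np.linalg.norm(p)
    theta_new = np.arctan2(p[0], p[2])
    phi_new = np.arcsin(p[1] / r)
    u_new = (theta_new + np.pi) / (2.0 * np.pi) * width
    v_new = (np.pi / 2.0 - phi_new) / np.pi * height
    return u_new, v_new
```

Conversely, estimating the depth that makes such reprojections photo-consistent across the captured omni-directional views is at the core of the targeted depth estimation problem.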
The research questions are the following:
Setting up an omnidirectional capture process able to generate parallax while keeping the 360° field of view.
Estimating a 360° depth map from the parallax and from the specific spherical camera geometry.
Building virtual view synthesis algorithms to enable 6DOF navigation in the scene.
Planned secondments: Secondments to other network partners for up to 11 months have been planned. The secondments include
implementation of 360° depth estimation algorithms in spherical coordinates, and evaluation of the proposed spherical representation in practical scenarios at IMEC.
Double degree with Technical University Berlin.