Deep Unrolling for Light Field Compressed Acquisition using Coded Masks

Abstract

Compressed sensing using color-coded masks has been recently considered for capturing light fields using a small number of measurements. Such an acquisition scheme is very practical, since any consumer-level camera can be turned into a light field acquisition camera by simply adding a coded mask in front of the sensor. We present an efficient and mathematically grounded deep learning model to reconstruct a light field from a set of measurements obtained using a color-coded mask and a color filter array (CFA). Following the promising trend of unrolling optimization algorithms with learned priors, we formulate our task of light field reconstruction as an inverse problem and derive a principled deep network architecture from this formulation. We also introduce a closed-form extraction of information from the acquisition, while similar methods found in the recent literature systematically use an approximation. Compared to similar deep learning methods, we show that our approach allows for a better reconstruction quality. We further show that our approach is robust to noise using realistic simulations of the sensing acquisition process. In addition, we show that our framework allows for the optimization of the physical components of the acquisition device, namely the color distribution on the coded mask and the CFA pattern.

Overview of the method

Our method reconstructs light fields that are compressively acquired in a framework using a color-coded mask and a color filter array. The light field is first filtered by the color-coded mask, effectively performing a multiplexing in both the angular and the spectral domains, before being filtered by the color filter array, placed directly before the sensor. The filtered light field is subsequently recorded using a traditional monochromatic photosensor. This framework effectively realizes a linear projection of the light field onto a monochromatic image. By additionally allowing the sensor to perform a motion of translation, it is possible to record several shots of the light field, which is usually greatly beneficial to the signal reconstruction quality.

Our architecture then performs a reconstruction of the full light field using the monochromatic measurements. The architecture is a deep neural network designed by unrolling the half-quadratic splitting optimization algorithm. While traditional optimization algorithms generally require a very large number of iterations to converge, in the context of optimization unrolling, one usually perform a small number of iterations. The neural network obtained by this unrolling performs a reconstruction of the signal by successive refinement of a tentative reconstruction. In consists of an alternation of data-term minimization layers and learned proximal operator. The data-term minimization layer enforces the consistency of the intermediate reconstruction with the measures, while the proximal operator can be interpreted either as a denoiser, or as the projection of a signal onto the sub-manifold of "natural" light fields. The proximal operator is learned in an end-to-end framework. We design a data-term minimization layer that performs an efficient closed-form solving of a data-fidelity term.

Ground truth