Thomas Pock

In this talk I will highlight connections between recent deep neural networks and classical methods for solving inverse problems in computer vision and image processing. I will focus on variational methods and graphical models, which are known to be extremely flexible and also come with a deep theoretical understanding. It turns out that many iterative algorithms for solving variational and graphical models can be unrolled and hence interpreted as layers of a deep neural network. The structure provided by these methods helps to reduce the number of model parameters, so the resulting networks are less prone to overfitting. Moreover, the structure helps in interpreting the learned model parameters. I will show applications to stereo, motion estimation, and image reconstruction.

Energy minimization methods are certainly among the most successful and flexible mathematical frameworks for solving inverse problems in computer vision and image processing. The basic idea of these methods is to represent the solution of a problem as the minimizer of a cost functional, which is designed such that its minimizers provide physically meaningful solutions. If the image domain is continuous, energy minimization methods are usually referred to as variational methods. If the image domain is a graph, one usually speaks of Markov random fields or graphical models. In either case, energy minimization methods can be linked to statistical inference by interpreting their minimizers as maximum-a-posteriori (MAP) estimates of the posterior probability distribution.

Despite their great success in the last decades, energy minimization methods suffer from the drawback that the models are often too simple to capture the complex statistics of images. It is therefore natural to consider more complex models with many free parameters and to learn those parameters from data. Usually, variational and graphical models are solved by iterative methods such as gradient descent. For example, the most successful solvers for large graphical models are based on block descent on the dual problem. The idea is now to unroll the iterations of a suitable solver and to interpret each iteration of the algorithm as one layer of a deep neural network. Learning can then be achieved by standard backpropagation of the error (of a loss function) through the layers of the network.

Compared to black-box deep learning, this approach has several advantages. Firstly, the basic structure provided by the variational or graphical model allows one to interpret the learned model parameters (e.g. filters, edge weights, ...). Secondly, the number of free model parameters is usually much smaller than in vanilla deep learning. Thirdly, the deep understanding of variational and graphical models makes it possible to give theoretical guarantees for the learned models as well. We will show applications of variational models to the reconstruction of MR images from undersampled data and applications of graphical models to the estimation of stereo and motion.
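The unrolling idea described above can be illustrated with a minimal toy sketch. It is not the method of any of the referenced papers: it assumes a simple quadratic 1D denoising energy E(u) = 0.5*||u - f||^2 + 0.5*lam*||Du||^2, treats each gradient descent step as one layer with its own step size tau_t and weight lam_t, and learns these parameters by a hand-derived backward pass. All names (laplacian, forward, loss_and_grads), the signal, the noise level and the learning rate are illustrative assumptions, and the parameters are fitted to a single signal pair only.

```python
import math
import random

def laplacian(u):
    """Apply L = D^T D, where D takes forward differences of a 1D signal."""
    out = [0.0] * len(u)
    for i in range(len(u) - 1):
        d = u[i + 1] - u[i]
        out[i] -= d
        out[i + 1] += d
    return out

def forward(f, taus, lams):
    """Unrolled solver: layer t is one gradient step on
    E(u) = 0.5*||u - f||^2 + 0.5*lam_t*||D u||^2 with step size tau_t."""
    u = list(f)
    states = [u]  # keep every intermediate iterate for the backward pass
    for tau, lam in zip(taus, lams):
        Lu = laplacian(u)
        u = [ui - tau * ((ui - fi) + lam * li) for ui, fi, li in zip(u, f, Lu)]
        states.append(u)
    return states

def loss_and_grads(f, target, taus, lams):
    """Mean squared error of the last layer and its gradients w.r.t. the
    per-layer parameters, obtained by backpropagation through the layers."""
    n = len(f)
    states = forward(f, taus, lams)
    diff = [ui - ti for ui, ti in zip(states[-1], target)]
    loss = 0.5 * sum(x * x for x in diff) / n
    a = [x / n for x in diff]  # dLoss/du_T
    g_tau = [0.0] * len(taus)
    g_lam = [0.0] * len(lams)
    for t in reversed(range(len(taus))):
        u = states[t]
        Lu = laplacian(u)
        # u_{t+1} = u_t - tau_t * g_t  with  g_t = (u_t - f) + lam_t * L u_t
        g_tau[t] = -sum(ai * ((ui - fi) + lams[t] * li)
                        for ai, ui, fi, li in zip(a, u, f, Lu))
        g_lam[t] = -taus[t] * sum(ai * li for ai, li in zip(a, Lu))
        # Propagate the error to u_t; the layer Jacobian is symmetric here.
        La = laplacian(a)
        a = [ai - taus[t] * (ai + lams[t] * li) for ai, li in zip(a, La)]
    return loss, g_tau, g_lam

# Toy data (assumption): a sine wave corrupted by Gaussian noise.
random.seed(0)
n, T = 64, 5
target = [math.sin(2 * math.pi * i / n) for i in range(n)]
f = [t + random.gauss(0.0, 0.2) for t in target]

taus = [0.1] * T  # one step size per layer
lams = [1.0] * T  # one regularization weight per layer
loss_before, _, _ = loss_and_grads(f, target, taus, lams)

lr = 0.1  # learning rate for plain gradient descent on the parameters
for _ in range(200):
    _, g_tau, g_lam = loss_and_grads(f, target, taus, lams)
    taus = [p - lr * g for p, g in zip(taus, g_tau)]
    lams = [p - lr * g for p, g in zip(lams, g_lam)]

loss_after, _, _ = loss_and_grads(f, target, taus, lams)
print(f"loss before: {loss_before:.5f}, after: {loss_after:.5f}")
```

The network in this sketch has only 2T = 10 parameters, and each of them has a direct meaning in the underlying energy, which mirrors the interpretability and parameter-count arguments made above. In practice one would replace the fixed quadratic regularizer with learned filters and nonlinearities and rely on automatic differentiation rather than a hand-written backward pass.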

[1] Y. Chen and T. Pock. Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6):1256–1272, 2016.

[2] K. Hammernik, T. Klatzer, E. Kobler, M. Recht, D. Sodickson, T. Pock, and F. Knoll. Learning a variational network for reconstruction of accelerated MRI data. Magnetic Resonance in Medicine, 2017.

[3] P. Knöbelreiter, C. Reinbacher, A. Shekhovtsov, and T. Pock. End-to-end training of hybrid CNN-CRF models for stereo. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.

[4] C. Vogel and T. Pock. A primal dual network for low-level vision problems. In German Pattern Recognition Conference (GCPR), 2017.

[5] E. Kobler, T. Klatzer, K. Hammernik, and T. Pock. Variational networks: Connecting variational methods and deep learning. In German Pattern Recognition Conference (GCPR), 2017.