Technical Report PHD-2011-02

Title: Processing Image Sequences Without Motion Estimation
Authors: Matan Protter
Supervisor: Michael Elad
Abstract: Digital image restoration is a prominent field in signal processing, focusing on improving the quality of images suffering from various degradation effects such as noise and blur. Performing the restoration usually requires modelling the image content in order to separate the true image content from the degradation effects and restore the degradation-free content.

Restoration of image sequences can obtain better results compared to restoring each image individually, provided the temporal redundancy is adequately used. Most image sequence processing algorithms rely on estimating the motion between the frames in order to be able to merge the data from various frames.

However, in most sequences the motion patterns are very complex, and as a result motion estimation, a severely under-determined problem, tends to be error-prone and inaccurate. Thus, algorithms relying on motion estimation tend to reduce to single-image processing in areas of the image containing these complex motion patterns. Unfortunately, since most sequences exhibit predominantly complex motion patterns, algorithms that rely on motion estimation cannot fully exploit the benefits of having multiple frames of the same scene.

In the last several years there has been a trend of circumventing motion estimation in the denoising of image sequences. Such a feat has been made possible by the emergence of powerful image models, extended to model image sequences as well. We propose a contribution along these lines, extending the model of sparse and redundant representations to the denoising of image sequences. We show that state-of-the-art results are obtained with this method, indeed proving that motion estimation can be avoided for denoising of image sequences.

Another restoration field relying on motion estimation is super-resolution, in which several images of the same scene are merged into a high-quality image (or sequence) of the scene. Each image offers a different sampling of the scene (assuming the sequence indeed contains motion), and motion estimation is used to merge the images. Even more than in denoising, this procedure requires very high accuracy, which is not possible in the majority of sequences.

Relying on the intuition gained from denoising sequences while avoiding explicit motion estimation, we offer two different approaches to achieve the same feat in super-resolution. This is done by relying on crude, probabilistic motion estimation to replace the explicit one. We show that this alternative path is indeed able to successfully handle sequences previously considered outside the realm of super-resolution due to their complicated motion patterns.
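
The idea of replacing a single motion vector per pixel with a distribution over candidate displacements can be sketched as follows. This is only a minimal illustration under stated assumptions, not the method of the report: the function name displacement_weights, the patch-similarity weighting with a Gaussian kernel, and the parameters patch, search and h are all hypothetical choices made for this example.

```python
import numpy as np

def displacement_weights(ref, other, y, x, patch=3, search=2, h=0.1):
    """Weights over candidate displacements for pixel (y, x) of `ref`.

    Hypothetical helper for illustration. weights[dy + search, dx + search]
    expresses how plausible it is that the content at (y, x) in `ref`
    appears at (y + dy, x + dx) in `other`, judged by patch similarity.
    """
    r = patch // 2
    pad = r + search
    # Reflect-pad so that every candidate patch lies inside the array.
    ref_p = np.pad(ref, pad, mode="reflect")
    oth_p = np.pad(other, pad, mode="reflect")
    cy, cx = y + pad, x + pad
    ref_patch = ref_p[cy - r:cy + r + 1, cx - r:cx + r + 1]

    w = np.empty((2 * search + 1, 2 * search + 1))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = oth_p[cy + dy - r:cy + dy + r + 1,
                         cx + dx - r:cx + dx + r + 1]
            d2 = np.mean((ref_patch - cand) ** 2)   # patch dissimilarity
            w[dy + search, dx + search] = np.exp(-d2 / (2 * h ** 2))
    return w / w.sum()                              # normalize to sum to 1
```

Instead of committing to the single best displacement, a fusion step could average the candidate pixels of the other frame using these weights, so that ambiguous regions contribute several soft matches rather than one possibly wrong match.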

We conclude by revisiting the denoising problem, focusing on signals that obey the sparse and redundant representation model. It has been shown that averaging several sparse representations achieves better denoising than the sparsest representation alone. This has been explained by relating these two solutions to approximations of the MMSE and MAP estimators, respectively. In general, neither the MAP nor the MMSE estimator can be computed directly. We show that in the special case where the dictionary is unitary, both estimators enjoy a closed-form formula, with the MMSE outperforming the MAP in this case as well.
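
To give a feel for why a unitary dictionary permits closed-form estimators, the sketch below assumes a simple Bernoulli-Gaussian prior, which is an assumption of this example and not necessarily the model used in the report: each coefficient is nonzero with probability p and then drawn from N(0, sigma_x^2), and the noise is white Gaussian with standard deviation sigma. Under that assumption, the coefficients decouple after multiplying by the transpose of the dictionary, and both estimators reduce to element-wise shrinkage. The function name shrinkage_estimates and all parameter names are hypothetical.

```python
import numpy as np

def shrinkage_estimates(y, D, p, sigma_x, sigma):
    """Element-wise MAP-support and MMSE estimates for a real unitary D.

    Assumed model (for illustration only): y = D @ alpha + noise, where
    each alpha_i is nonzero with probability p and then N(0, sigma_x^2),
    and the noise is i.i.d. N(0, sigma^2).
    """
    beta = D.T @ y                                  # unitary: D^T inverts D
    c = sigma_x ** 2 / (sigma_x ** 2 + sigma ** 2)  # gain for active atoms
    # Log posterior odds of "atom active" versus "atom inactive".
    log_odds = (np.log(p / (1 - p))
                + 0.5 * np.log(1 - c)
                + 0.5 * c * beta ** 2 / sigma ** 2)
    q = 1.0 / (1.0 + np.exp(-log_odds))             # P(active | beta_i)

    alpha_mmse = q * c * beta                       # smooth shrinkage curve
    alpha_map = np.where(log_odds > 0, c * beta, 0.0)  # keep-or-kill rule
    return D @ alpha_map, D @ alpha_mmse
```

In this sketch the MAP-style estimate first decides which atoms are active and then scales the surviving coefficients, while the MMSE estimate weights each coefficient by its posterior probability of being active. The latter is the posterior mean under the assumed prior, so it attains the lowest expected squared error of the two.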

Copyright: The above paper is copyrighted by the Technion, the author(s), or others. Please contact the author(s) for more information.

Remark: Any link to this technical report should be to this page, rather than to the URL of the PDF files directly. The latter URLs may change without notice.

Computer science department, Technion