2012
The Analysis Sparse Model - Definition, Pursuit, Dictionary Learning, and Beyond
The synthesis-based sparse representation model for signals has drawn a considerable interest in the past decade. Such a model assumes that the signal of interest can be decomposed as a linear combination of a few atoms from a given dictionary. In this talk we concentrate on an alternative, analysis-based model, where an analysis operator -- hereafter referred to as the "Analysis Dictionary" - multiplies the signal, leading to a sparse outcome. While the two alternative models seem to be very close and similar, they are in fact very different. In this talk we define clearly the analysis model and describe how to generate signals from it. We discuss the pursuit denoising problem that seeks the zeros of the signal with respect to the analysis dictionary given noisy measurements. Finally, we explore ideas for learning the analysis dictionary from a set of signal examples. We demonstrate this model's effectiveness in several experiments, treating synthetic data and real images, showing a successful and meaningful recovery of the analysis dictionary.
2011
From SD to HD: Improving Video Sequences Through Super-Resolution
Multi-channel TV broadcast, Internet video and You-Tube, home DVD movies, video conference calls, cellular video calls and more - there is no doubt that videos are abundant and in everyday use. In many cases, the quality of the available video is poor, something commonly referred to as "low-resolution". As an example, High-definition (HD) TV's are commonly sold these days to customers that hope to enjoy a better viewing experience. Nevertheless, most TV broadcast today is still done in standard-definition (SD), leading to poor image quality on these screens. The field of Super-Resolution deals with ways to improve video content to increase optical resolution. The core idea: fusion of the visual content in several images can be performed and this can lead to a better resolution outcome. For years it has been assumed that such fusion requires knowing the exact motion the objects undergo within the scene. Since this motion may be quite complex in general, this stood as a major obstacle for industrial applications. Three years ago a break-through has been made in this field, allowing to bypass the need for exact motion estimation. In this lecture we shall survey the work in this field from its early days (25 years ago) and till very recently, and show the evolution of ideas and results obtained. No prior knowledge in image processing is required.
Sequential Minimal Eigenvalues - An Approach to Analysis Dictionary Learning
Over the past decade there has been a great interest in a synthesis-based model for signals, based on sparse and redundant representations. Such a model assumes that the signal of interest can be decomposed as a linear combination of few columns from a given matrix (the dictionary). An alternative, analysis-based, model can be envisioned, where an analysis operator multiplies the signal, leading to a sparse outcome. In this work we propose a simple but effective analysis operator learning algorithm, where analysis “atoms” are learned sequentially by identifying directions that are orthogonal to a subset of the training data. We demonstrate the effectiveness of the algorithm in several experiments, treating synthetic data and real images, showing a successful and meaningful recovery of the analysis operator.
K-SVD Dictionary-Learning for Analysis Sparse Models
The synthesis-based sparse representation model for signals has drawn a considerable interest in the past decade. Such a model assumes that the signal of interest can be decomposed as a linear combination of a {\em few} atoms from a given dictionary. In this talk we concentrate on an alternative, analysis-based model, where an analysis operator -- hereafter referred to as the "Analysis Dictionary" - multiplies the signal, leading to a sparse outcome. Our goal is to learn the analysis dictionary from a set of signal examples, and the approach taken is parallel and similar to the one adopted by the K-SVD algorithm that serves the corresponding problem in the synthesis model. We present the development of the algorithm steps, which include a tailored pursuit algorithm termed "Backward Greedy" algorithm and a penalty function for the dictionary update stage. We demonstrate its effectiveness in several experiments, treating synthetic data and real images, showing a successful and meaningful recovery of the analysis dictionary.
Exploiting Statistical Dependencies in Sparse Representations
In the commonly used sparse representation modeling, the atoms are assumed to be independent of each other when forming the signal. In this talk we shall introduce a statistical model called Boltzman Machine (BM) that enables such dependencies to be taken into account. Adopting a Bayesian point of view, we first treat the pursuit problem - given a signal, and assuming that the model parameters and the dictionary are known, find its sparse representation. We derive the exact MAP estimation, and show that just like in the independent case, this leads to an exponential search problem. We derive two algorithms for its evaluation: a greedy approximation approach for the general case, and an exact estimation that corresponds to a unitary dictionary and banded interaction matrix. We also consider the estimation of the model parameters, learning these parameters directly from training data. We show that given the signals' representations, this problem can be posed as a convex optimization task by using the Maximum Pseudo-Likelihood (MPL).
2010
A Course on Sparse and Redundant Representation Modeling of Images - Iceland Summer-School
This course (4 lectures and one tutorial) brings the core ideas and achievements made in the field of sparse and redundant representation modeling, with emphasis on the impact of this field to image processing applications. The five lectures (given as PPTX and PDF) are organized as follows:
Lecture 1: The core sparse approximation problem and pursuit algorithms that aim to approximate its solution.
Lecture 2: The theory on the uniqueness of the sparsest solution of a linear system, the notion of stability for the noisy case, guarantees for the performance of pursuit algorithms using the mutual coherence and the RIP.
Lecture 3: Signal (and image) models and their importance, the Sparseland model and its use, analysis versus synthesis modeling, a Bayesian estimation point of view, dictionary learning with the MOD and the K-SVD, global and local image denoising, local image inpainting.
Lecture 4: Sparse representations in image processing - image deblurring, global image separation and image inpainting. using dictionary learning for image and video denoising and inpainting, image scale-up using a pair of learned dictionaries, facial image compression with the K-SVD.
A Course on Sparse and Redundant Representation Modeling of Images - PCMI Summer-School
This course (5 lectures) brings the core ideas and achievements made in the field of sparse and redundant representation modeling, with emphasis on the impact of this field to image processing applications. The five lectures (given as PPTX and PDF) are organized as follows:
Lecture 1: The core sparse approximation problem and pursuit algorithms that aim to approximate its solution.
Lecture 2: The theory on the uniqueness of the sparsest solution of a linear system, the notion of stability for the noisy case, guarantees for the performance of pursuit algorithms using the mutual coherence and the RIP.
Lecture 3: Signal (and image) models and their importance, the Sparseland model and its use, analysis versus synthesis modeling, a Bayesian estimation point of view.
Lecture 4: First steps in image processing with the Sparseland model - image deblurring, image denoising, image separation, and image inpainting. Global versus local processing of images. Dictionary learnong with the MOD and the K-SVD.
Lecture 5: Advanced image processing: Using dicitonary learning for image and video denoising and inpainting, image scale-up using a pair of learned dictionaries, Facial image compression with the K-SVD.
Sparse and Redundant Representation Modeling of Images: Theory and Applications
This survey talk focuses on the use of sparse and redundant representations and learned dictionaries for image denoising and other related problems. We discuss the the K-SVD algorithm for learning a dictionary that describes the image content efficiently. We then show how to harness this algorithm for image denoising, by working on small patches and forcing sparsity over the trained dictionary. The above is extended to color image denoising and inpainting, video denoising, and facial image compression, leading in all these cases to state of the art results. We conclude with more recent results on the use of several sparse representations for getting better denoising performance. An algorithm to generate such set of representations is developed, and our analysis shows that by this we approximate the minimum-mean-squared-error (MMSE) estimator, thus getting better results.
Topics in Minimum-Mean-Squared-Error (MMSE) Estimation in Sparse Approximation
Among the many ways to model signals, a recent approach that draws considerable attention is sparse representation modeling. In this model, the signal is assumed to be generated as a random linear combination of a few atoms from a pre-specified dictionary. In this talk we describe our recent work that analyzed two Bayesian denoising algorithms -- the Maximum-Aposteriori Probability (MAP) and the Minimum-Mean-Squared-Error (MMSE) estimators, under the assumption that the dictionary is unitary. It is well known that both these estimators lead to a scalar shrinkage on the transformed coefficients, albeit with a different response curve. In this talk we start by deriving closed-form expressions for these shrinkage curves and then analyze their performance. Upper bounds on the MAP and the MMSE estimation errors are derived. We tie these to the error obtained by a so-called oracle estimator, where the support is given, establishing a worst-case gain-factor between the MAP/MMSE estimation errors and the oracle's performance.
Single Image Super-Resolution Using Sparse Representation
Scaling up a single image while preserving is sharpness and visual-quality is a difficult and highly ill-posed inverse problem. A series of algorithms have been proposed over the years for its solution, with varying degrees of success. In CVPR 2008, Yang, Wright, Huang and Ma proposed a solution to this problem based on sparse representation modeling and dictionary learning. In this talk I present a variant of their method with several important differences. In particular, the proposed algorithm does not need a separate training phase, as the dictionaries are learned directly from the image to be scaled-up. Furthermore, the high-resolution dictionary is learned differently, by forcing its alignment with the low-resolution one. We show the benefit these modifications bring in terms of simplicity of the overall algorithm, and its output quality.
ההרכב האטומי של תמונות - גישות חדשניות בעיבוד תמונות
פירוק אטומי, הטבלה המחזורית, הרכבה של מולקולה ... כל זה נשמע כמו תחילתה של הרצאה בכימיה. אבל לא! נושאים אלו יעלו בהרצאה שתדון בעיבוד תמונות. עיבוד תמונות הינו תחום מרכזי בחיינו - מהטלוויזיה בבתינו, המצלמה הדיגיטאלית שבכיסנו (ולאחרונה כחלק מהטלפון הסלולרי), דרך צפייה בסרטי די-וי-די, בקרת איכות בפסי ייצור, מערכות עקיבה ואבטחה, ועד צילומי אולטראסאונד, טומוגרפיה ותהודה מגנטית בבתי-חולים. בכל אלה ובעוד מוצרים רבים, עיבוד תמונות מהווה טכנולוגיה שאי-אפשר בלעדיה. תחום זה תוסס ופורח הן בתעשיה והן באקדמיה, עם עשרות אלפי מהנדסים ומדענים בכל רחבי העולם העוסקים בו יום יום ושעה שעה. אז מה זה עיבוד תמונות? זהו הנושא שבו נדון בהרצאה זו. עיבוד תמונות מתייחס לטיפול בתמונות ע"י מחשב. אנו נעסוק בשאלות כגון כיצד תמונה עושה את דרכה אל המחשב, כיצד היא אגורה שם, מה ניתן לעשות בה משכבר היא שם, ועוד. אחת המטרות המרכזיות בהרצאה זו היא הצגת חזית הידע בתחום, ובפרט העשייה המחקרית בטכניון בזירה זו. הדיון הכללי הנ"ל על עיבוד תמונות ישמש אותנו כמצע עליו נבנה כדי להציג את עבודתנו מהעת האחרונה, בה אנו עוסקים במודלים לתמונות המשתמשים ברעיונות "כימיקליים", בעזרתם אנו מטפלים במידע בכלל ובתמונות בפרט. אנו נתאר כיצד ניתן לפרק תמונה לאטומים, לבנות טבלה מחזורית של יסודות לתיאור תמונות, וכיצד אנו רותמים את כל אלה כדי לפתור בעיות מעשיות בתחום, כגון שיפור תמונות וסרטים ותיקונם מקלקולים שונים, השלמת חלקים חסרים בתמונות, דחיסה, ועוד
2009
MMSE Estimation for Sparse Representation Modeling
Cleaning of noise from signals is a classical and long-studied problem in signal processing. Algorithms for this task necessarily rely on an a-priori knowledge about the signal characteristics, along with information about the noise properties. For signals that admit sparse representations over a known dictionary, a commonly used denoising technique is to seek the sparsest representation that synthesizes a signal close enough to the corrupted one. As this problem is too complex in general, approximation methods, such as greedy pursuit algorithms, are often employed. In this line of reasoning, we are led to believe that detection of the sparsest representation is key in the success of the denoising goal. Does this means that other competitive and slightly inferior sparse representations are meaningless? Suppose we are served with a group of competing sparse representations, each claiming to explain the signal differently. Can those be fused somehow to lead to a better result? Surprisingly, the answer to this question is positive; merging these representations can form a more accurate, yet dense, estimate of the original signal even when the latter is known to be sparse. In this talk we demonstrate this behavior, propose a practical way to generate such a collection of representations by randomizing the Orthogonal Matching Pursuit (OMP) algorithm, and produce a clear analytical justification for the superiority of the associated Randomized OMP (RandOMP) algorithm. We show that while the Maximum a-posterior Probability (MAP) estimator aims to find and use the sparsest representation, the Minimum Mean-Squared-Error (MMSE) estimator leads to a fusion of representations to form its result. Thus, working with an appropriate mixture of candidate representations, we are surpassing the MAP and tending towards the MMSE estimate, and thereby getting a far more accurate estimation, especially at medium and low SNR. Another topic covered in thistalk concerns the case of a unitary dictionary. In such a case it is well-known that the MAP estimators has a closed-form and exact solution, and OMP is accurately computing it. Can a similar result be derived for MMSE? We show that this is indeed possible, obtaining a recursive formula that computes the MMSE simply and exactly.
2008
Sparse and Redundant Representation Modeling for Image Processing
In this talk we descibe applications such as image denoising and beyond using sparse and redundant representations. Our focus is on ways to perform these tasks with trained dictonaries using the K-SVD algorithm. As trained dictionaries are limited in handling small image patches, we deploy these within a Bayesian reconstruction procedure by forming an image prior that forces every patch in the resulting image to have a sparse representation.
Super-Resolution-Reconstruction of Image Sequences Without Explicit Motion Estimation
Super-resolution reconstruction proposes a fusion of several low quality images into one higher quality result with better optical resolution. Classic super resolution techniques strongly rely on the availability of accurate motion estimation for this fusion task. When the motion is estimated inaccurately, as often happens for non-global motion fields, annoying artifacts appear in the super-resolved outcome. Encouraged by recent developments on the video denoising problem, where state-of-the-art algorithms are formed with no explicit motion estimation, we seek a super-resolution algorithm of similar nature that will allow processing sequences with general motion patterns. In this talk we base our solution on the Non-Local-Means (NLM) algorithm. We show how this denoising method is generalized to become a relatively simple super-resolution algorithm with no explicit motion estimation. Results on several test movies show that the proposed method is very successful in proproviding super-resolution on general sequences.
A Weighted Average of Sparse Several Representations is Better than the Sparsest One Alone
Cleaning of noise from signals is a classical and long-studied problem in signal processing. Algorithms for this task necessarily rely on an a-priori knowledge about the signal characteristics, along with information about the noise properties. For signals that admit sparse representations over a known dictionary, a commonly used denoising technique is to seek the sparsest representation that synthesizes a signal close enough to the corrupted one. As this problem is too complex in general, approximation methods, such as greedy pursuit algorithms, are often employed. In this line of reasoning, we are led to believe that detection of the sparsest representation is key in the success of the denoising goal. Does this means that other competitive and slightly inferior sparse representations are meaningless? Suppose we are served with a group of competing sparse representations, each claiming to explain the signal differently. Can those be fused somehow to lead to a better result? Surprisingly, the answer to this question is positive; merging these representations can form a more accurate, yet dense, estimate of the original signal even when the latter is known to be sparse. In this talk we demonstrate this behavior, propose a practical way to generate such a collection of representations by randomizing the Orthogonal Matching Pursuit (OMP) algorithm, and produce a clear analytical justification for the superiority of the associated Randomized OMP (RandOMP) algorithm. We show that while the Maximum a-posterior Probability (MAP) estimator aims to nd and use the sparsest representation, the Minimum Mean-Squared-Error (MMSE) estimator leads to a fusion of representations to form its result. Thus, working with an appropriate mixture of candidate representations, we are surpassing the MAP and tending towards the MMSE estimate, and thereby getting a far more accurate estimation, especially at medium and low SNR.
Image Denoising and Beyond via Learned Dictionaries and Sparse Representations
In this survey talk we focus on the use of sparse and redundant representations and learned dictionaries for image denoising and other related problems. We discuss the the K-SVD algorithm for learning a dictionary that describes the image content effectively. We then show how to harness this algorithm for image denoising, by working on small patches and forcing sparsity over the trained dictionary. The aboev is extended to color image denoising and inpainitng, video denosiing, and facial image compression, leading in all these cases to state of the art results. We conclude with very recent results on the use of several sparse representations for getting better denoising performance. An algorithm to generate such set of representations is developed, and our analysis shows that by this method we approximate the minimum-mean-squared-error (MMSE) estimator, thus getting better results.
A Sparse and Non-Negative Solution of Ax=b is Necessarily Unique
An underdetermined linear system of equations Ax = b with non-negativity constraint is considered. It is shown that for matrices A with a row-span intersecting the positive orthant, if this problem admits a sufficiently sparse solution, it is necessarily unique. The bound on the required sparsity depends on a coherence property of the matrix A. It is further shown that A undergoes a conditioning stage with some degrees of freedom, which may be used to improve the coherence measure and strengthen the claimed result.
2007
A Wide-Angle View of Iterated Shrinkage Algorithms For Finding Sparse Representations
Modeling of signals or images by a sparse and redundant representation is shown in recent years to be very effective, often leading to stat-of-the-art results in many applications. Applications leaning on this model can be cast as energy minimization problems, where the unknown is a high-dimensional and very sparse vector. Surprisingly, traditional tools in optimization, including very recently developed interior-point algorithms, tend to perform very poorly on these problems. A recently emerging alternative is a family of techniques, known as "iterated-shrinkage" methods. There are various and different such algorithms, but common to them all is the fact that each of their iterations require a simple forward and inverse transform (e.g. wavelet), and a scalar shrinkage look-up-table (LUT) step. In this talk we shall explain the need for such algorithms, present some of them, and show how they perform on a classic image deblurring problem.
Optimized Projection Directions for Compressed Sensing
Compressed-Sensing (CS) offers a joint compression- and sensing-processes, based on the existence of a sparse representation of the treated signal and a set of projected measurements. Work on CS thus far typically assumes that the projections are drawn at random. In this talk we describe ways to optimize these projections. Two methods are considered: (i) A direct optimization with respect to the CS performance, targeting CS applied with the Orthogonal Matching Pursuit (OMP) or Basis Pursuit (BP), and (ii) A simpler method that targets an average measure of the mutual-coherence of the effective dictionary. As the first approach leads to a complex bi-level optimization task that is hard to handle, we demonstrate the second one and show that it leads to better CS reconstruction performance.
New Results in Image Processing Based on Sparse & Redundant Representations
In this very brief talk I describe the need to model images in general, and then briefly present the Sparse-Land model. The talk includes a demonstration of a sequence of applications in image processing where this model has been deployed successfully, including denoising of still, color and video images, inpainting, and compression. The moral to take home is:
"The Sparse-Land model is a new and promising model that can adapt to many types of data sources. Its potential for medical imaging is an important opportunity that shuold be explored".
2006
Denoising and Beyond via Learned Dictionaires and Sparse Representations
In this talk we consider several inverse problems in image processing, using sparse and redundant representations over trained dictionaries. Using the K-SVD algorithm, we obtain a dictionary that describes the image content effectively. Two training options are considered: using the corrupted image itself, or training on a corpus of high-quality image database. Since the K-SVD is limited in handling small image patches, we extend its deployment to arbitrary image sizes by defining a global image prior that forces sparsity over patches in every location in the image. We show how such Bayesian treatment leads to a simple and effective denoising algorithm for gray-level images with state-of-the-art denoising performance. We then extend these results to color images, handling their denosing, inpainting, and demosaicing. Following the above ideas, with necessary modifications to avoid color artifacts and over-fitting, we present stat-of-the art results in each of these applications. Another extension considered is video denoising -- we demonstrate how the above method can be extened to work with 3D patches, propagate the dictionary from one frame to another, and get both improved denoising performance while also reducing substancially the computational load per pixel.
Super-resolution reconstruction of images - An overview
The super-resolution reconstruction problem addresses the fusion of several low quality images into one higher-resolution outcome. A typical scenario for such a process could be the fusion of several video fields into a higher resolution output that can lead to high quality printout. The super-resolution result provides TRUE resolution, as opposed to the typically used interpolation techniques. The core idea behind this ability is the fact that higher-frequencies exist in the measurements, although in an aliased form, and those can be recovered due to the motion between the frames. Ever since the pioneering work by Tsai and Huang (1984), who demonstrated the core ability to get super-resolution, much work has been devoted by various research groups to this problem and ways to solve it. In this talk I intend to present the core ideas behind the super-resolution (SR) problem, and our very recent results in this field. Starting form the problem modeling, and posing the super-resolution task as a general inverse problem interpretation, we shall see how the SR problem can be addressed effectively using ML and later MAP estimation methods. This talk also show various ingredients that are added to the reconstruction process to make it robust and efficient. Many results will accompany these descriptions, so as to show the strengths of the methods.
Sparse & Redundant Signal Representation, and its Role in Image Processing
In signal and image processing, we often use transforms in order to simplify operations or to enable better treatment to the given data. A recent trend in these fields is the use of overcomplete linear transforms that lead to a sparse description of signals. This new breed of methods is more difficult to use, often requiring more computations. Still, they are much more effective in applications such as signal compression and inverse problems. In fact, much of the success attributed to the wavelet transform in recent years, is directly related to the above-mentioned trend. In this talk we will present a survey of this recent path of research, and its main results. We will discuss both the theoretic and the applicative sides to this field. No previous knowledge is assumed (... just common sense, and little bit of linear algebra).
Image Denoising via Learned Dictionaries and Sparse Representations
We address the image denoising problem, where zero mean white and homogeneous Gaussian additive noise should be removed from a given image. The approach taken is based on sparse and redundant representations over a trained dictionary. The proposed algorithm denoises the image, while simultaneously trainining a dictionary on its (corrupted) content using the K-SVD algorithm. As the dictionary training algorithm is limited in handling small image patches, we extend its deployment to arbitrary image sizes by defining a global image prior that forces sparsity over patches in every location in the image. We show how such Bayesian treatment leads to a simple and effective denoising algorithm, with state-of-the-art performance, equivalent and sometimes surpassing recently published leading alternative denoising methods.
Iterated Shrinkage Algorithm for Basis Pursuit Minimization
Recently, three independent works suggested
iterated shrinkage algorithms that generalizes the classic
work by Donoho and Johnston. Daubechies, Defrise and De-Mol
developed such an algorithm for deblurring, using the
concept of surrogate functions.
Figueirido and Nowak used the EM algorithm to construct such
algorithm, and later extended their work by using the
bound-optimization technique. Elad developed such an
algorithm based on a parallel coordinate descent (PCD) point
of view. In this talk we describe these methods with an
emphasis on the later, and demonstrate how it can be of
importance for the minimization of a general basis pursuit
penalty function. As such, the proposed algorithms form a
new pursuit technique, falling in between the basis pursuit
and the matching pursuit.
Image Decomposition and Inpainting by Sparse Representations
In this talk we present a novel method for separating images into texture and piecewise smooth parts, and show how this formulation can also lead to image inpainting. Our separation and inpainting processes are based on sparse and redundant representations of the two contents - cartoon and texture - over different dictionaries. Using the Basis Pursuit Denoising (BPDN) to formulate the overall penalty function, we achieve a separation of the image, denoising, and inpainting. In fact, with a small modification, the damn thing can make coffee.
2005
Shrinkage for Redundant Representations?
Shrinkage is a well known and appealing denoising technique. The
use of shrinkage is known to be optimal for Gaussian white noise,
provided that the sparsity on the signal's representation is
enforced using a unitary transform. Still, shrinkage is also
practiced successfully with non-unitary, and even redundant
representations. In this lecture we shed some light on this
behavior. We show that simple shrinkage could be interpreted as
the first iteration of an algorithm that solves the basis pursuit
denoising (BPDN) problem. Thus, this work leads to a sequential
shrinkage algorithm that can be considered as a novel and
effective pursuit method. We demonstrate this algorithm, both
synthetically, and for the image denoising problem, showing in
both cases its superiority over several popular alternatives..
Example-Based Priors for Inverse Problems in Image Processing
In our field, when composing a
super-resolved image, two ingredients contribute to the
ability to get a leap in resolution:
(i) the existence of many and diverse measurements, and
(ii) the availability of a model to reliably describe the
image to be produced.
This second part, also known as the regularization or the
prior, is of generic importance, and could be deployed to
any inverse problem, and used by many other applications
(compression, image synthesis, and more). The well-known
recent work by Baker and Kanade ('02) and the work that
followed (Lin and Shum '04, Robinson and Milanfar '05) all
suggest that while the measurements are limited in gaining a
resolution increase, the prior could be used to break this
barrier. Clearly, the better the prior used, the higher the
quality we can expect from the overall reconstruction
procedure. Indeed, recent work on super-resolution (and
other inverse problems) departs from the regular Tikhonov
method, and tends to the robust counterparts, such as TV or
the bilateral prior (see Farsiu et. al. '04).
A recent trend with a growing popularity is the use of
examples in defining the prior. Indeed, Baker and Kanade
were the first to introduce this notion to the
super-resolution task. There are several ways to use
examples in shaping the prior to become better. The work by
Mumford and Zhu ('99) and the follow-up contribution by
Haber and Tenorio (02') suggest a parametric approach. Baker
and Kanade ('02), Freeman et. al. (several contributions
'01), Nakagaki and Katzaggelos ('03) all use the examples
to directly learn the reconstruction function, by observing
low-res. versus high-res. pairs.
In this talk we survey this line of work and show how it can
be extended in several important ways. We show a general
framework that builds an example-based prior that is
independent of the inverse problem at hand, and we
demonstrate it on several such problems, with promising
results.
The K-SVD: Design of Dictionaries for Sparse and Redundant Representation of Signals
In recent years there is a growing interest in the study of sparse representation for signals. Using an overcomplete dictionary that contains prototype signal-atoms, signals are described as sparse linear combinations of these atoms. Recent activity in this field concentrated mainly on the study of pursuit algorithms that decompose signals with respect to a given dictionary. Designing dictionaries to better fit the above model can be done by either selecting pre-specified transforms, or by adapting the dictionary to a set of training signals. Both these techniques have been considered in recent years, however this topic is largely still open. In this preentation we address the latter problem of designing dictionaries, and introduce the K-SVD algorithm for this task. We show how this algorithm could be interpreted as a generalization of the K-Means clustering process, and demonstrate its behavior in both synthetic tests and in applications on real data. The accompanying paper also describes its generalization to nonnegative matrix factorization problem that suits signals generated under an additive model with positive atoms.
Sparsity and Overcomplete Data Representation
A recent trend in signal, image, and data analysis is the use of overcomplete linear transforms that lead to a sparse description of the processed data. This new breed of methods is more difficult to use, but they are much more effective in applications such as data compression and solution of inverse problems. In this talk we present a wide angle view to this recent path of research.
Retinex By Two Bilateral Filters
Retinex theory deals with the removal of unfavorable illumination effects from images. This ill-posed inverse problem is typically regularized by forcing spatial smoothness on the recoverable illumination. Recent work in this field suggested exploiting the knowledge that the illumination image bounds the image from above, and the fact that the reflectance is also expected to be smooth.
In this lecture we show how the above model can be improved to provide a non-iterative retinex algorithm that handles better edges in the illumination, and suppresses noise in dark areas. This algorithm uses two specially tailored bilateral filters -- the first evaluates the illumination and the other is used for the computation of the reflectance. This result stands as a theoretic justification and refinement for the recently proposed heuristic use of the bilateral filter for retinex by Durand and Dorsey. In line with their appealing way of speeding up the bilateral filter, we show that similar speedup methods apply to our algorithm.
Recent Trends in Signal Representations and their Role in Image Processing
In signal and image processing, we often use transforms in order to simplify operations or to enable better treatment to the given data. A recent trend in these fields is the use of overcomplete linear transforms that lead to a sparse description of signals. This new breed of methods is more difficult to use, often requiring more computations. Still, they are much more effective in applications such as signal compression and inverse problems. In fact, much of the success attributed to the wavelet transform in recent years, is directly related to the above-mentioned trend.
In this talk we will present a survey of this recent path of research, its main results, and the involved players and their contributions. We will discuss both the theoretic and the applicative sides to this field. No previous knowledge is assumed.
2004
Sparse Representations of Signals - Theory and Applications
Transforming signals is typically done in order to simplify their representations. Among the many ways to do so, the use of linear combinations taken over a redundant dictionary is appealing due to both its simplicity and its diversity. Choosing the sparsest of all solutions aligns well with our desire for a simple signal description, and this also leads to uniqueness. Since the search for the sparsest representation is NP-hard, methods such as the Basis-Pursuit (BP) and the Matching Pursuit (MP) have been proposed in the mid 90's to approximate the desired sparse solution.
The pioneering work by Donoho and Huo ('99) started a sequence of research efforts, all aiming to theoretically understand the quality of approximations obtained by the pursuit algorithms, and the limits to their success. A careful study established that both BP and MP algorithms are expected to lead to the sparsest of all representations if indeed such solution is sparse enough. Later work generalized these results to the case where error is allowed in the representation. Very recent results addressed the same analysis from a probabilistic point of view, finding bounds on the average performance, and showing a close resemblance to empirical evidence.
All these results lead to the ability to use the pursuit algorithms with clear understanding of their expected behavior, in what Stanley Osher would have called "emotionally uninvolved" manner. This paves the way for future transforms that will be based on (i) overcomplete (redundant) representations, (ii) linear in constructing signals, and non-linear in their decomposition, and (iii) sparsity as their core force. Furthermore, as signal transforms, signal compression, and inverse problems, are all tangled together, we are now armed with new and effective tools when addressing many problems in signal and image processing.
In this talk we present a survey of this recent path of research, its main results, and the involved players and their contributions. We will discuss both the theoretic and the applicative sides to this field. No previous knowledge is assumed.
The several work pieces presented are joint with:
1. Alfred M. Bruckstein - CS department, Technion Israel,
2. David Donoho, Statistics- Stanford University,
3. Vladimir Temlyakov - Math department, University of south Carolina, and
4. Jean-Luc Starck - Service d'Astrophysique CEA/Saclay, France.
Over-complete and Sparse Representations for Image Decomposition and Inpainting
In this talk we present a novel method for separating images into texture and piecewise smooth parts, and show how this formulation can also lead to image inpainting. Our separation and inpainting process exploits both the variational and the sparsity mechanisms, by combining the Basis Pursuit Denoising (BPDN) algorithm and the Total-Variation (TV) regularization scheme.
The basic idea in this work is the use of two appropriate dictionaries, one for the representation of textures, and the other for the natural scene parts, assumed to be piece-wise-smooth. Both dictionaries are chosen such that they lead to sparse representations over one type of image-content (either texture or piecewise smooth). The use of the BPDN with the two augmented dictionaries leads to the desired separation, along with noise removal as a by-product. As the need to choose a proper dictionary for natural scene is very hard, a TV regularization is employed to better direct the separation process.
This concept of separation via sparse and over-complete representation of the image is shown to have a direct and natural extension to image inpainting. When some of the pixels in known locations in the image are missing, the same separation formulation can be changed to fit the problem of decomposing the image while filling in the holes. Thus, as a by-product of the separation we achieve inpainting. This approach should be compared to a recently published inpainting system by Bertalmio, Vese, Sapiro, and Osher. We will present several experimental results that validate the algorithm's performance.
Joint work with Jean-Luc Starck from CEA, France, and David Donoho, Statistics - Stanford University.
Image Decomposition Via the Combination of Sparse Representations and a Variational Approach
The separation of image content into semantic parts plays a vital role in applications such as compression, enhancement, restoration, and more. In recent years several pioneering works suggested such separation based on variational formulation, and others using independent component analysis and sparsity. In this talk we present a novel method for separating images into texture and piecewise smooth parts, exploiting both the variational and the sparsity mechanisms, by combining the Basis Pursuit Denoising (BPDN) algorithm and the Total-Variation (TV) regularization scheme.
The basic idea in our work is the use of two appropriate dictionaries, one for the representation of textures, and the other for the natural scene parts, assumed to be piece-wise-smooth. Both dictionaries are chosen such that they lead to sparse representations over one type of image-content (either texture or piecewise smooth). The use of the BPDN with the two augmented dictionaries leads to the desired separation, along with noise removal as a by-product. As the need to choose a proper dictionary for natural scene is very hard, a TV regularization is employed to better direct the separation process. We will present several experimental results that validate the algorithm's performance.
Joint work with Jean-Luc Starck from CEA, France, and David Donoho, Statistics- Stanford University.
2003
Rejection Based Classifier for Target Detection In Images
One of the most fundamental problems in the treatment of high-dimensional data is classification of a cloud of points in R^D into several sub-classes based on training data. An important such task is the pattern detection problem in images, which requires a separation between 'Target' and 'Clutter' classes, where every instance of a pattern in each of these classes appears as a sequence of D pixels. In most cases, the following properties hold true for the target detection task: (i) the probability of the 'Target' class is substantially smaller compared to that of the 'Clutter' ; (ii) the volume occupied by the target class in R^D is far smaller than that held by the clutter set ; and (iii) The target set is either convex or can be divided to several sub-sets each being convex.
In this talk we describe a new classifier that exploits these properties, yielding a low complexity yet effective target detection algorithm. This algorithm, called the Maximal Rejection Classifier (MRC), is based on successive rejection operations. Each such rejection stage is performed using a linear projection followed by thresholding. The projection direction is designed to maximize the number of rejected 'Clutter' points from further consideration. An application of detecting frontal and vertical faces in images is demonstrated using the MRC with encouraging results.
Joint work with Yacov Hel-Or, Professor at the Interdisciplinary Center (IDC), Herzlia, Israel and Renato Keshet, Hewlett-Packard Laboratories, Israel.
Analysis of the Basis Pursuit Algorithm and the Separation of Points, Lines, and Planes
When over-complete dictionary is used, the search for the sparsest of all representations amounts to the need to solve a highly complicated combinatorial problem, with complexity growing exponenentially with the number of atoms in the dictionary. The Basis Pursuit algorithm (BP) (by Chen, Donoho, and Saundrers, 1995) was shown to be a highly effective tool for approximating the solution for the above problem.
In this talk we describe a fascinating analysis that explains the surprising empirical success of the PB. We start by presenting Donoho and Huo's results, and then show how these results could be further improved. We first discuss the case of pair of ortho-bases and show tighter bounds on the required sparsity of the signal representation that guarantees BP-optimality. We then turn to extend previous results for arbitrary dictionaries, showing that all previous work falls as a special case of the new theory. Finally, we show how the obtained analysis results can be used to solve the problem of separating a volume of voxels into a sparse set of points, lines, and planes.
Joint work with Alfred M. Bruckstein – CS department, Technion Israel and David Donoho, Statistics - Stanford University.
2002
Sparse Representations and The Basis Pursuit Algorithm
Transforming a signal is typically done in order to simplify its representation. Among the many ways to represent signals, the use of a linear combination taken from an over-complete dictionary is appealing due to its ability to cover a diversity of signal behaviors in one transform. However, with this benefit comes the problem of non-unique representation. Choosing the sparsest of all representations aligns well with our desire for simple signal description, but searching such representation becomes a hard to solve non-convex and highly non-smooth optimization problem. The "Basis-Pursuit" (BP) algorithm (Chen, Donoho, and Saunders - 1995) is a stable approximate solver for the above task, replacing the non-convex L_0 minimization with a L_1 norm. A later work by Donoho and Huo (1998) theoretically proved that for specific dictionaries built from pair of ortho-bases and for sparse enough representations, the BP algorithm is indeed optimal.
In this talk we start by describing Donoho and Huo's work on the BP algorithm, and then show how these results could be further improved. We first discuss the case of pair of ortho-bases and show tighter bounds on the required sparsity of the signal representation that guarantees BP-optimality. We then turn to extend previous results for arbitrary dictionaries, showing that all previous work falls as a special case of the new theory. Finally, we show work in progress that relates the Basis Pursuit algorithm used as a non-linear filtering to the Bayesian approach. We show how TV, Wavalet denoising, general Bayesian priors, and specifically the bilateral filter are all special cases of the Basis Pursuit for difference choices of dictionaries.
Joint work with Alfred M. Bruckstein – CS department, Technion Israel, David Donoho, Statistics - Stanford University, and Peyman Milanfar, Electrical Engineering - University of California - Santa-Cruz.
Fast Polar Fourier Transform
In a wide range of applied problems of 2D and 3D imaging - including Tomography, image rotation, image registration and more - a common continuum formulation of the problem places great emphasis on obtaining and manipulating the Fourier transform in polar coordinates. However, the translation of continuum ideas into practical work with modern digital data sampled on a Cartesian grid is problematic. It is widely believed that "there is no Polar Fast Fourier Transform" and that for practical work, continuum ideas are at best a source of inspiration rather than a practical tool.
In this talk we describe a fast and highly accurate Polar-FFT. A special feature of the proposed approach is that it involves only 1D equi-spaced FFT's. In particular, there is no role for notions like "re-gridding" and "scattered-data-interpolation", as might have been supposed from the existing literature in related applications fields such as Tomography. A central tool in our approach is the Pseudo-Polar FFT, an FFT where the evaluation frequencies lie in an oversampled set of non-angularly equispaced points. In the talk we describe the concept of Pseudo-Polar domain (proposed originally by Averbuch, Coifman, Donoho, Israeli and Walden at 1998), including fast forward and inverse transform, a quasi-Parseval relation, and provide empirical and theoretical analysis of the Gram operator of the Pseudo-Polar FFT. For those interested primarily in Polar FFT's, Pseudo-Polar FFT's play the role of a halfway point - a nearly polar system from which conversion to Polar Coordinates uses processes relying purely on 1D operations. We illustrate a number of applications of our Polar and Pseudo-Polar FFT's.
Joint work with David Donoho, Statistics- Stanford University, Amir Averbuch, Mathematics Department - Tel-Aviv University, Ronald Coifman, Mathematics Department – Yale, and Moshe Israeli – CS department – The Technion.
Shape From Moments - An Estimation Perspective
This talk discusses the problem of recovering a planar polygon from its measured moments. The moments correspond to an indicator function defined over the polygon's support. Previous work on this problem gave necessary and sufficient conditions for such successful recovery process and focused mainly on the case of exact measurements being given. In this talk we describe an extension of these results treating the same problem in the case where a longer than necessary series of noise corrupted moments is given.
Leaning on similar problems in array processing, system identification, and signal processing, we discuss a set of possible estimation procedures which are based on the Prony and the Pencil methods, relate them one to the other, and compare them through simulations. We then present an improvement over these methods based on the direct use of the Maximum-Likelihood estimator, exploiting the above methods as initialization. Finally, we show how regularization, and thus Maximum A-posteriori Probability estimator could be applied to reflect prior knowledge about the recovered polygon.
Joint work with Peyman Milanfar, Professor at Electrical Engineering, University of California - Santa-Cruz, and Gene Golub, Professor at Stanford University - Computer Science Department.
On the Bilateral Filter and Ways to Improve It
Additive noise removal from a given signal (also known as de-noising) is an important stage in many applications in signal processing. Various approaches have been proposed throughout the years. This talk focuses on Bayesian smoothing and edge-preserving methods. Classical algorithms in this family are typically based on Weighted Least Squares (WLS), Robust Estimation (RE), and Anisotropic Diffusion (AD). These methods share common features such as adaptivity to the data, formation as optimization problems, and the need for iterative-based restoration. In 1998 Tomasi and Manduchi (CS, Stanford) proposed an alternative heuristic non-iterative filter for noise removal called the bilateral filter. It was shown to give similar and possibly better results compared to the abovementioned iterative approaches.
However, the bilateral filter was proposed as an intuitive tool without theoretical connection to the classical approaches. In this talk the various noise-removal techniques discussed here (WLS, RE, AD, and the bilateral filter) are presented and related theoretically to each other. In particular, it is shown that RE (and AD) could be interpreted as WLS with weights replaced after each iteration. Also, it is shown that the bilateral filter emerges from the Bayesian approach as a single iteration of the Jacobi iterative algorithm for a properly posed smoothness penalty. Based on this observation, it is shown how this new filter can be improved and extended to treat more general reconstruction problems.
Static and Dynamic Super-Resolution
The super-resolution reconstruction process deals with the fusion of several low quality and low-resolution images into one higher-resolution and possibly better final image. We start by showing that from theoretic point of view, this fusion process is based on generalized sampling theorems due to Yen (1956) and Papulis (1977). When more realistic scenario is considered with blur, arbitrary motion, and additive noise, an estimation approach is considered instead.
We describe methods based on the Maximum-Likelihood (ML), Maximum-A-posteriori Probability (MAP), and the Projection onto Convex Sets POCS) as candidate tools to use. Underlying all these methods is the development of a model describing the relation between the measurements (low-quality images) and the desired output (high-resolution image). Through this path we presents the basic rational behind super-resolution, and then present the dichotomy between the static and the dynamic super-resolution process. We proposed treatment of both, and deal with several interesting special cases.
Rejection Based Classification and its Applications
The target detection problem is defined as the need to separate targets from clutter instances. Among the many example-based techniques for the solution of this problem, the family of rejection-based classifiers are consistently exhibiting state-of-the-art accuracy while being the fastest. This rejection-based approach advocates the use of large set of weak-classifiers chained sequentially. After application of each such atom-block, a rejection of some of the clutter is performed while guaranteeing no loss of targets.
While intuitively appealing, theoretic background for this method was gathered only recently. Some roots of it can be traced to the boosting algorithm and the decision tree methods - two wide fields of research in machine learning that concentrate on using multiple weak-classifiers for the construction of a complicated overall machine. Rejection as a concept was proposed and analyzed by Nayar and Baker, with emphasis on the multi-class problems. More recently Elad, Hel-Or, and Keshet proposed the Maximal-rejection-Classifier (MRC), and employed it to the face detection problem. To conclude this list of works on the rejection-based idea, we should mention the work of Viola and Johns on the face detection problem using sub-linear weak-classifiers joined via boosting. In this talk I'll survey the various contributions to the rejection idea and its efficient implementation on face detection problem.
Rejection Based Classifier for Face Detection
Pattern detection problems require a separation between two classes, 'Target' and 'Clutter', where the probability of the former is substantially smaller than that of the latter. We describe a new classifier that exploits this property, yielding a low complexity yet effective target detection algorithm. This Maximal Rejection Classifier (MRC) algorithm is based on successive rejection operations. Each rejection stage is performed using a linear projection followed by thresholding. The projection direction is designed to maximize the number of 'Clutter' points rejected from further consideration. An application of detecting frontal and vertical faces in images is demonstrated using the MRC, with encouraging results.
Joint work with Yacov Hel-Or, Professor at the Interdisciplinary Center (IDC), Herzlia, Israel and Renato Keshet, Hewlett-Packard Laboratories, Israel.