IST改進算法之Two-Step Iterative Shrinkage/Thresholding(TwIST)

題目:IST改進算法之Two-Step Iterative Shrinkage/Thresholding(TwIST)

        本篇介紹IST的一種改進算法TwIST,該改進算法由以下文獻提出:

Bioucas-DiasJ M, Figueiredo M AT.A new TwIST: two-step iterative shrinkage/thresholding algorithms for imagerestoration[J]. IEEE Transactions on Image processing, 2007, 16(12): 2992-3004.

        據文獻介紹,相比於IST,TwIST旨在使算法收斂的更快。

        TwIST解決的問題是目標函數爲式(1)的優化問題:

注意:最後一行無窮右邊的中括號並沒有錯,這表示開區間,而0左邊的中括號表示閉區間。

1、TwIST算法內容

        相比於IST,TwIST主要是迭代公式的變化:

其中Ψλ定義爲:

Ψλ的表達式與式(1)中的正則項Φ(x)有關,當Φ(x)=||x||1時:

即軟閾值函數。更一般的,Φ(x)爲式(10)時:

例如文中提到的當p=2時:

因此

2、TwIST算法的提出思路

        TwIST實際上是IST與TwSIM的結合:

注意式(13)是推廣的IST迭代公式,原始版的IST迭代公式是令式(13)中的β=1。而TwSIM實際上是式(14)一種實現形式(當矩陣At較大時)

其中式(14)如下(即IterativeRe-Weighted Shrinkage (IRS)):

        如何將IST和TwSIM結合得到TwIST的呢?這個過程有點懸:

注意這個結合過程,文中說:若令式(16)中α=1並將C-1替換爲ψλ則可以得到式(13)。由此相似性可以得到一個two-step版本的IST:(這個idea來的好突然)

        既然TwIST結合了IST和TwSIM,因此TwIST算法希望能達到如下特性:

        爲什麼叫two-step  IST呢?

比較一下IST迭代公式式(13)和TwIST迭代公式式(18)可以發現,IST迭代公式等號右邊只有xt,而TwIST迭代公式等號右邊有xtxt-1

3、TwIST算法的參數設置

        TwIST算法主頁:http://www.lx.it.pt/~bioucas/TwIST/TwIST.htm

        在TwIST主頁可以下載到最新版本TwIST的Matlab實現代碼。

        首先,這裏重點介紹一下TwIST迭代公式(18)中α和β兩個關鍵參數如何設置?文中式(22)和式(23)給出了α和β的計算方法,相關公式如下圖:

        在TwIST官方代碼中,與參數α和β有關的代碼如圖所示(Notepad++打開截圖):

        

        

        

        

        參數α和β可由外部輸入(參見第265~268行),若未輸入,則默認按第319~326行計算(其中在第235~236行已將α和β置零,若第265~268行未賦值,則執行第319~326行),參數α計算公式與文中的式(22)看來不一樣,但其實是相同的,推導如下:

        先將式(20)代入式(22)化簡一下:

程序中的lam1即爲ξ1(可選輸入參數,參見程序第263~264行,默認爲10-4,參見程序第242行),lamN即爲ξm(恆爲1,參見程序第242行),而κ1m,因此程序中的rho0

將rho0代入

因此文中式(22)和程序中的設置是一樣的。文中又提到TwIST對參數α和β並不是很敏感,也就是“very robust”:

4、TwIST算法的MATLAB代碼

        官方版本的TwIST算法Matlab代碼超過700行,其實最核心的就是在一定條件下循環迭代式(18)而已,這裏給出一個本人寫的簡化版本的TwIST代碼,方便理解:

function [ x ] = TwIST_Basic( y,Phi,lambda,beta,alpha,epsilon,loopmax,lam1,lamN )
%   Detailed explanation goes here  
%Version: 1.0 written by jbb0523 @2016-08-12
    if nargin < 9
        lamN = 1;
    end
    if nargin < 8
        lam1 = 1e-4;
    end
    if nargin < 7    
        loopmax = 10000;    
    end    
    if nargin < 6      
        epsilon = 1e-2;      
    end
    if nargin < 5
        rho0 = (1-lam1/lamN)/(1+lam1/lamN);
        alpha = 2/(1+sqrt(1-rho0^2));
    end
    if nargin < 4
        beta  = alpha*2/(lam1+lamN);
    end
    if nargin < 3      
        lambda = 0.1*max(abs(Phi'*y));     
    end  
    [y_rows,y_columns] = size(y);      
    if y_rows<y_columns      
        y = y';%y should be a column vector      
    end 
    soft = @(x,T) sign(x).*max(abs(x) - T,0);
    n = size(Phi,2);    
    loop = 0; 
    fprintf('\n');
    x_pre2 = zeros(n,1);%Initialize x0=0 
	%Initialize x1=soft(x0+Phi'*(y-Phi*x0),lambda)
    x_pre1 = soft(x_pre2+Phi'*(y-Phi*x_pre2),lambda);
    f_pre1 = 0.5*(y-Phi*x_pre1)'*(y-Phi*x_pre1)+lambda*sum(abs(x_pre1));
    while 1
        %(1)compute x_(t+1)
        x_temp = soft(x_pre1+Phi'*(y-Phi*x_pre1),lambda);
        %following 3 lines are important for reconstruction successfully
        mask = (x_temp ~= 0);
        x_pre2 = x_pre2.* mask;
        x_pre1 = x_pre1.* mask;
        x=(1-alpha)*x_pre2+(alpha-beta)*x_pre1+beta*x_temp;
        %(2)update x_(t-1)
        x_pre2 = x_pre1;
        f_pre2 = f_pre1;
        %(3)update x_t
        x_pre1 = x;
        f_pre1 = 0.5*(y-Phi*x_pre1)'*(y-Phi*x_pre1)+lambda*sum(abs(x_pre1));
        loop = loop + 1; 
        %fprintf('Iter %d:f_pre1 = %f\n',loop,f_pre1);
        if abs(f_pre1-f_pre2)/f_pre2<epsilon
            fprintf('abs(f_pre1-f_pre2)/f_pre2<%f\n',epsilon);
            fprintf('TwIST loop is %d\n',loop);
            break;
        end
        if loop >= loopmax
            fprintf('loop > %d\n',loopmax);
            break;
        end
        if norm(x_pre1-x_pre2)<epsilon
            fprintf('norm(x-x_pre)<%f\n',epsilon);
            fprintf('TwIST loop is %d\n',loop);
            break;
        end
    end
  end


值得注意的是,本基本版TwIST_Basic僅解決當文中式(1)正則項Φ(x)=||x||1時的情況,而官方版本的TwIST通過參數配置可以解決當Φ(x)爲其它表達式時的情況。

        這裏最核心的是第40~45行,即:

        x_temp = soft(x_pre1+Phi'*(y-Phi*x_pre1),lambda);
        %following 3 lines are important for reconstruction successfully
        mask = (x_temp ~= 0);
        x_pre2 = x_pre2.* mask;
        x_pre1 = x_pre1.* mask;
        x=(1-alpha)*x_pre2+(alpha-beta)*x_pre1+beta*x_temp;

即執行TwIST迭代公式(18)的過程。值得注意的是起初本人並沒有加入第42~44行,當然,也基本無法成功重構,但官方版本的TwIST卻可以重構成功,於是本人仔細對比了與TwIST的差別,最終發現關鍵點在官方版本的以下幾行:

        

於是在我的基本版本里新加入了即第42~44行,重構效果明顯改善。

        爲了保證本篇的完整性,這裏也同時也給出官方版本的TwIST算法Matlab代碼:

        (TwIST算法主頁:http://www.lx.it.pt/~bioucas/TwIST/TwIST.htm

function [x,x_debias,objective,times,debias_start,mses,max_svd] = ...
         TwIST(y,A,tau,varargin)
%
% Usage:
% [x,x_debias,objective,times,debias_start,mses] = TwIST(y,A,tau,varargin)
%
% This function solves the regularization problem 
%
%     arg min_x = 0.5*|| y - A x ||_2^2 + tau phi( x ), 
%
% where A is a generic matrix and phi(.) is a regularizarion 
% function  such that the solution of the denoising problem 
%
%     Psi_tau(y) = arg min_x = 0.5*|| y - x ||_2^2 + tau \phi( x ), 
%
% is known. 
% 
% For further details about the TwIST algorithm, see the paper:
%
% J. Bioucas-Dias and M. Figueiredo, "A New TwIST: Two-Step
% Iterative Shrinkage/Thresholding Algorithms for Image 
% Restoration",  IEEE Transactions on Image processing, 2007.
% 
% and
% 
% J. Bioucas-Dias and M. Figueiredo, "A Monotonic Two-Step 
% Algorithm for Compressive Sensing and Other Ill-Posed 
% Inverse Problems", submitted, 2007.
%
% Authors: Jose Bioucas-Dias and Mario Figueiredo, October, 2007.
% 
% Please check for the latest version of the code and papers at
% www.lx.it.pt/~bioucas/TwIST
%
% -----------------------------------------------------------------------
% Copyright (2007): Jose Bioucas-Dias and Mario Figueiredo
% 
% TwIST is distributed under the terms of 
% the GNU General Public License 2.0.
% 
% Permission to use, copy, modify, and distribute this software for
% any purpose without fee is hereby granted, provided that this entire
% notice is included in all copies of any software which is or includes
% a copy or modification of this software and in all copies of the
% supporting documentation for such software.
% This software is being provided "as is", without any express or
% implied warranty.  In particular, the authors do not make any
% representation or warranty of any kind concerning the merchantability
% of this software or its fitness for any particular purpose."
% ----------------------------------------------------------------------
% 
%  ===== Required inputs =============
%
%  y: 1D vector or 2D array (image) of observations
%     
%  A: if y and x are both 1D vectors, A can be a 
%     k*n (where k is the size of y and n the size of x)
%     matrix or a handle to a function that computes
%     products of the form A*v, for some vector v.
%     In any other case (if y and/or x are 2D arrays), 
%     A has to be passed as a handle to a function which computes 
%     products of the form A*x; another handle to a function 
%     AT which computes products of the form A'*x is also required 
%     in this case. The size of x is determined as the size
%     of the result of applying AT.
%
%  tau: regularization parameter, usually a non-negative real 
%       parameter of the objective  function (see above). 
%  
%
%  ===== Optional inputs =============
%  
%  'Psi' = denoising function handle; handle to denoising function
%          Default = soft threshold.
%
%  'Phi' = function handle to regularizer needed to compute the objective
%          function.
%          Default = ||x||_1
%
%  'lambda' = lam1 parameters of the  TwIST algorithm:
%             Optimal choice: lam1 = min eigenvalue of A'*A.
%             If min eigenvalue of A'*A == 0, or unknwon,  
%             set lam1 to a value much smaller than 1.
%
%             Rule of Thumb: 
%                 lam1=1e-4 for severyly ill-conditioned problems
%                 lam1=1e-2 for mildly  ill-conditioned problems
%                 lam1=1    for A unitary direct operators
%
%             Default: lam1 = 0.04.
%
%             Important Note: If (max eigenvalue of A'*A) > 1,
%             the algorithm may diverge. This is  be avoided 
%             by taking one of the follwoing  measures:
% 
%                1) Set 'Monontone' = 1 (default)
%                  
%                2) Solve the equivalenve minimization problem
%
%             min_x = 0.5*|| (y/c) - (A/c) x ||_2^2 + (tau/c^2) \phi( x ), 
%
%             where c > 0 ensures that  max eigenvalue of (A'A/c^2) <= 1.
%
%   'alpha' = parameter alpha of TwIST (see ex. (22) of the paper)         
%             Default alpha = alpha(lamN=1, lam1)
%   
%   'beta'  =  parameter beta of twist (see ex. (23) of the paper)
%              Default beta = beta(lamN=1, lam1)            
% 
%  'AT'    = function handle for the function that implements
%            the multiplication by the conjugate of A, when A
%            is a function handle. 
%            If A is an array, AT is ignored.
%
%  'StopCriterion' = type of stopping criterion to use
%                    0 = algorithm stops when the relative 
%                        change in the number of non-zero 
%                        components of the estimate falls 
%                        below 'ToleranceA'
%                    1 = stop when the relative 
%                        change in the objective function 
%                        falls below 'ToleranceA'
%                    2 = stop when the relative norm of the difference between 
%                        two consecutive estimates falls below toleranceA
%                    3 = stop when the objective function 
%                        becomes equal or less than toleranceA.
%                    Default = 1.
%
%  'ToleranceA' = stopping threshold; Default = 0.01
% 
%  'Debias'     = debiasing option: 1 = yes, 0 = no.
%                 Default = 0.
%                 
%                 Note: Debiasing is an operation aimed at the 
%                 computing the solution of the LS problem 
%
%                         arg min_x = 0.5*|| y - A' x' ||_2^2 
%
%                 where A' is the  submatrix of A obatained by
%                 deleting the columns of A corresponding of components
%                 of x set to zero by the TwIST algorithm
%                 
%
%  'ToleranceD' = stopping threshold for the debiasing phase:
%                 Default = 0.0001.
%                 If no debiasing takes place, this parameter,
%                 if present, is ignored.
%
%  'MaxiterA' = maximum number of iterations allowed in the
%               main phase of the algorithm.
%               Default = 1000
%
%  'MiniterA' = minimum number of iterations performed in the
%               main phase of the algorithm.
%               Default = 5
%
%  'MaxiterD' = maximum number of iterations allowed in the
%               debising phase of the algorithm.
%               Default = 200
%
%  'MiniterD' = minimum number of iterations to perform in the
%               debiasing phase of the algorithm.
%               Default = 5
%
%  'Initialization' must be one of {0,1,2,array}
%               0 -> Initialization at zero. 
%               1 -> Random initialization.
%               2 -> initialization with A'*y.
%               array -> initialization provided by the user.
%               Default = 0;
%
%  'Monotone' = enforce monotonic decrease in f. 
%               any nonzero -> enforce monotonicity
%               0 -> don't enforce monotonicity.
%               Default = 1;
%
%  'Sparse'   = {0,1} accelarates the convergence rate when the regularizer 
%               Phi(x) is sparse inducing, such as ||x||_1.
%               Default = 1
%               
%             
%  'True_x' = if the true underlying x is passed in 
%                this argument, MSE evolution is computed
%
%
%  'Verbose'  = work silently (0) or verbosely (1)
%
% ===================================================  
% ============ Outputs ==============================
%   x = solution of the main algorithm
%
%   x_debias = solution after the debiasing phase;
%                  if no debiasing phase took place, this
%                  variable is empty, x_debias = [].
%
%   objective = sequence of values of the objective function
%
%   times = CPU time after each iteration
%
%   debias_start = iteration number at which the debiasing 
%                  phase started. If no debiasing took place,
%                  this variable is returned as zero.
%
%   mses = sequence of MSE values, with respect to True_x,
%          if it was given; if it was not given, mses is empty,
%          mses = [].
%
%   max_svd = inverse of the scaling factor, determined by TwIST,
%             applied to the direct operator (A/max_svd) such that
%             every IST step is increasing.
% ========================================================

%--------------------------------------------------------------
% test for number of required parametres
%--------------------------------------------------------------
if (nargin-length(varargin)) ~= 3
     error('Wrong number of required parameters');
end
%--------------------------------------------------------------
% Set the defaults for the optional parameters
%--------------------------------------------------------------
stopCriterion = 1;
tolA = 0.01;
debias = 0;
maxiter = 1000;
maxiter_debias = 200;
miniter = 5;
miniter_debias = 5;
init = 0;
enforceMonotone = 1;
compute_mse = 0;
plot_ISNR = 0;
AT = 0;
verbose = 1;
alpha = 0;
beta  = 0;
sparse = 1;
tolD = 0.001;
phi_l1 = 0;
psi_ok = 0;
% default eigenvalues 
lam1=1e-4;   lamN=1;
% 

% constants ans internal variables
for_ever = 1;
% maj_max_sv: majorizer for the maximum singular value of operator A
max_svd = 1;

% Set the defaults for outputs that may not be computed
debias_start = 0;
x_debias = [];
mses = [];

%--------------------------------------------------------------
% Read the optional parameters
%--------------------------------------------------------------
if (rem(length(varargin),2)==1)
  error('Optional parameters should always go by pairs');
else
  for i=1:2:(length(varargin)-1)
    switch upper(varargin{i})
     case 'LAMBDA'
       lam1 = varargin{i+1};
     case 'ALPHA'
       alpha = varargin{i+1};
     case 'BETA'
       beta = varargin{i+1};
     case 'PSI'
       psi_function = varargin{i+1};
     case 'PHI'
       phi_function = varargin{i+1};   
     case 'STOPCRITERION'
       stopCriterion = varargin{i+1};
     case 'TOLERANCEA'       
       tolA = varargin{i+1};
     case 'TOLERANCED'
       tolD = varargin{i+1};
     case 'DEBIAS'
       debias = varargin{i+1};
     case 'MAXITERA'
       maxiter = varargin{i+1};
     case 'MAXIRERD'
       maxiter_debias = varargin{i+1};
     case 'MINITERA'
       miniter = varargin{i+1};
     case 'MINITERD'
       miniter_debias = varargin{i+1};
     case 'INITIALIZATION'
       if prod(size(varargin{i+1})) > 1   % we have an initial x
	 init = 33333;    % some flag to be used below
	 x = varargin{i+1};
       else 
	 init = varargin{i+1};
       end
     case 'MONOTONE'
       enforceMonotone = varargin{i+1};
     case 'SPARSE'
       sparse = varargin{i+1};
     case 'TRUE_X'
       compute_mse = 1;
       true = varargin{i+1};
       if prod(double((size(true) == size(y))))
           plot_ISNR = 1;
       end
     case 'AT'
       AT = varargin{i+1};
     case 'VERBOSE'
       verbose = varargin{i+1};
     otherwise
      % Hmmm, something wrong with the parameter string
      error(['Unrecognized option: ''' varargin{i} '''']);
    end;
  end;
end
%%%%%%%%%%%%%%


% twist parameters 
rho0 = (1-lam1/lamN)/(1+lam1/lamN);
if alpha == 0 
    alpha = 2/(1+sqrt(1-rho0^2));
end
if  beta == 0 
    beta  = alpha*2/(lam1+lamN);
end


if (sum(stopCriterion == [0 1 2 3])==0)
   error(['Unknwon stopping criterion']);
end

% if A is a function handle, we have to check presence of AT,
if isa(A, 'function_handle') & ~isa(AT,'function_handle')
   error(['The function handle for transpose of A is missing']);
end 


% if A is a matrix, we find out dimensions of y and x,
% and create function handles for multiplication by A and A',
% so that the code below doesn't have to distinguish between
% the handle/not-handle cases
if ~isa(A, 'function_handle')
   AT = @(x) A'*x;
   A = @(x) A*x;
end
% from this point down, A and AT are always function handles.

% Precompute A'*y since it'll be used a lot
Aty = AT(y);

% if phi was given, check to see if it is a handle and that it 
% accepts two arguments
if exist('psi_function','var')
   if isa(psi_function,'function_handle')
      try  % check if phi can be used, using Aty, which we know has 
           % same size as x
            dummy = psi_function(Aty,tau); 
            psi_ok = 1;
      catch
         error(['Something is wrong with function handle for psi'])
      end
   else
      error(['Psi does not seem to be a valid function handle']);
   end
else %if nothing was given, use soft thresholding
   psi_function = @(x,tau) soft(x,tau);
end

% if psi exists, phi must also exist
if (psi_ok == 1)
   if exist('phi_function','var')
      if isa(phi_function,'function_handle')
         try  % check if phi can be used, using Aty, which we know has 
              % same size as x
              dummy = phi_function(Aty); 
         catch
           error(['Something is wrong with function handle for phi'])
         end
      else
        error(['Phi does not seem to be a valid function handle']);
      end
   else
      error(['If you give Psi you must also give Phi']); 
   end
else  % if no psi and phi were given, simply use the l1 norm.
   phi_function = @(x) sum(abs(x(:))); 
   phi_l1 = 1;
end
    

%--------------------------------------------------------------
% Initialization
%--------------------------------------------------------------
switch init
    case 0   % initialize at zero, using AT to find the size of x
       x = AT(zeros(size(y)));
    case 1   % initialize randomly, using AT to find the size of x
       x = randn(size(AT(zeros(size(y)))));
    case 2   % initialize x0 = A'*y
       x = Aty; 
    case 33333
       % initial x was given as a function argument; just check size
       if size(A(x)) ~= size(y)
          error(['Size of initial x is not compatible with A']); 
       end
    otherwise
       error(['Unknown ''Initialization'' option']);
end

% now check if tau is an array; if it is, it has to 
% have the same size as x
if prod(size(tau)) > 1
   try,
      dummy = x.*tau;
   catch,
      error(['Parameter tau has wrong dimensions; it should be scalar or size(x)']),
   end
end
      
% if the true x was given, check its size
if compute_mse & (size(true) ~= size(x))  
   error(['Initial x has incompatible size']); 
end


% if tau is large enough, in the case of phi = l1, thus psi = soft,
% the optimal solution is the zero vector
if phi_l1
   max_tau = max(abs(Aty(:)));
   if (tau >= max_tau)&(psi_ok==0)
      x = zeros(size(Aty));
      objective(1) = 0.5*(y(:)'*y(:));
      times(1) = 0;
      if compute_mse
        mses(1) = sum(true(:).^2);
      end
      return
   end
end


% define the indicator vector or matrix of nonzeros in x
nz_x = (x ~= 0.0);
num_nz_x = sum(nz_x(:));

% Compute and store initial value of the objective function
resid =  y-A(x);
prev_f = 0.5*(resid(:)'*resid(:)) + tau*phi_function(x);


% start the clock
t0 = cputime;

times(1) = cputime - t0;
objective(1) = prev_f;

if compute_mse
   mses(1) = sum(sum((x-true).^2));
end

cont_outer = 1;
iter = 1;

if verbose
    fprintf(1,'\nInitial objective = %10.6e,  nonzeros=%7d\n',...
        prev_f,num_nz_x);
end

% variables controling first and second order iterations
IST_iters = 0;
TwIST_iters = 0;

% initialize
xm2=x;
xm1=x;

%--------------------------------------------------------------
% TwIST iterations
%--------------------------------------------------------------
while cont_outer
    % gradient
    grad = AT(resid);
    while for_ever
        % IST estimate
        x = psi_function(xm1 + grad/max_svd,tau/max_svd);
        if (IST_iters >= 2) | ( TwIST_iters ~= 0)
            % set to zero the past when the present is zero
            % suitable for sparse inducing priors
            if sparse
                mask = (x ~= 0);
                xm1 = xm1.* mask;
                xm2 = xm2.* mask;
            end
            % two-step iteration
            xm2 = (alpha-beta)*xm1 + (1-alpha)*xm2 + beta*x;
            % compute residual
            resid = y-A(xm2);
            f = 0.5*(resid(:)'*resid(:)) + tau*phi_function(xm2);
            if (f > prev_f) & (enforceMonotone)
                TwIST_iters = 0;  % do a IST iteration if monotonocity fails
            else
                TwIST_iters = TwIST_iters+1; % TwIST iterations
                IST_iters = 0;
                x = xm2;
                if mod(TwIST_iters,10000) == 0
                    max_svd = 0.9*max_svd;
                end
                break;  % break loop while
            end
        else
            resid = y-A(x);
            f = 0.5*(resid(:)'*resid(:)) + tau*phi_function(x);
            if f > prev_f
                % if monotonicity  fails here  is  because
                % max eig (A'A) > 1. Thus, we increase our guess
                % of max_svs
                max_svd = 2*max_svd;
                if verbose
                    fprintf('Incrementing S=%2.2e\n',max_svd)
                end
                IST_iters = 0;
                TwIST_iters = 0;
            else
                TwIST_iters = TwIST_iters + 1;
                break;  % break loop while
            end
        end
    end

    xm2 = xm1;
    xm1 = x;



    %update the number of nonzero components and its variation
    nz_x_prev = nz_x;
    nz_x = (x~=0.0);
    num_nz_x = sum(nz_x(:));
    num_changes_active = (sum(nz_x(:)~=nz_x_prev(:)));

    % take no less than miniter and no more than maxiter iterations
    switch stopCriterion
        case 0,
            % compute the stopping criterion based on the change
            % of the number of non-zero components of the estimate
            criterion =  num_changes_active;
        case 1,
            % compute the stopping criterion based on the relative
            % variation of the objective function.
            criterion = abs(f-prev_f)/prev_f;
        case 2,
            % compute the stopping criterion based on the relative
            % variation of the estimate.
            criterion = (norm(x(:)-xm1(:))/norm(x(:)));
        case 3,
            % continue if not yet reached target value tolA
            criterion = f;
        otherwise,
            error(['Unknwon stopping criterion']);
    end
    cont_outer = ((iter <= maxiter) & (criterion > tolA));
    if iter <= miniter
        cont_outer = 1;
    end



    iter = iter + 1;
    prev_f = f;
    objective(iter) = f;
    times(iter) = cputime-t0;

    if compute_mse
        err = true - x;
        mses(iter) = (err(:)'*err(:));
    end

    % print out the various stopping criteria
    if verbose
        if plot_ISNR
            fprintf(1,'Iteration=%4d, ISNR=%4.5e  objective=%9.5e, nz=%7d, criterion=%7.3e\n',...
                iter, 10*log10(sum((y(:)-true(:)).^2)/sum((x(:)-true(:)).^2) ), ...
                f, num_nz_x, criterion/tolA);
        else
            fprintf(1,'Iteration=%4d, objective=%9.5e, nz=%7d,  criterion=%7.3e\n',...
                iter, f, num_nz_x, criterion/tolA);
        end
    end

end
%--------------------------------------------------------------
% end of the main loop
%--------------------------------------------------------------

% Printout results
if verbose
    fprintf(1,'\nFinished the main algorithm!\nResults:\n')
    fprintf(1,'||A x - y ||_2 = %10.3e\n',resid(:)'*resid(:))
    fprintf(1,'||x||_1 = %10.3e\n',sum(abs(x(:))))
    fprintf(1,'Objective function = %10.3e\n',f);
    fprintf(1,'Number of non-zero components = %d\n',num_nz_x);
    fprintf(1,'CPU time so far = %10.3e\n', times(iter));
    fprintf(1,'\n');
end


%--------------------------------------------------------------
% If the 'Debias' option is set to 1, we try to
% remove the bias from the l1 penalty, by applying CG to the
% least-squares problem obtained by omitting the l1 term
% and fixing the zero coefficients at zero.
%--------------------------------------------------------------
if debias
    if verbose
        fprintf(1,'\n')
        fprintf(1,'Starting the debiasing phase...\n\n')
    end

    x_debias = x;
    zeroind = (x_debias~=0);
    cont_debias_cg = 1;
    debias_start = iter;

    % calculate initial residual
    resid = A(x_debias);
    resid = resid-y;
    resid_prev = eps*ones(size(resid));

    rvec = AT(resid);

    % mask out the zeros
    rvec = rvec .* zeroind;
    rTr_cg = rvec(:)'*rvec(:);

    % set convergence threshold for the residual || RW x_debias - y ||_2
    tol_debias = tolD * (rvec(:)'*rvec(:));

    % initialize pvec
    pvec = -rvec;

    % main loop
    while cont_debias_cg

        % calculate A*p = Wt * Rt * R * W * pvec
        RWpvec = A(pvec);
        Apvec = AT(RWpvec);

        % mask out the zero terms
        Apvec = Apvec .* zeroind;

        % calculate alpha for CG
        alpha_cg = rTr_cg / (pvec(:)'* Apvec(:));

        % take the step
        x_debias = x_debias + alpha_cg * pvec;
        resid = resid + alpha_cg * RWpvec;
        rvec  = rvec  + alpha_cg * Apvec;

        rTr_cg_plus = rvec(:)'*rvec(:);
        beta_cg = rTr_cg_plus / rTr_cg;
        pvec = -rvec + beta_cg * pvec;

        rTr_cg = rTr_cg_plus;

        iter = iter+1;

        objective(iter) = 0.5*(resid(:)'*resid(:)) + ...
            tau*phi_function(x_debias(:));
        times(iter) = cputime - t0;

        if compute_mse
            err = true - x_debias;
            mses(iter) = (err(:)'*err(:));
        end

        % in the debiasing CG phase, always use convergence criterion
        % based on the residual (this is standard for CG)
        if verbose
            fprintf(1,' Iter = %5d, debias resid = %13.8e, convergence = %8.3e\n', ...
                iter, resid(:)'*resid(:), rTr_cg / tol_debias);
        end
        cont_debias_cg = ...
            (iter-debias_start <= miniter_debias )| ...
            ((rTr_cg > tol_debias) & ...
            (iter-debias_start <= maxiter_debias));

    end
    if verbose
        fprintf(1,'\nFinished the debiasing phase!\nResults:\n')
        fprintf(1,'||A x - y ||_2 = %10.3e\n',resid(:)'*resid(:))
        fprintf(1,'||x||_1 = %10.3e\n',sum(abs(x(:))))
        fprintf(1,'Objective function = %10.3e\n',f);
        nz = (x_debias~=0.0);
        fprintf(1,'Number of non-zero components = %d\n',sum(nz(:)));
        fprintf(1,'CPU time so far = %10.3e\n', times(iter));
        fprintf(1,'\n');
    end
end

if compute_mse
    mses = mses/length(true(:));
end


%--------------------------------------------------------------
% soft for both real and  complex numbers
%--------------------------------------------------------------
function y = soft(x,T)
%y = sign(x).*max(abs(x)-tau,0);
y = max(abs(x) - T, 0);
y = y./(y+T) .* x;

5、TwIST算法測試

        這裏測試代碼基本與IST測試代碼相同,略作修改,對比本人所寫的TwIST_Basic、IST_Basic(參見壓縮感知重構算法之迭代軟閾值(IST))以及官方版本的TwIST: 

clear all;close all;clc;        
M = 64;%觀測值個數        
N = 256;%信號x的長度        
K = 10;%信號x的稀疏度        
Index_K = randperm(N);        
x = zeros(N,1);        
x(Index_K(1:K)) = 5*randn(K,1);%x爲K稀疏的,且位置是隨機的 
%x(Index_K(1:K)) = sign(5*randn(K,1));             
Phi = randn(M,N);%測量矩陣爲高斯矩陣    
Phi = orth(Phi')';            
sigma = 0.005;      
e = sigma*randn(M,1);    
y = Phi * x + e;%得到觀測向量y        
% y = Phi * x;%得到觀測向量y 
% lamda = sigma*sqrt(2*log(N));
lamda = 0.1*max(abs(Phi'*y));
fprintf('\nlamda = %f\n',lamda);      
%% 恢復重構信號x   
%(1)TwIST_Basic
fprintf('\nTwIST_Basic begin...');  
tic
x_r1 = TwIST_Basic(y,Phi,lamda); 
toc  
%Debias
[xsorted inds] = sort(abs(x_r1), 'descend'); 
AI = Phi(:,inds(xsorted(:)>1e-3));
xI = pinv(AI'*AI)*AI'*y;
x_bias1 = zeros(length(x),1);
x_bias1(inds(xsorted(:)>1e-3)) = xI;
%(2)IST_Basic
fprintf('\nIST_Basic begin...');  
tic
x_r2 = IST_Basic(y,Phi,lamda); 
toc  
%Debias
[xsorted inds] = sort(abs(x_r2), 'descend'); 
AI = Phi(:,inds(xsorted(:)>1e-3));
xI = pinv(AI'*AI)*AI'*y;
x_bias2 = zeros(length(x),1);
x_bias2(inds(xsorted(:)>1e-3)) = xI;
%(3)TwIST
fprintf('\nTwIST begin...\n');  
tic
x_r3 =  TwIST(y,Phi,lamda,'Monotone',0,'Verbose',0); 
toc  
%Debias
[xsorted inds] = sort(abs(x_r3), 'descend'); 
AI = Phi(:,inds(xsorted(:)>1e-3));
xI = pinv(AI'*AI)*AI'*y;
x_bias3 = zeros(length(x),1);
x_bias3(inds(xsorted(:)>1e-3)) = xI;
%% 繪圖        
figure;        
plot(x_bias1,'k.-');%繪出x的恢復信號        
hold on;        
plot(x,'r');%繪出原信號x        
hold off;        
legend('TwIST\_Basic','Original')        
fprintf('\n恢復殘差(TwIST_Basic):');        
fprintf('%f\n',norm(x_bias1-x));%恢復殘差  
%Debias
figure;        
plot(x_bias2,'k.-');%繪出x的恢復信號        
hold on;        
plot(x,'r');%繪出原信號x        
hold off;        
legend('IST\_Basic','Original')        
fprintf('恢復殘差(IST_Basic):');        
fprintf('%f\n',norm(x_bias2-x));%恢復殘差  
%Debias1
figure;        
plot(x_bias3,'k.-');%繪出x的恢復信號        
hold on;        
plot(x,'r');%繪出原信號x        
hold off;        
legend('TwIST','Original')        
fprintf('恢復殘差(TwIST):');        
fprintf('%f\n',norm(x_bias3-x));%恢復殘差  

        運行結果如下:(信號爲隨機生成,所以每次結果均不一樣)

1)圖(均爲debias之後,參見壓縮感知重構算法之迭代軟閾值(IST)




2)Command Windows

lamda = 0.224183


TwIST_Basic begin...
abs(f_pre1-f_pre2)/f_pre2<0.010000
TwIST loop is 21
Elapsed time is 0.011497 seconds.


IST_Basic begin...
abs(f-f_pre)/f_pre<0.010000
IST loop is 14
Elapsed time is 0.006397 seconds.


TwIST begin...
Elapsed time is 0.029590 seconds.


恢復殘差(TwIST_Basic):0.047943
恢復殘差(IST_Basic):4.928160
恢復殘差(TwIST):0.047943

注:重構經常失敗,運行結果僅爲挑了一次重構效果較好的附上;IST有時也可以較好的重構,湊巧這次重構效果不佳。

6、結束語

        從重構結果來看,尤其是Command Windows的輸出結果來看,TwIST相比於IST並沒有什麼優勢,這可能是由於我的測試案例十分簡單的原因吧。

        另外,談一下個人對TwIST_Basic函數中第42~44行(或TwIST函數中第490~494行)的理解:這幾行很關鍵,直接影響着能否重構,雖然加上後也經常重構失敗(但去掉這幾行代碼基本無法重構)。個人認爲原因是這樣的,TwIST容易發散,加上這幾行代碼後實際是把x非零值的位置依靠IST去決定(因爲決定mask的實際是IST迭代公式計算結果),而x迭代則使用TwIST迭代公式。當然,這僅爲個人看法。

        最後再說一句,在文中壓根就沒提“壓縮感知”,但由於TwIST可以求解基追蹤降噪(BPDN)問題,所以也就可以拿來作爲壓縮感知重構算法了……

        給出TwIST兩位作者的個人主頁:

        Jose M. Bioucas Dias:http://www.lx.it.pt/~bioucas/

        MárioA. T. Figueiredo:http://www.lx.it.pt/~mtf/

        湊巧的是在第一作者主頁上居然還發現了一個研究生同學,感嘆世界好小,哈哈……

發佈了181 篇原創文章 · 獲贊 1199 · 訪問量 316萬+
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章