Home > atmlab > retrieval > mci.m

mci

PURPOSE ^

MCI Retrieval by Monte Carlo integration

SYNOPSIS ^

function [Xh,E,D,W] = mci(M,Xb,Yb,Se,Y,p)

DESCRIPTION ^

 MCI   Retrieval by Monte Carlo integration

   A Bayesian inversion is performed by using a precalculated database.
   The retrieval algoithm is descibed in e.g.
      Evans et al., Submillimeter-wave cloud ice radiometer: Simulations
      of retrieval algorithm performance, JGR, 107, D3, 2002.

   The match between the measurement and the database cases is reported
   as exp(-chi2/2)/exp(-m/2), where m=size(Y,1) and chi2 is defined in
   the function with same name.

   With the vector p you can give different prior weights to the
   different training points. This is done AFTER the check on
   which points are above the weight threshold for the
   retrieval. (Otherwise this would be an unfair advantage for
   those training points with large prior weights.) The prior p is
   also taken into account in sum_w, so that the retrieval is
   correctly normalized. It is NOT taken into account in the
   number of hits, so if we find one training point with prior
   weight 10, the number of hits is 1, not 10.

   Some details of the inversion is controled by the structure M. Fields
   can be left out. Defined fields are
      use_all :      If set to 1 all cases in database are weighted together.
                     If 0, only cases with a weight > w_limit are considered.
                     Default is true.                
      norm_w  :      If set to false, the normalisation of weights by 
                     exp(-m/2) is NOT applied. This makes the weights to 
                     depend more strongly on m, but allows to be consistent
                     with how the weights are normally defined. Limit and
                     sum threshold must be adopted if this field is changed.
                     Default is true.
      w_limit :      Threshold level for what is considered as an OK
                     databse hit. Used by *use_all* and controls D.n_hit. 
                     Default is 0.01.
      verb    :      If set to 1 some text output is generated. 
                     Default is false
      sum_w_thresh : Threshold for sum_w for which retrieval is performed. 
                     If the sum_w is smaller or equal to the treshold, a 
                     value of NaN is assigned for that retrieval case. 
                     Default value is 0 (to avoid division by zero).
      n_hit_thresh : Threshold for number of database hits (weight
                     exceeding w_limit) for which retrieval is performed. 
                     If the number is smaller than the treshold, a value 
                     of NaN is assigned for that retrieval case. 
                     Default value is 0.

    MCI specific retrieval diagnostics is given in the structure array D, 
    having the fields
      n_con : Number of considered cases
      n_hit : Number of cases with weight >= *w_limit*
      sum_w : Sum of weights (for considered cases)
      max_w : Maximum weight value
    To plot e.g. n_hit: plot([D(:).n_hit])

 FORMAT   [Xh,E,D,W] = mci(M,Xb,Yb,Se,Y[,p])
        
 OUT   Xh   Retrieved states (that is, x-hat).
       E    Retrieval error.
       D    Diagnostics of the retrieval. See further above.
       W    Weights for each database entry and each retrieval.
 IN    M    Settings for the MCI retrieval.  See further above.
       Xb   States of retrieval data base.
       Yb   Measurements corresponding to Xb.
       Se   Observation unvertainty covariance matrix.
       Y    Measurements to be inverted.
 OPT   p    Prior weight of given training point. Default is 1.

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

DOWNLOAD ^

mci.m

SOURCE CODE ^

0001 % MCI   Retrieval by Monte Carlo integration
0002 %
0003 %   A Bayesian inversion is performed by using a precalculated database.
0004 %   The retrieval algoithm is descibed in e.g.
0005 %      Evans et al., Submillimeter-wave cloud ice radiometer: Simulations
0006 %      of retrieval algorithm performance, JGR, 107, D3, 2002.
0007 %
0008 %   The match between the measurement and the database cases is reported
0009 %   as exp(-chi2/2)/exp(-m/2), where m=size(Y,1) and chi2 is defined in
0010 %   the function with same name.
0011 %
0012 %   With the vector p you can give different prior weights to the
0013 %   different training points. This is done AFTER the check on
0014 %   which points are above the weight threshold for the
0015 %   retrieval. (Otherwise this would be an unfair advantage for
0016 %   those training points with large prior weights.) The prior p is
0017 %   also taken into account in sum_w, so that the retrieval is
0018 %   correctly normalized. It is NOT taken into account in the
0019 %   number of hits, so if we find one training point with prior
0020 %   weight 10, the number of hits is 1, not 10.
0021 %
0022 %   Some details of the inversion is controled by the structure M. Fields
0023 %   can be left out. Defined fields are
0024 %      use_all :      If set to 1 all cases in database are weighted together.
0025 %                     If 0, only cases with a weight > w_limit are considered.
0026 %                     Default is true.
0027 %      norm_w  :      If set to false, the normalisation of weights by
0028 %                     exp(-m/2) is NOT applied. This makes the weights to
0029 %                     depend more strongly on m, but allows to be consistent
0030 %                     with how the weights are normally defined. Limit and
0031 %                     sum threshold must be adopted if this field is changed.
0032 %                     Default is true.
0033 %      w_limit :      Threshold level for what is considered as an OK
0034 %                     databse hit. Used by *use_all* and controls D.n_hit.
0035 %                     Default is 0.01.
0036 %      verb    :      If set to 1 some text output is generated.
0037 %                     Default is false
0038 %      sum_w_thresh : Threshold for sum_w for which retrieval is performed.
0039 %                     If the sum_w is smaller or equal to the treshold, a
0040 %                     value of NaN is assigned for that retrieval case.
0041 %                     Default value is 0 (to avoid division by zero).
0042 %      n_hit_thresh : Threshold for number of database hits (weight
0043 %                     exceeding w_limit) for which retrieval is performed.
0044 %                     If the number is smaller than the treshold, a value
0045 %                     of NaN is assigned for that retrieval case.
0046 %                     Default value is 0.
0047 %
0048 %    MCI specific retrieval diagnostics is given in the structure array D,
0049 %    having the fields
0050 %      n_con : Number of considered cases
0051 %      n_hit : Number of cases with weight >= *w_limit*
0052 %      sum_w : Sum of weights (for considered cases)
0053 %      max_w : Maximum weight value
0054 %    To plot e.g. n_hit: plot([D(:).n_hit])
0055 %
0056 % FORMAT   [Xh,E,D,W] = mci(M,Xb,Yb,Se,Y[,p])
0057 %
0058 % OUT   Xh   Retrieved states (that is, x-hat).
0059 %       E    Retrieval error.
0060 %       D    Diagnostics of the retrieval. See further above.
0061 %       W    Weights for each database entry and each retrieval.
0062 % IN    M    Settings for the MCI retrieval.  See further above.
0063 %       Xb   States of retrieval data base.
0064 %       Yb   Measurements corresponding to Xb.
0065 %       Se   Observation unvertainty covariance matrix.
0066 %       Y    Measurements to be inverted.
0067 % OPT   p    Prior weight of given training point. Default is 1.
0068 
0069 % 2005-11-23   Created by Patrick Eriksson.
0070 % 2006-03-28   Extended to use prior weights and for more
0071 %              diagnostics by Stefan Buehler (details see Atmlab Changelog).
0072 
0073 function [Xh,E,D,W] = mci(M,Xb,Yb,Se,Y,p)
0074 
0075 % Check input
0076 if ~isfield( M, 'use_all' )
0077   M.use_all = 1;
0078 end
0079 if ~isfield( M, 'norm_w' )
0080   M.norm_w = true;
0081 end
0082 if ~isfield( M, 'w_limit' )
0083   M.w_limit = 0.01;
0084 end
0085 if ~isfield( M, 'verb' )
0086   M.verb = 0;
0087 end
0088 if ~isfield( M, 'sum_w_thresh' )
0089   M.sum_w_thresh = 0;
0090 end
0091 if ~isfield( M, 'n_hit_thresh' )
0092   M.n_hit_thresh = 0;
0093 end
0094 
0095 
0096 % Some sizes
0097 nb = size(Xb,2);
0098 m  = size(Y,1);
0099 ny = size(Y,2);
0100 lx = size(Xb,1);
0101 
0102 
0103 % If no prior weights p are given, we set all prior weights to 1:
0104 if nargin == 5
0105   p = ones(1,nb);
0106 else
0107   % Check that length of p matches Xb:
0108   if length(p) ~= nb 
0109     error('Dimension of p does not match Xb')
0110   end
0111 end
0112 
0113 
0114 % Check that dimensions of Xb and Yb are consistent:
0115 if size(Yb,2) ~= nb
0116     error('Dimensions of Xb and Yb do not match')
0117 end
0118 
0119 
0120 % Seems that nargout can get corrupted inside parfor!
0121 % Use a local variable to be safe.
0122 nout = nargout;
0123 
0124 
0125 
0126 % Init output arguments
0127 %
0128 Xh = zeros( lx, ny );
0129 %
0130 if nout > 1
0131   E  = zeros( lx, ny );
0132 end
0133 if nout > 3
0134   W  = zeros( nb, ny );
0135 end
0136 %
0137 D(ny) = struct( 'sum_w',NaN, 'max_w',NaN, 'n_con',NaN, 'n_hit',NaN );
0138 
0139 
0140 for i = 1 : ny
0141 
0142   w = exp( -0.5*chi2( repmat(Y(:,i),1,nb)-Yb, Se ) );
0143 
0144   if M.norm_w 
0145     w = w /  exp(-m*(1-2/9/m)^3/2);
0146   end
0147   
0148   ind_hit = find( w >= M.w_limit );    
0149   
0150   if M.use_all
0151     ind = 1:nb;
0152   else
0153     ind = ind_hit;
0154   end
0155 
0156   % Catch the case that ind is empty
0157   if length(ind) == 0
0158     D(i).sum_w = 0;
0159     D(i).max_w = 0;
0160     D(i).n_con = 0;
0161     D(i).n_hit = 0;
0162   else
0163     D(i).sum_w = sum( p(ind)*w(ind) );    % Weight with prior.
0164     D(i).max_w = max( w(ind) );
0165     D(i).n_con = length(ind);
0166     D(i).n_hit = length(ind_hit);
0167   end
0168   
0169   % Prepare the weights for the selected training cases for the
0170   % retrieval. We also multiply in the prior weights here.
0171   Wi = repmat(p(ind).*w(ind)',lx,1);
0172 
0173   % The retrieval is only calculated if the weight sum is above the
0174   % treshold. The default value for the threshold is zero, to avoid
0175   % division by zero.
0176   %
0177   % Likewise, or alternatively, the retrieval is only calculated if
0178   % the number of hits is larger than n_hit_thresh.
0179   if ( D(i).sum_w > M.sum_w_thresh )  &&  ( D(i).n_hit >= M.n_hit_thresh )
0180     
0181     Xh(:,i) = sum( Xb(:,ind).*Wi, 2 ) ./ D(i).sum_w;
0182 
0183     if nout > 1
0184       E(:,i) = sqrt( sum( ...
0185         (Xb(:,ind)-repmat(Xh(:,i),1,D(i).n_con)).^2.*Wi, 2 ) ./ D(i).sum_w );
0186       if nout > 3
0187         W(:,i) = w;
0188       end
0189     end  
0190   else
0191     Xh(:,i) = NaN;
0192   
0193     if nout > 1
0194       E(:,i) = NaN;
0195       if nout > 3
0196         W(:,i) = w;
0197       end
0198     end  
0199   end
0200   
0201   if M.verb
0202     disp(sprintf(['Measurement %d of %d: ' ...
0203           'sum_w = %10.0f, ' ...
0204           'max_w =%10.5f, ' ...
0205           'w_ratio =%10.5f, ' ...
0206           'n_con = %d.'],...
0207          i, ny, ...
0208          D(i).sum_w, D(i).max_w, ...
0209          D(i).max_w / D(i).sum_w, ...
0210          D(i).n_con ));
0211   end
0212 
0213 end
0214

Generated on Mon 15-Sep-2014 13:31:28 by m2html © 2005