MCI Retrieval by Monte Carlo integration A Bayesian inversion is performed by using a precalculated database. The retrieval algoithm is descibed in e.g. Evans et al., Submillimeter-wave cloud ice radiometer: Simulations of retrieval algorithm performance, JGR, 107, D3, 2002. The match between the measurement and the database cases is reported as exp(-chi2/2)/exp(-m/2), where m=size(Y,1) and chi2 is defined in the function with same name. With the vector p you can give different prior weights to the different training points. This is done AFTER the check on which points are above the weight threshold for the retrieval. (Otherwise this would be an unfair advantage for those training points with large prior weights.) The prior p is also taken into account in sum_w, so that the retrieval is correctly normalized. It is NOT taken into account in the number of hits, so if we find one training point with prior weight 10, the number of hits is 1, not 10. Some details of the inversion is controled by the structure M. Fields can be left out. Defined fields are use_all : If set to 1 all cases in database are weighted together. If 0, only cases with a weight > w_limit are considered. Default is true. norm_w : If set to false, the normalisation of weights by exp(-m/2) is NOT applied. This makes the weights to depend more strongly on m, but allows to be consistent with how the weights are normally defined. Limit and sum threshold must be adopted if this field is changed. Default is true. w_limit : Threshold level for what is considered as an OK databse hit. Used by *use_all* and controls D.n_hit. Default is 0.01. verb : If set to 1 some text output is generated. Default is false sum_w_thresh : Threshold for sum_w for which retrieval is performed. If the sum_w is smaller or equal to the treshold, a value of NaN is assigned for that retrieval case. Default value is 0 (to avoid division by zero). n_hit_thresh : Threshold for number of database hits (weight exceeding w_limit) for which retrieval is performed. If the number is smaller than the treshold, a value of NaN is assigned for that retrieval case. Default value is 0. MCI specific retrieval diagnostics is given in the structure array D, having the fields n_con : Number of considered cases n_hit : Number of cases with weight >= *w_limit* sum_w : Sum of weights (for considered cases) max_w : Maximum weight value To plot e.g. n_hit: plot([D(:).n_hit]) FORMAT [Xh,E,D,W] = mci(M,Xb,Yb,Se,Y[,p]) OUT Xh Retrieved states (that is, x-hat). E Retrieval error. D Diagnostics of the retrieval. See further above. W Weights for each database entry and each retrieval. IN M Settings for the MCI retrieval. See further above. Xb States of retrieval data base. Yb Measurements corresponding to Xb. Se Observation unvertainty covariance matrix. Y Measurements to be inverted. OPT p Prior weight of given training point. Default is 1.
0001 % MCI Retrieval by Monte Carlo integration 0002 % 0003 % A Bayesian inversion is performed by using a precalculated database. 0004 % The retrieval algoithm is descibed in e.g. 0005 % Evans et al., Submillimeter-wave cloud ice radiometer: Simulations 0006 % of retrieval algorithm performance, JGR, 107, D3, 2002. 0007 % 0008 % The match between the measurement and the database cases is reported 0009 % as exp(-chi2/2)/exp(-m/2), where m=size(Y,1) and chi2 is defined in 0010 % the function with same name. 0011 % 0012 % With the vector p you can give different prior weights to the 0013 % different training points. This is done AFTER the check on 0014 % which points are above the weight threshold for the 0015 % retrieval. (Otherwise this would be an unfair advantage for 0016 % those training points with large prior weights.) The prior p is 0017 % also taken into account in sum_w, so that the retrieval is 0018 % correctly normalized. It is NOT taken into account in the 0019 % number of hits, so if we find one training point with prior 0020 % weight 10, the number of hits is 1, not 10. 0021 % 0022 % Some details of the inversion is controled by the structure M. Fields 0023 % can be left out. Defined fields are 0024 % use_all : If set to 1 all cases in database are weighted together. 0025 % If 0, only cases with a weight > w_limit are considered. 0026 % Default is true. 0027 % norm_w : If set to false, the normalisation of weights by 0028 % exp(-m/2) is NOT applied. This makes the weights to 0029 % depend more strongly on m, but allows to be consistent 0030 % with how the weights are normally defined. Limit and 0031 % sum threshold must be adopted if this field is changed. 0032 % Default is true. 0033 % w_limit : Threshold level for what is considered as an OK 0034 % databse hit. Used by *use_all* and controls D.n_hit. 0035 % Default is 0.01. 0036 % verb : If set to 1 some text output is generated. 0037 % Default is false 0038 % sum_w_thresh : Threshold for sum_w for which retrieval is performed. 0039 % If the sum_w is smaller or equal to the treshold, a 0040 % value of NaN is assigned for that retrieval case. 0041 % Default value is 0 (to avoid division by zero). 0042 % n_hit_thresh : Threshold for number of database hits (weight 0043 % exceeding w_limit) for which retrieval is performed. 0044 % If the number is smaller than the treshold, a value 0045 % of NaN is assigned for that retrieval case. 0046 % Default value is 0. 0047 % 0048 % MCI specific retrieval diagnostics is given in the structure array D, 0049 % having the fields 0050 % n_con : Number of considered cases 0051 % n_hit : Number of cases with weight >= *w_limit* 0052 % sum_w : Sum of weights (for considered cases) 0053 % max_w : Maximum weight value 0054 % To plot e.g. n_hit: plot([D(:).n_hit]) 0055 % 0056 % FORMAT [Xh,E,D,W] = mci(M,Xb,Yb,Se,Y[,p]) 0057 % 0058 % OUT Xh Retrieved states (that is, x-hat). 0059 % E Retrieval error. 0060 % D Diagnostics of the retrieval. See further above. 0061 % W Weights for each database entry and each retrieval. 0062 % IN M Settings for the MCI retrieval. See further above. 0063 % Xb States of retrieval data base. 0064 % Yb Measurements corresponding to Xb. 0065 % Se Observation unvertainty covariance matrix. 0066 % Y Measurements to be inverted. 0067 % OPT p Prior weight of given training point. Default is 1. 0068 0069 % 2005-11-23 Created by Patrick Eriksson. 0070 % 2006-03-28 Extended to use prior weights and for more 0071 % diagnostics by Stefan Buehler (details see Atmlab Changelog). 0072 0073 function [Xh,E,D,W] = mci(M,Xb,Yb,Se,Y,p) 0074 0075 % Check input 0076 if ~isfield( M, 'use_all' ) 0077 M.use_all = 1; 0078 end 0079 if ~isfield( M, 'norm_w' ) 0080 M.norm_w = true; 0081 end 0082 if ~isfield( M, 'w_limit' ) 0083 M.w_limit = 0.01; 0084 end 0085 if ~isfield( M, 'verb' ) 0086 M.verb = 0; 0087 end 0088 if ~isfield( M, 'sum_w_thresh' ) 0089 M.sum_w_thresh = 0; 0090 end 0091 if ~isfield( M, 'n_hit_thresh' ) 0092 M.n_hit_thresh = 0; 0093 end 0094 0095 0096 % Some sizes 0097 nb = size(Xb,2); 0098 m = size(Y,1); 0099 ny = size(Y,2); 0100 lx = size(Xb,1); 0101 0102 0103 % If no prior weights p are given, we set all prior weights to 1: 0104 if nargin == 5 0105 p = ones(1,nb); 0106 else 0107 % Check that length of p matches Xb: 0108 if length(p) ~= nb 0109 error('Dimension of p does not match Xb') 0110 end 0111 end 0112 0113 0114 % Check that dimensions of Xb and Yb are consistent: 0115 if size(Yb,2) ~= nb 0116 error('Dimensions of Xb and Yb do not match') 0117 end 0118 0119 0120 % Seems that nargout can get corrupted inside parfor! 0121 % Use a local variable to be safe. 0122 nout = nargout; 0123 0124 0125 0126 % Init output arguments 0127 % 0128 Xh = zeros( lx, ny ); 0129 % 0130 if nout > 1 0131 E = zeros( lx, ny ); 0132 end 0133 if nout > 3 0134 W = zeros( nb, ny ); 0135 end 0136 % 0137 D(ny) = struct( 'sum_w',NaN, 'max_w',NaN, 'n_con',NaN, 'n_hit',NaN ); 0138 0139 0140 for i = 1 : ny 0141 0142 w = exp( -0.5*chi2( repmat(Y(:,i),1,nb)-Yb, Se ) ); 0143 0144 if M.norm_w 0145 w = w / exp(-m*(1-2/9/m)^3/2); 0146 end 0147 0148 ind_hit = find( w >= M.w_limit ); 0149 0150 if M.use_all 0151 ind = 1:nb; 0152 else 0153 ind = ind_hit; 0154 end 0155 0156 % Catch the case that ind is empty 0157 if length(ind) == 0 0158 D(i).sum_w = 0; 0159 D(i).max_w = 0; 0160 D(i).n_con = 0; 0161 D(i).n_hit = 0; 0162 else 0163 D(i).sum_w = sum( p(ind)*w(ind) ); % Weight with prior. 0164 D(i).max_w = max( w(ind) ); 0165 D(i).n_con = length(ind); 0166 D(i).n_hit = length(ind_hit); 0167 end 0168 0169 % Prepare the weights for the selected training cases for the 0170 % retrieval. We also multiply in the prior weights here. 0171 Wi = repmat(p(ind).*w(ind)',lx,1); 0172 0173 % The retrieval is only calculated if the weight sum is above the 0174 % treshold. The default value for the threshold is zero, to avoid 0175 % division by zero. 0176 % 0177 % Likewise, or alternatively, the retrieval is only calculated if 0178 % the number of hits is larger than n_hit_thresh. 0179 if ( D(i).sum_w > M.sum_w_thresh ) && ( D(i).n_hit >= M.n_hit_thresh ) 0180 0181 Xh(:,i) = sum( Xb(:,ind).*Wi, 2 ) ./ D(i).sum_w; 0182 0183 if nout > 1 0184 E(:,i) = sqrt( sum( ... 0185 (Xb(:,ind)-repmat(Xh(:,i),1,D(i).n_con)).^2.*Wi, 2 ) ./ D(i).sum_w ); 0186 if nout > 3 0187 W(:,i) = w; 0188 end 0189 end 0190 else 0191 Xh(:,i) = NaN; 0192 0193 if nout > 1 0194 E(:,i) = NaN; 0195 if nout > 3 0196 W(:,i) = w; 0197 end 0198 end 0199 end 0200 0201 if M.verb 0202 disp(sprintf(['Measurement %d of %d: ' ... 0203 'sum_w = %10.0f, ' ... 0204 'max_w =%10.5f, ' ... 0205 'w_ratio =%10.5f, ' ... 0206 'n_con = %d.'],... 0207 i, ny, ... 0208 D(i).sum_w, D(i).max_w, ... 0209 D(i).max_w / D(i).sum_w, ... 0210 D(i).n_con )); 0211 end 0212 0213 end 0214