Home > atmlab > math > binning_fast.m

binning_fast

PURPOSE ^

% BINNING_FAST bins ungridded data onto a grid

SYNOPSIS ^

function [Z,Y,X] = binning_fast(in,lonspace)

DESCRIPTION ^

% BINNING_FAST bins ungridded data onto a grid

 Purpose:
 Bins ungridded data to a cell grid using d = in.gridsize. Each bin will contain a
 vector of values for that bin.

 The structure elements lon,lat,data (or X,Y,Z) must be the same size, and the function
 will preserve the variable class.

 IN
   in       struct containing:
               MANDITORY:
                   Z/data         [data matrix]     of any class. size(data,1)
                                                  must be the same length
                                                  as lat lon, but data may
                                                  have 2 dimensions or more, e.g.
                                                  lvls, seasons, etc.

                   X/lon          [lon matrix]      of any class
                   Y/lat          [lat matrix]      of any class

                 either:
                   gridsize     [%f] or [%f %f]   scalar for square
                                                       scaling. 2 values
                                                       if you want to have
                                                       separate lat,lon
                                                       resolutions



                             or
                  newY/newlat        [lat vector]      bin into new lat grid
                  newX/newlon        [lon vector]      bin into new lon grid

              OPTIONAL
                  region        [blcorner,trcorner] (lat1,lon1,lat2,lon2)



          If region is given, the function will only bin the data within the
          given region. Particularily useful if you have high resolution data
          in a region. The default grid domain is assumed to be global
          otherwise.

  lonspace    %f     Optional for lat/lon data to indicate the desired longitude format
                     lonspace = 360 for lon=0:360,
                     lonspace = 180 for lon=-180:180
                     lonspace = 0 do nothing (default)

           If lonspace is not given, the function automatically picks a format
           depending on if there are any longitude values larger than 180
           (lon>180). This may be important if your input data does not cover both
           hemispheres.

  counts   %f (e.g. true=1)
           Optional logical for if you only want the number of counts per bin, e.g. to see how
           often a certain combination of X-Y happens (numel per box).

 Out:
                   1) Z  ( cell (size(Y,X)) or counts) %data
                   2) Y  ( vector )          %lat
                   3) X  ( vector )          %lon

 USAGE: [Z,Y,X] = binning_fast(struct('data',data,'lat',lat,'lon',lon...
                             'newlat',[-80+.5:80-.5],'newlon',[-180+.5:180-.5]))

        or
         [Z,Y,X] = binning_fast(struct('Z',data,'X',x,'Y',y,'newX',[1:10],'newX',[1:10]))

 NOTE:
       1) Including newlon and newlat causes the function to ignore: in.gridsize
                                                                    in.region
                                                                    lonspace

       2) A useful function to get statistics (e.g numel, mean, sum, etc) on
       the binned data is binned_statistics.

 See also: bin, bin_nd, binned_statistics

 Created by Oliver Lemke and Salomon Eliasson
 $Id$

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SUBFUNCTIONS ^

DOWNLOAD ^

binning_fast.m

SOURCE CODE ^

0001 function [Z,Y,X] = binning_fast(in,lonspace)
0002 %% BINNING_FAST bins ungridded data onto a grid
0003 %
0004 % Purpose:
0005 % Bins ungridded data to a cell grid using d = in.gridsize. Each bin will contain a
0006 % vector of values for that bin.
0007 %
0008 % The structure elements lon,lat,data (or X,Y,Z) must be the same size, and the function
0009 % will preserve the variable class.
0010 %
0011 % IN
0012 %   in       struct containing:
0013 %               MANDITORY:
0014 %                   Z/data         [data matrix]     of any class. size(data,1)
0015 %                                                  must be the same length
0016 %                                                  as lat lon, but data may
0017 %                                                  have 2 dimensions or more, e.g.
0018 %                                                  lvls, seasons, etc.
0019 %
0020 %                   X/lon          [lon matrix]      of any class
0021 %                   Y/lat          [lat matrix]      of any class
0022 %
0023 %                 either:
0024 %                   gridsize     [%f] or [%f %f]   scalar for square
0025 %                                                       scaling. 2 values
0026 %                                                       if you want to have
0027 %                                                       separate lat,lon
0028 %                                                       resolutions
0029 %
0030 %
0031 %
0032 %                             or
0033 %                  newY/newlat        [lat vector]      bin into new lat grid
0034 %                  newX/newlon        [lon vector]      bin into new lon grid
0035 %
0036 %              OPTIONAL
0037 %                  region        [blcorner,trcorner] (lat1,lon1,lat2,lon2)
0038 %
0039 %
0040 %
0041 %          If region is given, the function will only bin the data within the
0042 %          given region. Particularily useful if you have high resolution data
0043 %          in a region. The default grid domain is assumed to be global
0044 %          otherwise.
0045 %
0046 %  lonspace    %f     Optional for lat/lon data to indicate the desired longitude format
0047 %                     lonspace = 360 for lon=0:360,
0048 %                     lonspace = 180 for lon=-180:180
0049 %                     lonspace = 0 do nothing (default)
0050 %
0051 %           If lonspace is not given, the function automatically picks a format
0052 %           depending on if there are any longitude values larger than 180
0053 %           (lon>180). This may be important if your input data does not cover both
0054 %           hemispheres.
0055 %
0056 %  counts   %f (e.g. true=1)
0057 %           Optional logical for if you only want the number of counts per bin, e.g. to see how
0058 %           often a certain combination of X-Y happens (numel per box).
0059 %
0060 % Out:
0061 %                   1) Z  ( cell (size(Y,X)) or counts) %data
0062 %                   2) Y  ( vector )          %lat
0063 %                   3) X  ( vector )          %lon
0064 %
0065 % USAGE: [Z,Y,X] = binning_fast(struct('data',data,'lat',lat,'lon',lon...
0066 %                             'newlat',[-80+.5:80-.5],'newlon',[-180+.5:180-.5]))
0067 %
0068 %        or
0069 %         [Z,Y,X] = binning_fast(struct('Z',data,'X',x,'Y',y,'newX',[1:10],'newX',[1:10]))
0070 %
0071 % NOTE:
0072 %       1) Including newlon and newlat causes the function to ignore: in.gridsize
0073 %                                                                    in.region
0074 %                                                                    lonspace
0075 %
0076 %       2) A useful function to get statistics (e.g numel, mean, sum, etc) on
0077 %       the binned data is binned_statistics.
0078 %
0079 % See also: bin, bin_nd, binned_statistics
0080 %
0081 % Created by Oliver Lemke and Salomon Eliasson
0082 % $Id$
0083 
0084 [in,lonlat] = check_input(in);
0085 
0086 if nargin==1, lonspace=0;end
0087 [in,Y,X] = outputlatlons(in,lonspace,lonlat);
0088 
0089 if ~in.counts
0090     % data might be 2D or more e.g. levels,etc
0091     sz = size(in.Z);
0092     in.Z = in.Z(:,:); %collapse data
0093 end
0094 
0095 in = regionchop(in);
0096 
0097 % setup new GRIDSIZE
0098 d(1) = mean(diff(Y));
0099 d(2) = mean(diff(X));
0100 
0101 %% assign index value
0102 Yindex = floor( (in.Y - (min(Y)-d(1)/2)) /d(1) ) + 1;
0103 Xindex = floor( (in.X - (min(X)-d(2)/2)) /d(2) ) + 1;
0104 
0105 % make sure Xindex and Yindex don't exceed maximum
0106 Yindex(Yindex>length(Y))=length(Y);
0107 Xindex(Xindex>length(X))=length(X);
0108 
0109 
0110 %% GET COUNTS
0111 Z = cell(length(Y), length(X));
0112 counts = zeros (size(Z), 'uint32');
0113 for i = 1:length(in.X)
0114     iY = Yindex(i);
0115     iX = Xindex(i);
0116     counts(iY, iX) = counts(iY, iX) + 1;
0117 end
0118 
0119 if in.counts
0120     % If you only want the counts you can leave now
0121     Z = counts;
0122     return
0123 end
0124 
0125 indata = in.Z;
0126 
0127 %% PREALLOCATE DATA
0128 for i = 1:size(counts,1)
0129     for j = 1:size(counts,2)
0130         Z{i, j} = zeros ([counts(i, j), sz(2:end)], class(indata));
0131     end
0132 end
0133 
0134 %% LOOP DATA
0135 for i = 1:size(indata,1)
0136     iY = Yindex(i);
0137     iX = Xindex(i);
0138     Z{iY, iX}(counts(iY, iX),:) = indata(i,:);
0139     counts(iY, iX) = counts(iY, iX) - 1;
0140 end
0141 end
0142 %% Subfunctions
0143 % ||
0144 % VV
0145 
0146 function [in,lonlat] = check_input(in)
0147 
0148 lonlat = all(isfield(in,{'lat','lon'}));
0149 if ~isfield(in,'counts') % if you only want the counts
0150     in.counts = 0;
0151 end
0152 
0153 if lonlat
0154     in.X = in.lon;
0155     in.Y = in.lat;
0156     in = rmfield(in,{'lon','lat'});
0157     if ~in.counts
0158         in.Z = in.data;
0159         in = rmfield(in,'data');
0160     end
0161     if isfield(in,'newlon')
0162         in.newX = in.newlon;
0163         in.newY = in.newlat;
0164         in = rmfield(in,{'newlon','newlat'});
0165     end
0166 end
0167 
0168 errId = ['atmlab:' mfilename ':badInput'];
0169 assert(all(isfield(in,{'Y','X','Z'})) || all(isfield(in,{'Y','X','counts'})),...
0170     errId,'''lat'',''lon'',''data'' (or ''X'',''Y'',''Z'') are required input fields')
0171 
0172 assert(isfield(in,'gridsize') || all(isfield(in,{'newX','newY'})),...
0173     errId,'Either field ''gridsize'' or fields ''newY/newlat'' and ''newX/newlon'' are required' )
0174 
0175 if ~in.counts
0176     if isequal(numel(in.Z),numel(in.Y),numel(in.X))
0177         in.Z = in.Z(:); in.Y = in.Y(:); in.X = in.X(:);
0178     end
0179 
0180     assert(isequal(size(in.X,1),size(in.Y,1),size(in.Z,1)),...
0181     errId, ['data,Y,X must all have the same first dimension. Found: ' ...
0182     'data %d, Y %d, X %d'], size(in.Z, 1), size(in.Y, 1), size(in.X, 1));
0183 else
0184     if isequal(numel(in.Y),numel(in.X))
0185         in.Y = in.Y(:); in.X = in.X(:);
0186     end
0187     assert(isequal(size(in.X,1),size(in.Y,1)),...
0188     errId, ['data,Y,X must all have the same first dimension. Found: ' ...
0189     ' Y %d, X %d'], size(in.Y,1), size(in.X, 1));
0190 end
0191 
0192 
0193 
0194 if lonlat
0195     assert(~(any([in.newX(:)>360;in.newX(:)<-180;in.newY(:)>90;in.newY(:)<-90])),...
0196         errId,'lat/lon values are unphysical')
0197     
0198     if ~(max(diff(in.newY))-min(diff(in.newY))<1e-4 && ...
0199             max(diff(in.newX))-min(diff(in.newX))<1e-4)
0200         warning(['atmlab:' mfilename, ':WonkyNewgrid'],[...
0201             'vectors in fields: ''newY'' and ''newX'' are not monotonously spaced. ',...
0202             'The output grid will be monotonously spaced, ',...
0203             'based on mean(diff(newY)) and  mean(diff(newX))'])
0204     end
0205     if isfield(in,'gridsize')
0206         warning(errId,'fields ''newX'' and ''newY'' cancel ''gridsize''')
0207     end
0208     if isfield(in,'region')
0209         warning(errId,'Region specified by newY and newX has presedence over in.region\n')
0210     end
0211 end
0212 
0213 %internally span a region, so that values outside this region can be kicked out
0214 % [blcorner,trcorner] (lt1,ln1,lt2,ln2). This should be safe for both
0215 % centered and non-centered grids
0216 dlt = mean(diff(in.newY))/2;  dln = mean(diff(in.newX))/2;
0217 in.region = [min(in.newY)-dlt,min(in.newX)-dln,max(in.newY)+dlt,max(in.newX)+dln];
0218 
0219 end
0220 function in = regionchop(in)
0221 % Get rid of all data outside of region
0222 
0223 lt1 = in.Y > in.region(1);
0224 ln1 = in.X > in.region(2);
0225 lt2 = in.Y < in.region(3);
0226 ln2 = in.X < in.region(4);
0227 
0228 index = ln1 & ln2 & lt1 & lt2;
0229 
0230 in.Y = in.Y(index);
0231 in.X = in.X(index);
0232 
0233 if ~in.counts
0234     in.Z = in.Z(index,:);
0235 end
0236 
0237 end
0238 function [in,Y,X] = outputlatlons(in,lonspace,lonlat)
0239 %% fix the latlons
0240 % Sets up lat lon and adjusts in.X to the requested (if requested)
0241 % longitude regime (0:360 or -180:180)
0242 %
0243 errId = ['atmlab:' mfilename ':badInput'];
0244 
0245 if lonlat
0246     testmore = in.X>180; %logical for any lons > 180
0247     testless = in.X<0; %logical for any lons < 0
0248     
0249     if lonspace==360
0250         in.X(testless) = in.X(testless)+360;
0251         in.f360=true;
0252     elseif lonspace==180
0253         in.X(testmore) = in.X(testmore)-360;
0254         in.f360=false;
0255     elseif lonspace==0
0256         in.f360 = any(testmore(:));
0257     else
0258         error(errId,...
0259             'lonspace must be 180, 360 or 0 (default (do nothing))')
0260     end
0261     
0262     if in.f360 && ~isfield(in,'region')
0263         in.region = [-90 0 90 360];
0264     elseif ~in.f360 && ~isfield(in,'region')
0265         in.region = [-90 -180 90 180];
0266     end
0267 end
0268 
0269 if any(isfield(in,{'newX','newX'}))
0270     Y = in.newY;
0271     X = in.newX;
0272     if lonspace~=0
0273         warning(errId,'fields ''newY'' and ''newX'' make lonspace irrelevant')
0274     end
0275     return
0276 end
0277 
0278 d = in.gridsize;
0279 assert(numel(d)<=2,errId,'''gridsize'' should contain 1 or 2 values')
0280 
0281 if isscalar(d), d = [d,d]; end
0282 assert(isnumeric(in.region),errId,'in.region must be a numeric vector of coordinates')
0283 
0284 Y = in.region(1)+0.5*d(1):d(1):in.region(3)-0.5*d(1);
0285 X = in.region(2)+0.5*d(2):d(2):in.region(4)-0.5*d(2);
0286 
0287 end

Generated on Mon 15-Sep-2014 13:31:28 by m2html © 2005