CollocatedDataset

PURPOSE ^

Defines a collocated dataset between two (different) datasets.

SYNOPSIS ^

classdef CollocatedDataset < HomemadeDataset
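A minimal usage sketch (illustrative only: dataset names such as D.mhs and
D.cpr are assumed to be predefined via define_datasets, and the exact call
signature of collocate_and_store_date should be checked in its own help):

```matlab
D = datasets;
mhscpr = CollocatedDataset(D.mhs, D.cpr, ...
    'distance', 10, ...   % maximum collocation distance [km]
    'interval', 300, ...  % maximum time difference [s]
    'name', 'NarrowCPRMHSCollocs');
% collocate one day for a given satellite pairing and store the result
mhscpr.collocate_and_store_date([2010 9 8], 'noaa18', '');
```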

SOURCE CODE ^

0001 classdef CollocatedDataset < HomemadeDataset
0002     % Defines a collocated dataset, between two (different) datasets.
0003     %
0004     % A CollocatedDataset consists of two <a href="matlab:help SatDataset">SatDataset</a>s.
0005     % To start collocating, first create two <a href="matlab:help SatDataset">SatDataset</a>
0006     % objects and define all required properties there. Predefined datasets
0007     % are defined in <a href="matlab:help define_datasets">define_datasets</a>. Then <a href="matlab:help CollocatedDataset/CollocatedDataset">create</a> a CollocatedDataset.
0008     % To list existing datasets, use <a href="matlab:help datasets">datasets</a>.
0009     % Remember that CollocatedDataset inherits from
0010     % <a href="matlab:help HomemadeDataset">HomemadeDataset</a>,
0011     % so it has all capabilities that HomemadeDataset has. You may also
0012     % want to look at the Collocation User's Guide.
0013     %
0014     % CollocatedDataset Properties:
0015     %
0016     %   primary -   primary dataset
0017     %   secondary - secondary dataset
0018     %   distance -  maximum collocation distance (km)
0019     %   interval -  maximum collocation time (seconds)
0020     %   members -   structure containing full info on how data are stored
0021     %               (as well as all <a href="matlab:help SatDataset">SatDataset</a> and <a href="matlab:help HomemadeDataset">HomemadeDataset</a> properties)
0022     %   pcd -       if set, cache <a href="matlab:help CollocatedDataset/read">read</a> results to disk
0023     %   (see <a href="matlab:properties CollocatedDataset">properties</a> for a complete listing)
0024     %   gridsize -  parameter for calculating collocations. Set small if
0025     %               you expect many collocations, large if you expect few.
0026     %
0027     % CollocatedDataset methods:
0028     %
0029     %  Class instantiation:
0030     %
0031     %   CollocatedDataset -         create a CollocatedDataset object
0032     %
0033     %  Finding and storing collocations:
0034     %
0035     %   overlap_granule -           Find overlapping granules from filenames
0036     %   collocate_granule -         Collocate primary with all secondaries
0037     %   collocate_date -            Collocate all granules in day
0038     %   collocate_and_store_date -  Collocate entire day and store results
0039     %   collocate_and_store_date_range - Collocate many days and store
0040     %
0041     %  Reading collocations:
0042     %
0043     %   read -                      Read collocations for date range
0044     %   list_fields -               Find what fields can be read
0045     %
0046     %  (as well as all <a href="matlab:help SatDataset">SatDataset</a> and <a href="matlab:help HomemadeDataset">HomemadeDataset</a> methods)
0047     %
0048     % For a full overview of methods, see <a href="matlab:doc CollocatedDataset">doc CollocatedDataset</a>.
0049     %
0050     % Example:
0051     %
0052     %   >> D = datasets;
0053     %   >> mhscpr = CollocatedDataset(D.mhs, D.cpr, 'distance', 10, 'interval', 300, 'name', 'NarrowCPRMHSCollocs');
0054     %
0055     % Hint: if the collocations are slow, try tweaking
0056     % <a href="matlab:help CollocatedDataset/gridsize">gridsize</a>.
0057     %
0058     % See also: SatDataset (grandparent), HomemadeDataset (superclass), AssociatedDataset,
0059     %           FieldCopier, Collapser.
0060     %
0061     %
0062     % Don't forget the Collocation User's Guide.
0063     %
0064     % $Id: CollocatedDataset.m 8750 2013-12-07 18:14:32Z seliasson $
0065     properties (Transient)
0066         % Primary dataset.
0067         %
0068         % This is usually a <a href="matlab:help SatDataset">SatDataset</a>.
0069         %
0070         % When collocating, there are some considerations when choosing
0071         % the primary vs. the secondary dataset:
0072         %
0073         %   - Choose the dataset with the largest footprint as the primary.
0074         %     This becomes particularly important when the secondary is
0075         %     averaged over the primary using <a href="matlab:help Collapser">Collapser</a> or a similar class.
0076         %
0077         %   - Choose the dataset with the largest granule as the primary.
0078         %     As long as there is no caching implemented in <a href="matlab:help SatDataset/read_granule">SatDataset.read_granule</a>,
0079         %     this is much faster than the opposite.
0080         primary;
0081         
0082         % Secondary dataset
0083         %
0084         % For considerations, see help for <a href="matlab:help CollocatedDataset/primary">primary</a>.
0085         secondary;
0086         
0087         % Maximum distance in km
0088         %
0089         % The maximum distance that is considered a collocation. Distance
0090         % is determined between footprint geolocations, in lat/lon.
0091         distance;
0092         
0093         % Maximum time-interval in seconds
0094         %
0095         % The maximum time interval that is considered a collocation.
0096         interval;
0097         
0098         % Grid size (in degrees) to use inside the collocation algorithm
0099         %
0100         % The size of the grid that is used inside the collocation
0101         % algorithm. The method <a href="matlab:help CollocatedDataset/collocate">collocate</a> works by gridding the
0102         % measurements on an equirectangular lat-lon grid (see the algorithm
0103         % description in John et al. (2012)), using <a href="matlab:help binning_fast">binning_fast</a>.
0104         % The speed of binning_fast increases monotonically with the
0105         % gridsize and does not depend on
0106         % the number of collocations. In the next step, all distances
0107         % between pairs within a grid-cell are calculated. This step
0108         % depends strongly on the number of collocations. Therefore, to
0109         % optimise the speed of calculating collocations, set the gridsize
0110         % small if you expect many collocations and large if you expect
0111         % very few.
0112         %
0113         % It has a default value of 1 (degree).
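        %
        % For instance (illustrative values only; the best choice depends
        % on the density of the data):
        %
        %   >> mhscpr.gridsize = 0.2;  % many expected collocations
        %   >> mhscpr.gridsize = 5;    % few expected collocations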
0114         gridsize = 1;
0115         
0116         % Logfile for describing what collocations were performed
0117         %
0118         % When collocations are read and written to disk, the toolkit will
0119         % write a short report to this file reporting what collocations
0120         % were located etc. This is used only by
0121         % <a href="matlab:help CollocatedDataset/collocate_and_store_date_range">collocate_and_store_date_range</a>.
0122         % The logfile is put directly in self.basedir and appended to on
0123         % every occasion.
0124         logfile = 'collocations_log';
0125         
0126         % Marker to separate different logfile entries
0127         marker = [repmat('-', [1 72]) '\n'];
0128 
0129         % Collocation method: classical or experimental
0130         %
0131         % If things go wrong in 'collocate', set to 'classical'.
0132         binning = 'experimental'
0133     end
0134     
0135     % read-only props (may be alterable with methods)
0136     properties (Transient, SetAccess = protected)
0137         % Cell array of <a href="matlab:help AssociatedDataset">AssociatedDataset</a>.
0138         %
0139         % This automatically-filled cell-array contains all datasets that
0140         % can be used in association with this <a href="matlab:help CollocatedDataset">CollocatedDataset</a>.
0141         % This can be e.g. <a href="matlab:help FieldCopier">FieldCopier</a>, <a href="matlab:help Collapser">Collapser</a>, etc.
0142         associated = {};
0143     end
0144     
0145     properties (Transient, Constant)
0146         
0147         % Structure describing layout of NetCDF files storing collocations
0148         %
0149         % This read-only structure describes exactly the layout of the
0150         % NetCDF files in which the collocations are stored.
0151         %
0152         % Valid types are at:
0153         % http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c/NetCDF_002d3-Variable-Types.html#NetCDF_002d3-Variable-Types
0154         members_const = struct(...
0155             'START1', struct(...
0156                 'type', 'int', ... % won't work after 2038-01-19 03:14:07 UTC
0157                 'atts', struct(...
0158                     'long_name', 'primary granule start time', ...
0159                     'units', 'seconds since 1970-01-01 00:00:00')), ...
0160             'LINE1', struct(...
0161                 'type', 'int', ...
0162                 'atts', struct(...
0163                     'long_name', 'primary scanline number')), ...
0164             'POS1', struct(...
0165                 'type', 'int', ...
0166                 'atts', struct(...
0167                     'long_name', 'primary scanline position')), ...
0168             'TIME1', struct(...
0169                 'type', 'int', ...
0170                 'atts', struct(...
0171                     'long_name', 'primary measurement time', ...
0172                     'units', 'seconds since 1970-01-01 00:00:00')), ...
0173             'LAT1', struct(...
0174                 'type', 'double', ...
0175                 'atts', struct(...
0176                     'long_name', 'primary latitude', ...
0177                     'units', 'degrees_north', ...
0178                     'valid_range', [-90 90])), ...
0179             'LON1', struct(...
0180                 'type', 'double', ...
0181                 'atts', struct(...
0182                     'long_name', 'primary longitude', ...
0183                     'units', 'degrees_east', ...
0184                     'valid_range', [-180 180])), ...
0185             'START2', struct(...
0186                 'type', 'int', ...
0187                 'atts', struct(...
0188                     'long_name', 'secondary granule start time', ...
0189                     'units', 'seconds since 1970-01-01 00:00:00')), ...
0190             'LINE2', struct(...
0191                 'type', 'int', ...
0192                 'atts', struct(...
0193                     'long_name', 'secondary scanline number')), ...
0194             'POS2', struct(...
0195                 'type', 'int', ...
0196                 'atts', struct(...
0197                     'long_name', 'secondary scanline position')), ...
0198             'TIME2', struct(...
0199                 'type', 'int', ...
0200                 'atts', struct(...
0201                     'long_name', 'secondary measurement time', ...
0202                     'units', 'seconds since 1970-01-01 00:00:00')), ...
0203             'LAT2', struct(...
0204                 'type', 'double', ...
0205                 'atts', struct(...
0206                     'long_name', 'secondary latitude', ...
0207                     'units', 'degrees_north', ...
0208                     'valid_range', [-90 90])), ...
0209             'LON2', struct(...
0210                 'type', 'double', ...
0211                 'atts', struct(...
0212                     'long_name', 'secondary longitude', ...
0213                     'units', 'degrees_east', ...
0214                     'valid_range', [-180 180])), ...
0215             'DIST', struct(...
0216                 'type', 'float', ...
0217                 'atts', struct(...
0218                     'long_name', 'Distance secondary to primary', ...
0219                     'units', 'km')), ...
0220             'INT', struct(...
0221                 'type', 'float', ...
0222                 'atts', struct(...
0223                     'long_name', 'Time secondary minus time primary', ...
0224                     'units', 'seconds')), ...
0225             'INDEX', struct(...
0226                 'type', 'int', ...
0227                 'atts', struct(...
0228                     'long_name', 'Collocation index number')));
0229         
0230     end
0231     
0232     properties (Transient, GetAccess = private, SetAccess = private)
0233         log;
0234         success = false;
0235     end
0236     
0237     methods
0238         %% constructor
0239 
0240         function self = CollocatedDataset(prim, sec, varargin)
0241             % Constructor for CollocatedDataset
0242             %
0243             % FORMAT
0244             %
0245             %   cd = CollocatedDataset(prim, sec, 'prop', 'val', ...)
0246             %
0247             % IN
0248             %
0249             %   prim    SatDataset  primary
0250             %   sec     SatDataset  secondary
0251             %   ... (key - value) pairs as for <a href="matlab:help SatDataset">SatDataset</a>
0252             %   for a CollocatedDataset, of particular importance are
0253             %   'distance' and 'interval'.
0254             %
0255             % OUT
0256             %
0257             %   cd      CollocatedDataset object
0258             super_args = [varargin, {'primary', prim, 'secondary', sec}];
0259             self = self@HomemadeDataset(super_args{:}); % call parent constructor
0260             % check that both are SatDataset
0261             for x = {prim, sec}
0262                 assert(isa(x{1}, 'SatDataset'), ['atmlab:' mfilename ':TypeError'], ...
0263                     'Collocations must be between 2 x SatDataset, got %s instead', ...
0264                      class(x{1}));
0265             end
0266             
0267             % set distance and interval to global default if not given
0268             for f = {'distance', 'interval'}
0269                 if isempty(self.(f{1}))
0270                     warning(['atmlab:' mfilename ':notgiven'], ...
0271                         'Field ''%s'' not given for %s, setting to global default %d', ...
0272                         f{1}, self.name, colloc_config(f{1}));
0273                     self.(f{1}) = colloc_config(f{1});
0274                 end
0275             end
0276 
0277             % add both to corresponding datasets
0278             for x = {prim, sec}
0279                 x{1}.add_collocated_dataset(self);
0280             end
0281             
0282             % make 'cols'-structure for internal data specification
0283              
0284             self.members = self.members_const;
0285             fnms = fieldnames(self.members);
0286             nfields = length(fnms);
0287             fv = 1:nfields;
0288             self.cols = cell2struct(num2cell(fv), fnms.', 2);
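            % For example, with members {'START1', 'LINE1', 'POS1', ...} this
            % yields cols = struct('START1', 1, 'LINE1', 2, 'POS1', 3, ...),
            % i.e. each member name is mapped to its column index in the
            % collocation result matrix.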
0289             
0290             if isempty(self.granule_duration)
0291                 self.granule_duration = 86400;
0292             end
0293         end
0294                 
0295         %% implement new methods
0296         
0297         function secondary_granules = overlap_granule(self, date1, spec2)
0298             % secondary granules that, judging from filenames, might overlap
0299             %
0300             % Based on the filename and on the <a href="matlab:help SatDataset/granule_duration">granule_duration</a>
0301             % properties, guess what secondary granules need to be read in
0302             % order to fully cover the period indicated by the first
0303             % granule. This is the first step in performing collocations
0304             % for a particular granule.
0305             %
0306             % This is a low-level function usually not called directly
0307             % by the user.
0308             %
0309             % Assumes granules contain data for at most one day.
0310             % This implies a limitation of the entire collocation toolkit.
0311             %
0312             % FORMAT
0313             %
0314             %   grans = cd.overlap_granule(datevec1, spec2);
0315             %
0316             % IN
0317             %
0318             %   datevec1    datevec     Starting time for primary granule
0319             %   spec2       various     Satellite etc. for secondary granule
0320             %
0321             % OUT
0322             %
0323             %   grans       matrix      describing secondary start-times
0324             %
0325             % EXAMPLE
0326             %
0327             %   >> sec = mhscpr.overlap_granule([2010 9 8 8 8], '');
0328             
0329             
0330             %% find starting time in unixsecs
0331             
0332             primary_unixsecs = self.primary.get_starttime(date1);
0333             %granule_end = primary_unixsecs + duration;
0334             
0335             %% find all granules yesterday/today/tomorrow
0336             
0337             today_num = datenum(date1(1), date1(2), date1(3));
0338             yesterday = datevec(today_num - 1);
0339             tomorrow = datevec(today_num + 1);
0340             
0341             threedays = [...
0342                 self.secondary.find_granules_by_date(yesterday, spec2, false); ...
0343                 self.secondary.find_granules_by_date(date1, spec2, false); ...
0344                 self.secondary.find_granules_by_date(tomorrow, spec2, false)];
0345             threedays = sortrows(threedays);
0346             % get corresponding starting times
0347             secondary_unixsecs = zeros(1, size(threedays, 1));
0348             for i = 1:size(threedays, 1)
0349                 st = self.secondary.get_starttime(threedays(i, :));
0350                 secondary_unixsecs(i) = st;
0351             end
0352 %             granules_unixsecs = date2unixsecs(...
0353 %                 threedays(:, 1), threedays(:, 2), threedays(:, 3), threedays(:, 4), threedays(:, 5));
0354 
0355 %            primstart_before_secstart = primary_unixsecs < secondary_unixsecs;
0356             
0357             primstart_before_secend = primary_unixsecs < secondary_unixsecs + self.secondary.granule_duration + self.interval;
0358             
0359             primend_after_secstart = primary_unixsecs + self.primary.granule_duration + self.interval > secondary_unixsecs;
0360             
0361             okej = primstart_before_secend & primend_after_secstart;
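            % The two conditions above form a standard interval-overlap test,
            % widened by self.interval: a secondary granule is kept iff
            %   prim_start < sec_start + dur2 + interval   and
            %   prim_start + dur1 + interval > sec_start.
            % E.g. (hypothetical numbers): with prim_start = 1000, dur1 = 6000
            % and interval = 300, any secondary with start time in
            % (700 - dur2, 7300) is kept.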
0362             
0363 %{
0364             
0365             
0366             % first granule
0367 
0368             % last granule starting after the reference granule
0369             % those are all 'too late'
0370             
0371             toolate = primary_unixsecs - self.interval > secondary_unixsecs;
0372             b = find(toolate, 1, 'last');
0373             
0374             % first granule ending before the reference granule
0375             % those are all 'too early'
0376             
0377             tooearly = granule_end + self.interval < secondary_unixsecs;
0378             e = find(tooearly, 1, 'first');
0379             
0380             % if all match, none are too early or too late
0381             if isempty(b) && isempty(e)
0382                 okej = (primary_unixsecs < secondary_unixsecs) & (granule_end > secondary_unixsecs);
0383             elseif ~isempty(b) && ~isempty(e)
0384                 % anything not too early, not too late, is good
0385                 okej = b:e;
0386             elseif (all(tooearly) && none(toolate)) || ...
0387                    (all(toolate) && none(tooearly))
0388                 % all too early, none too late
0389                 % all too late, none too early
0390                 % this means all are off
0391                 okej = [];
0392             else
0393                 % this means:
0394                 % if b is empty, then 1:e
0395                 % if e is empty, then b:end
0396                 okej = max([~isempty(b)+1 b]):max([isempty(e)*size(threedays, 1) e]);
0397                 
0398                 %X={b,e};
0399                 %okej = X{[~isempty(b) ~isempty(e)]};
0400             end   
0401 %}
0402             secondary_granules = threedays(okej, :);
0403         end
0404 
0405         function [result, additional_results, also] = collocate_granule(self, date1, spec1, spec2, varargin)           
0406             % collocate_granule Collocate granules from sat/sensor pairs
0407             %
0408             % This method collocates a granule from the primary with all
0409             % overlapping granules from the secondary satellite, and
0410             % returns the results.
0411             %
0412             % FORMAT
0413             %
0414             %   [result, additional_results, also] = ...
0415             %       cd.collocate_granule(date1, spec1, spec2[, additionals[, oneday]])
0416             %
0417             % IN
0418             %
0419             %   date1   datevec     starting date/time for primary
0420             %   spec1   various     specification, e.g. satellite or pair
0421             %                       of satellites, for the primary. May be
0422             %                       empty if not applicable.
0423             %   spec2   various     specification for the secondary
0424             %   addis   cell-array  optional; cell array of objects from
0425             %                       classes implementing <a href="matlab:help AdditionalDataset">AdditionalDataset</a>,
0426             %                       such as <a href="matlab:help FieldCopier">FieldCopier</a>
0427             %                       or <a href="matlab:help Collapser">Collapser</a>.
0428             %                       Defaults to empty cell-array {}.
0429             %   oneday  logical     optional. If true, restrict collocations
0430             %                       to the indicated day. If false, all
0431             %                       collocations from all granules covering
0432             %                       this day will be considered. Defaults
0433             %                       to false; when looping through days,
0434             %                       set it to true.
0435             %
0436             % OUT
0437             %
0438             %   result  matrix      Contains core collocation-info. Columns
0439             %                       described by <a href="matlab:help CollocatedDataset/cols">CollocatedDataset.cols</a>.
0440             %   add_res cell-array  Cell array of matrices. The elements of
0441             %                       this cell-array correspond to the
0442             %                       elements of input arguments 'addis'.
0443             %                       For each matrix, the columns are
0444             %                       described by the corresponding <a href="matlab:help AssociatedDataset/cols">cols</a>
0445             %                       member for the <a href="matlab:help AssociatedDataset">AssociatedDataset</a>
0446             %                       object.
0447             %   also    struct      More information.
0448             %
0449             % EXAMPLE
0450             %
0451             %   (example assumes fc is a <a href="matlab:help FieldCopier">FieldCopier</a>)
0452             %
0453             % >> [result, additional_results, also] = ...
0454             %   mhscpr.collocate_granule([2010 9 8 8 8], 'noaa18', '', {fc}, true)
0455             
0456             [additionals, oneday] = optargs(varargin, {{}, false});
0457             self.verify_addis(additionals);
0458             additional_results = cell(size(additionals));
0459             
0460             additionals = self.sort_additionals(additionals);
0461             
0462             % collect all extra fields, for all additional datasets
0463             extra_args_primary = {};
0464             extra_args_secondary = {};
0465             for i = 1:length(additionals)
0466                 addi = additionals{i};
0467                 primfields = addi.primary_arguments();
0468                 secofields = addi.secondary_arguments();
0469                 extra_args_primary = [extra_args_primary primfields]; %#ok<AGROW>
0470                 extra_args_secondary = [extra_args_secondary secofields]; %#ok<AGROW>
0471             end
0472             extra_args_primary = unique(extra_args_primary);
0473             extra_args_secondary = unique(extra_args_secondary);
0474             
0475             %% output
0476             
0477             fid = atmlab('OUT');
0478             eid = atmlab('ERR');
0479             
0480             %% find date/times for which to search for secondary granules
0481             
0482             % overlap check 1: primary filename, secondary filename
0483             secondary_granules = self.overlap_granule(date1, spec2);
0484             
0485             if size(secondary_granules, 1)==0
0486                 error(['atmlab:' mfilename ':noother'], ...
0487                     'No overlapping granules found for [%s] %s/%s with %s/%s', ...
0488                     num2str(date1), spec1, self.primary.name, spec2, self.secondary.name);
0489             end
0490             
0491             logtext(fid, 'Found %d other granules to collocate with\n', size(secondary_granules, 1));
0492             
0493             %% pre-allocate
0494             
0495             result = zeros(0, length(fieldnames(self.cols)), self.mattype);
0496             
0497             %% read data for sat1/sensor1
0498             % Here, I decide to keep the duplicates for now. Simple is better than
0499             % complex. The collocation output consists of row and column indices; if
0500             % duplicates were removed /before/ collocation, the row indices would be
0501             % incorrect and would need to be corrected for. That causes more problems
0502             % than it solves, so the duplicates are removed in the postprocessing instead.
0503             try
0504                 data1 = self.primary.read_granule(date1, spec1, extra_args_primary, false, false); % keep duplicates for now
0505             catch ME
0506                 switch ME.identifier
0507                     case {'MATLAB:load:couldNotReadFile', 'atmlab:invalid_data', 'atmlab:SatDataset:cannotread'}
0508                         logtext(eid, 'Collocating failed: %s\n', ME.message);
0509                         also = struct('version', {{'?', '?'}});
0510                         return
0511                     otherwise
0512                         ME.rethrow();
0513                 end
0514             end
0515             
0516             also.version{1} = data1.version;
0517             also.version{2} = '?'; % secondary version not yet available
0518             
0519             if isempty(data1.time)
0520                 logtext(fid, 'Primary is empty, nothing to collocate\n');
0521                 %                self.version{2} = 'N/A';
0522                 return
0523             end
0524             
0525             % Keep the effective range and check the time overlap before actually
0526             % reading the secondary data, because the primary might be sparse. E.g.,
0527             % judging from the filenames the granules contain overlapping periods, but
0528             % the primary may in reality contain data for only part of this period
0529             % (e.g. in the case of collocations). Checking the primary data against
0530             % the secondary filename avoids reading files that are known not to
0531             % contain any overlap anyway.
0532             
0533             % self.primary and self.secondary may actually be the same
0534             % object, do not store any 'temporary' data there.
0535             data1.time_orig = data1.time; % store because of subsequent unify_time_axis can lead to errors
0536             data1.eff_range = double(data1.epoch) + ...
0537                 [min(data1.time)-self.interval, ...
0538                 max(data1.time)+self.interval];
0539             
0540             %% loop through all the granules to collocate with
0541             
0542             for i = 1:size(secondary_granules, 1)
0543                 % Overlap check 2; primary data with secondary filename
0544                 data2_start = self.secondary.get_starttime(secondary_granules(i, :));
0545                 data2_end = data2_start + self.secondary.granule_duration;
0546                 
0547                 [datecell{1:5}] = unixsecs2date(data2_start);
0548                 
0549                 logtext(atmlab('OUT'), 'Collocating with ''%s'' ''%s'' @ %04d-%02d-%02d %02d:%02d\n', ...
0550                     self.secondary.name, spec2, datecell{1:5});
0551                 
0552                 
0553                 if (data2_start > data1.eff_range(2)) || (data2_end < data1.eff_range(1))
0554                     % no possible time overlap; data1 range already compensated for
0555                     % self.interval, no use even reading granule2
0556                     logtext(atmlab('OUT'), 'Not reading, no overlap with primary\n');
0557                     continue
0558                 end
0559                 
0560                 %% read secondary
0561                 
0562                 try
0563                     data2 = self.secondary.read_granule(secondary_granules(i, :), spec2, extra_args_secondary, false);
0564                 catch ME
0565                     switch ME.identifier
0566                         case {'atmlab:find_datafile_by_date', ...
0567                                 'atmlab:invalid_data', ...
0568                                 'atmlab:find_granule_by_datetime', ...
0569                                 'atmlab:exec_system_cmd:shell', ...
0570                                 'atmlab:rqre_in_range:invalid', ...
0571                                 'atmlab:SatDataset:cannotread'}
0572                             logtext(eid, 'Error in reading datafile %4d-%02d-%02d %02d:%02d: %s. SKIPPING\n', ...
0573                                 datecell{:}, ME.message);
0574                             continue
0575                         otherwise
0576                             ME.rethrow();
0577                     end
0578                 end
0579                 
0580                 also.version{2} = data2.version;
0581                 data2.time_orig = data2.time; % store, see note at data1.time_orig
0582                 if isempty(data2.time)
0583                     logtext(fid, 'Secondary is empty, skipping\n');
0584                     continue
0585                 end
0586                 
0587                 %%
0588                 switch sign(data1.epoch - data2.epoch)
0589                     case -1 % data1 epoch is earliest
0590                         data2.time = data2.time + double(data2.epoch - data1.epoch);
0591                         data2.epoch = data1.epoch;
0592                     case 0 % do nothing
0593                     case 1 % data2 epoch is earliest
0594                         % FIXME: Bug? primary changes also for next loop
0595                         % iteration? Or should be ok as long as
0596                         % epoch/time consistent, maybe will iteratively
0597                         % increase but at least consistent
0598                         data1.time = data1.time + double(data1.epoch - data2.epoch);
0599                         data1.epoch = data2.epoch;
0600                     otherwise
0601                         error(['atmlab:' mfilename ':bug'], 'Reached impossible place. Bug.');
0602                 end
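                % For example (hypothetical epochs): if data1.epoch = 100 and
                % data2.epoch = 40, then 60 is added to data1.time and
                % data1.epoch is set to 40; epoch + time is unchanged for
                % every measurement, and both datasets now share one epoch.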
0603                 
0604                 % overlap check 3: primary data, secondary data
0605                 [iv1, iv2] = find_common_time(data1.time, data2.time, self.interval);
0606                 % limit to one day (if needed)
0607                 switch oneday
0608                     case 1
0609                         % only consider data on the same date as the start for data1
0610                         if data1.time(1) < 86400
0611                             sameday = data1.time < 86400;
0612                         else
0613                             sameday = data1.time < 2*86400;
0614                         end
0615                         if ~all(sameday)
0616                             logtext(fid, 'taking only %d/%d that are on the same day\n', ...
0617                                 sum(sameday), length(data1.time));
0618                         end
0619                         iv1 = iv1 & sameday;
0620                     case 2
0621                         % only consider data on the date after the start date for data1
0622                         sameday = data1.time > 86400;
0623                         iv1 = iv1 & sameday;
0624                         if ~all(sameday)
0625                             logtext(fid, 'taking from primary only %d/%d that are on the next day\n', ...
0626                                 sum(sameday), length(data1.time));
0627                         end
0628                 end
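                % Illustration (hypothetical values): with oneday==1 and a
                % granule whose first time is 86000 s, i.e. before the first
                % midnight after the epoch, sameday = data1.time < 86400
                % drops any entries past midnight; with oneday==2, only
                % entries with data1.time > 86400, i.e. on the next day,
                % are kept.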
0629                 if ~(any(iv1) && any(iv2)) % no overlap
0630                     logtext(fid, 'upon checking the actual data, there is no time overlap, so nothing to collocate\n');
0631                     continue
0632                 end
0633                 
0634                 %% perform collocations
0635                 logtext(atmlab('OUT'), 'Performing collocations (distance < %g km, time < %g s)\n', ...
0636                                        self.distance, self.interval);
0637                 collocations = self.collocate(data1.time(iv1), ...
0638                     data1.lat(iv1, :), ...
0639                     data1.lon(iv1, :), ...
0640                     data2.time(iv2), ...
0641                     data2.lat(iv2, :), ...
0642                     data2.lon(iv2, :));
0643                 if any(collocations)
0644                     % compensate for the fact that we passed only a subset to collocate
0645                     collocations(:, 1) = collocations(:, 1) + find(iv1, 1, 'first') - 1;
0646                     collocations(:, 3) = collocations(:, 3) + find(iv2, 1, 'first') - 1;
0647                 else
0648                     logtext(fid, 'No collocations\n');
0649                     continue
0650                 end
0651                 %% process core
0652                 % should return a matrix with info
0653 
0654                 %lockfile = get_lock(atmlab('WORK_AREA'), 'collocations.lock');
0655                 %cl1 = onCleanup(@()delete(lockfile));
0656 
0657                 logtext(fid, 'Collecting info for %d collocations\n', size(collocations, 1));
0658                 
0659                 this_result = self.process(collocations, data1, date1, spec1, data2, secondary_granules(i, :), spec2);
0660                 clear collocations;
0661                 old_result = result;
0662                 result = [result; this_result]; %#ok<AGROW>
0663                 % correct INDEX
0664                 if ~isempty(result)
0665                     result(:, self.cols.INDEX) = 1:size(result, 1);
0666                 end
0667                 
0668                 %% process additional
0669                 %
0670                 % additionals are implementations of AssociatedDataset
0671                 % that know how to process the collocated data and how
0672                 % to store the result.
0673                 % For plain copying of fields, there is FieldCopier.
0674                 %
0675                 % Note that in collocate_granule, we always overwrite
0676                 % everything.
0677                 
0678                 % this might be memory-intensive, make sure to get a lock
0679                 
0680                 this_additional_result = cell(size(additionals));
0681                 cc = cell(size(additionals));
0682                 for j = 1:length(additionals)
0683                     additional = additionals{j};
0684                     % Get data that this additional depends upon
0685                     
0686                     %depdata = self.fix_dependencies(additional, additionals, this_additional_result);
0687                     [depdata, depcols] = self.fix_dependencies(additional, additionals, this_additional_result, cc);
0688                     [this_additional_result{j}, cc{j}] = additional.process_granule(this_result, data1, date1, spec1, data2, secondary_granules(i, :), spec2, ...
0689                         depdata, depcols, 'all');
0690                         %depdata);
0691                     clear depdata
0692                     % since we use 'all', we can safely set
0693                     % additional.cols
0694                     additional.cols = cc{j};
0695                     %clear this_result
0696                     % use special method to concatenate, sometimes need to
0697                     % correct values, e.g. FIRST, LAST in case of meandata
0698                     additional_results{j} = additional.concatenate(old_result, additional_results{j}, this_additional_result{j});
0699                     %clear this_additional_result;
0700                 end
0701                 clear old_result this_additional_result this_result data2
0702 
0703                 %logtext(atmlab('OUT'), 'Clearing lockfile\n');
0704                 %delete(lockfile);
0705 
0706                 logtext(fid, 'Info collected\n');
0707             end
0708             % return: result, additional_results, also
0709         end
0710         
0711         function [result, additional_results, also] = collocate_date(self, date, spec1, spec2, varargin)           
0712             % collocate_date Collect all collocations for a given date
0713             %
0714             % This method takes all granules from the primary that contain
0715             % data for the indicated date. It then <a href="matlab:help CollocatedDataset/collocate_granule">collocates each granule</a>
0716             % with the secondary and thus collects all collocations for the
0717             % given date. The result is not stored, but returned; to store
0718             % the result, use <a href="matlab:help CollocatedDataset/collocate_and_store_date">CollocatedDataset.collocate_and_store_date</a>.
0719             %
0720             % FORMAT
0721             %
0722             %   [M, addis, also] = cd.collocate_date([year, month, day], spec1, spec2[, additionals])
0723             %
0724             % IN
0725             %
0726             %   datevec     vector      [year month day]
0727             %   spec1       various     sat for primary, if applicable
0728             %   spec2       various     sat for secondary, if applicable
0729             %   additionals (optional)  cell array; see help for
0730             %                           corresponding argument for
0731             %                           <a href="matlab:help CollocatedDataset/collocate_granule">CollocatedDataset.collocate_granule</a>.
0732             %
0733             % OUT
0734             %
0735             %   For output arguments, see help for <a href="matlab:help CollocatedDataset/collocate_granule">CollocatedDataset.collocate_granule</a>.
0736             %
0737             % EXAMPLE
0738             %
0739             % (example assumes fc is a <a href="matlab:help FieldCopier">FieldCopier</a>)
0740             % >> [result, additional_results, also] = ...
0741             %   mhscpr.collocate_date([2010 9 8], 'noaa18', '', {fc})
0742             
0743             additionals = optargs(varargin, {{}});
0744             self.verify_addis(additionals);
0745             
0746             fid = atmlab('OUT');
0747             year = date(1);
0748             month = date(2);
0749             day = date(3);
0750             % find granules for primary dataset; if the length equals one day, do not
0751             % take the day before as it's already sorted per day
0752             grans = self.primary.find_granules_by_date(date, spec1, ...
0753                 self.primary.granule_duration~=86400);
0754             if isempty(grans)
0755                 logtext(atmlab('ERR'), 'no granules found %s/%s %d-%d-%d\n', ...
0756                     self.primary.name, spec1, year, month, day);
0757             end
0758             ngrans = size(grans, 1);
0759             
0760             % pre-allocate; the right size of 'result' is known, but the
0761             % right size of additional_results is not, because the sizes
0762             % of the involved fields are unknown
0763             result = zeros(0, length(fieldnames(self.cols)), self.mattype);
0764             additional_results = cell(size(additionals));
0765             
0766             anysuccess = false;
0767             also = struct();
0768             
0769             for i = 1:ngrans
0770                 % keep track, because first granule is probably yesterday
0771                 thisyear = grans(i, 1);
0772                 thismonth = grans(i, 2);
0773                 thisday = grans(i, 3);
0774                 hour = grans(i, 4);
0775                 minute = grans(i, 5);
0776                 
0777                 logtext(fid, 'Collocating %s ''%s'' %04d-%02d-%02d %02d:%02d with %s %s \n', ...
0778                     self.primary.name, spec1, thisyear, thismonth, thisday, hour, minute, self.secondary.name, spec2);
0779                 
0780                 if ~isequal([thisyear thismonth thisday], [year month day])
0781                     % only take collocations happening in the part of the granule
0782                     % occurring on the day requested
0783                     oneday = 2;
0784                 else % take any collocations happening on the day requested
0785                     oneday = 1;
0786                 end
0787                 
0788                 try
0789                     [result_granule, result_granule_addis, new_also] = self.collocate_granule(...
0790                         [thisyear, thismonth, thisday, hour, minute], spec1, spec2, additionals, oneday);
0791                     if ~isempty(fieldnames(also)) && ~isequal(also, new_also)
0792                         warning(['atmlab:' mfilename ':inconsistent'], ...
0793                             ['Additional information structures not consistent within the day. ' ...
0794                              'Expected: ' struct2string_compact(also) ...
0795                              ' Got: ' struct2string_compact(new_also)]);
0796                     end
0797                     also = new_also;
0798                     anysuccess = true;
0799                 catch ME
0800                     switch ME.identifier
0801                         case {'atmlab:find_datafile_by_date', 'atmlab:find_granule_by_datetime'}
0802                             logtext(atmlab('ERR'), 'Error in searching for datafile %4d-%02d-%02d %02d:%02d %s %s: %s. SKIPPING\n', ...
0803                                 thisyear, thismonth, thisday, hour, minute, self.primary.name, spec1, ME.message);
0804                             continue
0805                         case {'atmlab:collocate', 'atmlab:atovs_get_l1c:zamsu2l1c', 'atmlab:CollocatedDataset:noother', 'atmlab:collocate_granule:noother'}
0806                             logtext(atmlab('ERR'), 'Error in collocating with datafile at %4d-%02d-%02d %02d:%02d %s %s: %s. SKIPPING\n', ...
0807                                 thisyear, thismonth, thisday, hour, minute, self.primary.name, spec1, ME.message);
0808                             continue
0809                         case {'MATLAB:hdfinfo:invalidFile', 'MATLAB:imagesci:hdfinfo:invalidFile', 'MATLAB:imagesci:validate:fileOpen'}
0810                             logtext(atmlab('ERR'), 'Cannot read datafile %s %s %4d-%02d-%02d %02d:%02d: %s. SKIPPING\n', ...
0811                                 self.primary.name, spec1, thisyear, thismonth, thisday, hour, minute, ME.message);
0812                             continue
0813                         otherwise
0814                             ME.rethrow();
0815                     end
0816                 end
0817                 old_result = result;
0818                 result = [result; result_granule]; %#ok<AGROW>
0819                             
0820                 % special case; INDEX must be continuous throughout the entire date
0821                 % need to redo this AFTER concatenation but BEFORE further
0822                 % processing
0823                 if ~isempty(result)
0824                     result(:, self.cols.INDEX) = 1:size(result, 1);
0825                 end
0826                 
0827                 for j = 1:length(additional_results)
0828                     % use special method for concatenation, because some
0829                     % additional datasets, such as Collapser, need to
0830                     % correct certain columns, such as FIRST and LAST, upon
0831                     % concatenation
0832                     additional_results{j} = additionals{j}.concatenate(old_result, additional_results{j}, result_granule_addis{j});
0833 %                     if isempty(additional_results{j})
0834 %                         additional_results{j} = result_granule_addis{j};
0835 %                     else
0836 %                         additional_results{j} = [additional_results{j}; result_granule_addis{j}];
0837 %                     end
0838                 end
0839             end
0840             
0841             if ~anysuccess
0842                 error('atmlab:collocate_date:nosource', 'no source data found at all');
0843             end
0844             
0845             % special case; INDEX must be continuous throughout the entire date
0846             if ~isempty(result)
0847                 result(:, self.cols.INDEX) = 1:size(result, 1);
0848             end
0849             
0850         end
0851         
0852         function collocate_and_store_date(self, date, spec1, spec2, varargin)
0853             % collocate_and_store_date Collect collocations and store appropriately
0854             %
0855             % For a given date, check whether a collocation datafile exists.
0856             % If it doesn't (or <a href="matlab:help colloc_config">colloc_config</a>('overwrite') is set), collocate the indicated
0857             % satellites and sensors with each other and store the result in the
0858             % appropriate datafile.
0859             %
0860             % This only works on a single day; for long periods of time, use
0861             % <a href="matlab:help CollocatedDataset/collocate_and_store_date_range">CollocatedDataset.collocate_and_store_date_range</a>.
0862             %
0863             % FORMAT
0864             %
0865             %   collocate_and_store_date(date, spec1, spec2, additionals)
0866             %
0867             % IN
0868             %
0869             %   Input arguments as for <a href="matlab:help CollocatedDataset/collocate_date">CollocatedDataset.collocate_date</a>.
0870             %   However, if the last argument is a structure, this is not
0871             %   passed on to collocate_date, but instead interpreted as
0872             %   instructions to collocate_and_store_date.  Currently, a
0873             %   single flag is recognised:
0874             %
0875             %       autofix     When reading broken data, try to fix it by
0876             %                   recollocating
0877             %
0878             % OUT
0879             %
0880             %   none (but writes file)
0881             %
0882             % EXAMPLE
0883             %
0884             % (example assumes fc is a <a href="matlab:help FieldCopier">FieldCopier</a>)
0885             % >> mhscpr.collocate_and_store_date([2010 9 8], 'noaa18', '', {fc})
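            % To pass flags, append a struct as the last argument; for
            % instance, to enable the autofix flag described above:
            %
            % >> mhscpr.collocate_and_store_date([2010 9 8], 'noaa18', '', {fc}, struct('autofix', true))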
0886             
0887             default_flags = struct('autofix', false);
0888             if ~isempty(varargin) && isa(varargin{end}, 'struct')
0889                 flags = optargs_struct(varargin{end}, default_flags);
0890                 varargin = varargin(1:end-1);
0891             else
0892                 flags = default_flags;
0893             end
0894             
0895             if self.overwrite==2
0896                 error(['atmlab:' mfilename ':invalid'], ...
0897                     'You set %s %s.overwrite=2.  This is meaningless for core datasets.', class(self), self.name);
0898             end
0899             
0900             additionals = optargs(varargin, {{}});
0901             self.verify_addis(additionals);
0902             % get filename
0903             
0904             if isempty(spec1)
0905                 spec = spec2;
0906             elseif isempty(spec2)
0907                 spec = spec1;
0908             else
0909                 spec = {spec1, spec2};
0910             end
0911             
0912             fn = self.find_granule_by_datetime(date, spec);
0913             mainfileexists = exist(fn, 'file');
0914             if ~mainfileexists || self.overwrite == 1
0915                 % assumption:
0916                 % if the main file needs reprocessing, so do all
0917                 % additionals
0918                 
0919                 do_all_main = true;
0920                 % therefore, set those 'overwrite' attributes temporarily
0921                 % to 1
0922                 clobjs = cell(size(additionals));
0923                 owst = zeros(size(additionals));
0924                 for i = 1:length(additionals)
0925                     owst(i) = additionals{i}.overwrite;
0926                     clobjs{i} = onCleanup(@()setfield(additionals{i}, 'overwrite', owst(i)));
0927                     additionals{i}.overwrite = 1;
0928                 end
0929             else
0930                 do_all_main = false;
0931             end
0932 
0933             addisexist = false(size(additionals));
0934             addisvalid = true(size(additionals));
0935             addisread = repmat({{}}, size(additionals));
0936             addisprocess = repmat({{}}, size(additionals));
0937             addishas = repmat({{}}, size(additionals));
0938             for i = 1:length(additionals)
0939                 fna = additionals{i}.find_granule_by_datetime(date, spec);
0940                 addisexist(i) = exist(fna, 'file');
0941                 if addisexist(i)
0942                     switch additionals{i}.overwrite
0943                         case 0
0944                             addisread{i} = {};
0945                             addisprocess{i} = {};
0946                         case 1
0947                             addisread{i} = {};
0948                             addisprocess{i} = fieldnames(additionals{i}.members);
0949                         case 2
0950                             addisread{i} = {};
0951                             [addishas{i}, globattr] = quickly_read_gzipped_netcdf_header(additionals{i}.find_granule_by_datetime(date, spec));
0952                             if additionals{i}.redo_all(globattr.software_version)
0953                                 addisprocess{i} = fieldnames(additionals{i}.members);
0954                                 addisvalid(i) = false;
0955                             else
0956                                 addisprocess{i} = setdiff(...
0957                                     fieldnames(additionals{i}.members), ...
0958                                     addishas{i});
0959                             end
0960                         otherwise
0961                             addisread{i} = 'all';
0962                             if iscell(additionals{i}.overwrite)
0963                                 addisprocess{i} = additionals{i}.overwrite;
0964                             else
0965                                 error(['atmlab:' mfilename ':invalid'], ...
0966                                     '%s %s has invalid value for overwrite', ...
0967                                     class(self), self.name);
0968                             end
0969                     end
0970                 else
0971                     addisread{i} = {};
0972                     addisprocess{i} = fieldnames(additionals{i}.members);
0973                 end
0974             end
0975             [addisread, addisprocess] = self.meet_dependencies(additionals, addisread, addisprocess, addishas);
0976             
0977             % if everything is already there, do nothing
0978             if mainfileexists && all(cellfun(@isempty, addisprocess))
0979                 logtext(atmlab('OUT'), 'All output files already exist and contain the requested fields.  None are to be processed.\n');
0980                 logtext(atmlab('OUT'), 'I found the following files:\n');
0981                 logtext(atmlab('OUT'), '    %s\n', fn);
0982                 for i = 1:length(additionals)
0983                     logtext(atmlab('OUT'), '    %s\n', additionals{i}.find_granule_by_datetime(date, spec));
0984                 end
0985                 return
0986             end
0987 
0988             % so SOME work is to be done, at least
0989 
0990             info_addi = struct();
0991             info_addi.history = [datestr(now, 'yyyy-mm-dd') ' Obtained additionals from collocs'];
0992             info_addi.parent_dataset = self.name;
0993             if iscell(spec)
0994                 info_addi.parent_spec = [spec{:}];
0995             else
0996                 info_addi.parent_spec = spec;
0997             end
0998             
0999             if mainfileexists && ~self.overwrite
1000                 logtext(atmlab('OUT'), 'Collocations exist, but additionals are incomplete or are to be overwritten or complemented\n');
1001                 logtext(atmlab('OUT'), 'I will reprocess some additionals\n');
1002                 [result, ~, attr] = self.read_single_day(date, spec, fieldnames(self.cols));
1003                 if isempty(result)
1004                     logtext(atmlab('OUT'), 'Upon reading collocations, it appears there are none; nothing to be done for today\n');
1005                     return
1006                 end
1007                 info_addi.parent_id = attr.id;
1008                 
1009 %                 if ~isempty(self.log)
1010 %                     logtext(self.log, '%s %s %s Collocations present, regenerate additionals\n', mat2str(date), spec1, spec2);
1011 %                 end
1012                 addi = cell(size(additionals));
1013                 cc = repmat({struct()}, size(additionals));
1014                 for i = 1:length(additionals)
1015                     % step 1: add fields that must be processed
1016                     % step 2: read fields needed for later additionals
1017                     % step 3: possibly try to fix if broken
1018                     if ~isempty(addisprocess{i})
1019                         [addi, cc] = self.fill_addi(result, date, spec, spec1, spec2, additionals, i, addi, info_addi, cc, addisprocess, addisvalid);
1020                         addisprocess{i} = cell(0, 1);
1021                         addisvalid(i) = true;
1022                     end
1023                     
1024                     if ~isempty(addisread{i})
1025                         logtext(atmlab('OUT'), 'Reading additionals for %s %s: %s\n', class(additionals{i}), additionals{i}.name, strjoin(vec2row(addisread{i}), ', '));
1026                         try
1027                             [addi{i}, cc{i}] = additionals{i}.read_single_day(date, spec, addisread{i});
1028                             addisvalid(i) = true;
1029                         catch ME
1030                             switch ME.identifier
1031                                 case {'atmlab:HomemadeDataset:invalid'}
1032                                     if flags.autofix
1033                                         addisvalid(i) = false;
1034                                         logtext(atmlab('OUT'), ...
1035                                             'Encountered problem: %s.  I''ll try to fix that.\n', ...
1036                                             ME.message);
1037                                     else
1038                                         ME.rethrow();
1039                                     end
1040                                 otherwise
1041                                     ME.rethrow();
1042                             end
1043                         end
1044                                 
1045 %                         if addisvalid(i) && isempty(fieldnames(additionals{i}.cols))
1046 %                             additionals{i}.cols = cc;
1047 %                         end
1048                     else
1049                         addi{i} = [];
1050                         cc{i} = struct();
1051                     end
1052                     
1053                     if ~addisvalid(i)
1054                         logtext(atmlab('OUT'), 'Redoing %s %s completely!\n', class(additionals{i}), additionals{i}.name);
1055                         [addi, cc] = self.fill_addi(result, date, spec, spec1, spec2, additionals, i, addi, info_addi, cc, fieldnames(additionals{i}.members), addisvalid);
1056 %                         logtext(atmlab('OUT'), 'Collecting additionals for %s %s\n', class(additionals{i}), additionals{i}.name);
1057 %                         [depdata, depcols] = self.fix_dependencies(additionals{i}, additionals, addi, cc);
1058 %
1059 %                         %logtext(atmlab('OUT'), 'Fields to process: %s\n', strjoin(vec2row(addisprocess{i}), ', '));
1060 %                         logtext(atmlab('OUT'), 'Fields to process: %s\n', strjoin(addisprocess{i}(:)', ', ')); % much quicker and to avoid "Warning: Cannot convert zero-sized vector to row "
1061 %                         [addii, cci] = additionals{i}.process_delayed(result, spec1, spec2, depdata, cc, addisprocess{i});
1062 %                         additionals{i}.store(date, spec, addii, info_addi, cci);
1063 %                         if isempty(addi{i})
1064 %                             addi{i} = addii;
1065 %                             cc{i} = cci;
1066 %                         else
1067 %                             addi{i} = [addi{i}, addii];
1068 %                             cc{i} = catstruct(cc{i}, structfun(@(X)X+max(cellfun(@(X)X, struct2cell(cc{i}))), cci, 'UniformOutput', false));
1069 %                         end
1070                     end
1071                 end
1072 
1073             else
1074                 % So, we redo everything then
1075                 
1076 %                 if ~isempty(self.log)
1077 %                     logtext(self.log, '%s %s %s Generating collocations\n', mat2str(date), spec1, spec2);
1078 %                 end
1079                 try
1080                     [result, addis, also] = ...
1081                         self.collocate_date(date, spec1, spec2, additionals);
1082                 catch ME
1083                     switch ME.identifier
1084                         case {'atmlab:collocate_date:nosource', ...
1085                                 'atmlab:CollocatedDataset:noother'}
1086                             logtext(atmlab('ERR'), ...
1087                                 'No successful collocations at %04d-%02d-%02d, not writing\n', ...
1088                                 date(1), date(2), date(3));
1089                             return;
1090                         otherwise
1091                             ME.rethrow();
1092                     end
1093                 end
1094                 
1095                 % additional attributes, main part
1096                 
1097                 info_core = struct();
1098                 info_core.history = [datestr(now, 'yyyy-mm-dd') ' Collocations generated from scratch'];
1099                 info_core.maxdist_km = self.distance;
1100                 info_core.maxtime_s = self.interval;
1101                 info_core.primary_dataset = self.primary.name;
1102                 info_core.primary_info = spec1;
1103                 info_core.primary_version = also.version{1};
1104                 info_core.secondary_dataset = self.secondary.name;
1105                 info_core.secondary_sensor = spec2;
1106                 info_core.secondary_version = also.version{2};
1107                 info_core.start_time = double(date2unixsecs(date(1), date(2), date(3)));
1108 
1109                 logtext(atmlab('OUT'), 'Storing %d collocations\n', size(result, 1));
1110                 [~, attr] = self.store(date, spec, result, info_core);
1111                 
1112 %                 if ~isempty(additionals) && ~isempty(self.log)
1113 %                         logtext(self.log, '%s Generating additionals\n', mat2str(date));
1114 %                 end
1115                 info_addi.parent_id = attr.id;
1116 
1117                 if do_all_main
1118                     for i = 1:length(additionals)
1119                         additionals{i}.store(date, spec, addis{i}, info_addi);
1120                         additionals{i}.overwrite = owst(i);
1121                     end
1122 
1123                 end
1124             end
1125         end
1126         
1127         function collocate_and_store_date_range(self, date_start, date_end, spec1, spec2, varargin)
1128             % Collocate and store date range
1129             %
1130             % Loop through a range of dates and for each date,
1131             % <a href="matlab:help CollocatedDataset/collocate_and_store_date">collocate and store</a> collocations for each date.
1132             %
1133             % This is the most suitable method for collocating large
1134             % quantities of data.
1135             %
1136             % FORMAT
1137             %
1138             %   col.collocate_and_store_date_range(date_start, date_end, spec1, spec2, varargin)
1139             %
1140             % IN
1141             %
1142             %   date_start      datevec     starting date
1143             %   date_end        datevec     ending date
1144             %   spec1           various     sat or sat pair or empty
1145             %   spec2           various     sat or sat pair or empty
1146             %   additionals     cell-array  (optional) <a href="matlab:help AssociatedDataset">AssociatedDataset</a>s, if any
1147             %   flags           struct      (optional) flags, as accepted by
1148             %                               <a href="matlab:help CollocatedDataset/collocate_and_store_date">collocate_and_store_date</a>;
1149             %                               see its help for details.
1150             %
1151             % OUT
1152             %
1153             %   No output; the volume of data may be very large, so the
1154             %   only 'output' is written to disc.
1155             %
1156             % EXAMPLE
1157             %
1158             % (example assumes fc is a <a href="matlab:help FieldCopier">FieldCopier</a>)
1159             %   mhscpr.collocate_and_store_date_range([2010 9 1], ...
1160             %       [2010 9 30], '', 'noaa18', {fc});
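            %
            % Trailing flag structs are passed through to
            % <a href="matlab:help CollocatedDataset/collocate_and_store_date">collocate_and_store_date</a>; for instance, assuming fc as above:
            %   mhscpr.collocate_and_store_date_range([2010 9 1], ...
            %       [2010 9 30], '', 'noaa18', {fc}, struct('autofix', true));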
1161             
1162             try
1163                 self.log = fileopen(fullfile(self.basedir, self.logfile), 'a');
1164             catch ME
1165                 switch ME.identifier
1166                     case 'atmlab:fileopen:IOError'
1167                         logtext(atmlab('ERR'), ...
1168                             'Unable to append to logfile at %s. Error message: %s. Not writing logfile\n', ...
1169                             fullfile(self.basedir, self.logfile), ...
1170                             ME.message);
1171                         self.log = fileopen('/dev/null', 'w');
1172                 end
1173             end
1174             c = onCleanup(@self.cleanup_log);
1175             logtext(self.log, self.marker);
1176             logtext(self.log, 'Starting collocations: %s %s %s vs. %s %s from %s to %s\n', ...
1177                 atmlab_version(), self.primary.name, spec1, self.secondary.name, spec2, ...
1178                 mat2str(date_start), mat2str(date_end));
1179             logtext(atmlab('OUT'), 'Starting collocations\n');
1180             logtext(atmlab('OUT'), '%s %s vs. %s %s\n', self.primary.name, spec1, self.secondary.name, spec2);
1181             logtext(atmlab('OUT'), 'From %s to %s\n', mat2str(date_start), mat2str(date_end));
1182             alldates = daterange(date_start, date_end);
1183             
1184             i = 1;
1185             while i <= size(alldates, 1)
1186                 year = alldates(i, 1);
1187                 month = alldates(i, 2);
1188                 day = alldates(i, 3);
1189                 logtext(atmlab('OUT'), 'collocating %04d-%02d-%02d\n', ...
1190                     year, month, day);
1191                 try
1192                     self.collocate_and_store_date([year, month, day], spec1, spec2, varargin{:});
1193                     i = i + 1;
1194                 catch ME
1195                     [~] = ME.getReport();
1196                     switch ME.identifier
1197                         case {'atmlab:SatDataset:missing_firstline'}
1198                             % run firstline-thingies
1199                             if strfind(ME.message, self.primary.name)
1200                                 again = self.primary;
1201                                 spec = spec1;
1202                             elseif strfind(ME.message, self.secondary.name)
1203                                 again = self.secondary;
1204                                 spec = spec2;
1205                             else
1206                                 error(['atmlab:' mfilename], ...
1207                                     ['I received a missing_firstline-error message but ' ...
1208                                      'I don''t know what dataset it is about. This is a bug. ' ...
1209                                      'The message was: ' ME.message]);
1210                             end
1211                             logtext(atmlab('ERR'), 'Warning: I don''t have firstline-data for today for %s. I''ll try to get some! (today''s collocations will be redone)\n', again.name);
1212                             again.find_granule_first_line(...
1213                                 par(datevec(datenum(alldates(1, :))-1), 1:3), ...
1214                                 par(datevec(datenum(alldates(end, :))+1), 1:3), ...
1215                                 spec);
1216                             again.granule_first_line(0, spec, true, true); % force reload
1217                         case {'atmlab:AssociatedDataset:cannolongerread'}
1218                             logtext(atmlab('ERR'), ...
1219                                 'Oddly, I can no longer read a file I could read in the past.  The full problem is:\n');
1220                             if isempty(ME.cause{1}.cause)
1221                                 rep1 = [];
1222                             else
1223                                 rep1 = ME.cause{1}.cause{1}.getReport();
1224                             end
1225                             rep2 = ME.cause{1}.getReport();
1226                             rep3 = ME.getReport();
1227                             if ~isempty(rep1)
1228                                 fprintf(atmlab('ERR'), '%s', rep1);
1229                             end
1230                             fprintf(atmlab('ERR'), '%s', rep2);
1231                             fprintf(atmlab('ERR'), '%s', rep3);
1232                             logtext(atmlab('ERR'), ...
1233                                 'I''ll redo the entire day completely...\n');
1234                             ovr = self.overwrite;
1235                             clo = onCleanup(@()setfield(self, 'overwrite', ovr));
1236                             self.overwrite = true;
1237                             self.collocate_and_store_date([year, month, day], spec1, spec2, varargin{:});
1238                             i = i + 1;
1239                             self.overwrite = ovr;
1240                             
1241                         otherwise
1242                             ME.rethrow();
1243                     end
1244                 end
1245             end
1246             logtext(self.log, 'Collocations finished. All seems fine.\n');
1247             logtext(atmlab('OUT'), 'Finished!\n');
1248             self.success = true;
1249             %%%
1250         end
1251         
1252         function varargout = read(self, date_start, date_end, spec, all_fields, varargin)
1253             % Read collocations for indicated period
1254             %
1255             % This method reads all or a selection of the collocations for
1256             % the indicated period, between date_start and date_end.
1257             % If less than two output arguments are requested, additionals will
1258             % If fewer than three output arguments are requested, additionals will
1259             % be merged with the core.  Note that this means that only a
1260             % fraction of the stored 'core' information will be returned:
1261             % from each set of secondaries sharing the same primary, only
1262             % one piece of information is returned, and it is not defined
1263             % to which core collocation it belongs.  Therefore, it is not
1264             % meaningful to both merge *and* obtain fields from the
1265             % secondary.
1266             % Merging is required when limits or filters are applied.
1267             % FORMAT
1268             %
1269             % [M, M_cols, addis, addis_cols, additionals] = ...
1270             %       col.read(date_start, date_end, spec, fields[, limits[,
1271             %       filters[, associated_datasets]]]);
1272             %
1273             % IN
1274             %
1275             %   date_start  datevec     First date to read
1276             %   date_end    datevec     Last date to read (inclusive)
1277             %   spec        various     Specification. Details depend
1278             %                           on dataset, but usually a
1279             %                           string with a satellite or a
1280             %                           cell array of strings with two
1281             %                           satellites.
1282             %   fields      cell-str    Cell array of strings with all
1283             %                           the fields to be read. Can be
1284             %                           from the core-dataset or from
1285             %                           the additional datasets. It must
1286             %                           contain at least some fields from
1287             %                           the core dataset. See <a href="matlab:help CollocatedDataset/list_fields">list_fields</a> for a full list.
1288             %
1289             %   limits      structure   (optional)
1290             %                           Structure describing acceptable
1291             %                           ranges for zero or more fields.
1292             %                           E.g. struct('LAT1', [-30 30])
1293             %                           will limit collocations to
1294             %                           those where the value of
1295             %                           'LAT1' is between -30 and +30,
1296             %                           inclusive on both ends.
1297             %                           NOTE: This (currently) works only
1298             %                           if you let the output data be
1299             %                           merged (e.g. 2 output args).
1300             %   filters     cell-array  (optional)
1301             %                           Cell array of cell arrays:
1302             %                           {filter1, filter2, ...}. Each
1303             %                           filter1 is itself a cell-array
1304             %                           with 2 or 3 elements:
1305             %                           {handle, fieldnames, extra},
1306             %                           where handle is a
1307             %                           function-handle, fieldnames a
1308             %                           cell array of strings with
1309             %                           fields whose values will be
1310             %                           passed on to the filter, and
1311             %                           extra a cell array of arbitrary
1312             %                           values that will be passed on
1313             %                           to the handle as-is.
1314             %                           NOTE: This (currently) works only
1315             %                           if you let the output data be
1316             %                           merged (e.g. 2 output args).
1317             %
1318             %   ads     cell array of AssociatedDatasets
1319             %
1320             %       Limit searching of fields in AssociatedDatasets to this
1321             %       cell-array of AssociatedDatasets. This is necessary if
1322             %       multiple AssociatedDatasets contain the same fields,
1323             %       and it is thus ambiguous from what AssociatedDataset a
1324             %       field is to be read.
1325             %
1326             % OUT
1327             %
1328             %   M       matrix      with values corresponding to fields stored
1329             %                       in core collocations
1330             %
1331             %   M_cols  structure   describing what field ended up in what
1332             %                       column of M
1333             %
1334             % Requesting more than 2 output arguments changes the behaviour:
1335             % with at most 2 output arguments, the result is merged, i.e.
1336             % only one piece of information per primary is returned, which
1337             % can render information from the secondary meaningless.  With
1338             % more than 2 output arguments, the result is kept separate for
1339             % each AssociatedDataset, and limits/filters will not work.
1341             %
1342             %   addis   cell-array  Cell array of matrices similar to 'M',
1343             %                       one for each additional dataset for
1344             %                       which at least one field was found
1345             %
1346             %   addis_cols structure Like M_cols, corresponding to each of
1347             %                       addis
1348             %
1349             %   associated cell-array of AssociatedData containing the
1350             %                       AssociatedData objects for which at
1351             %                       least one field ended up in addis.
1352             %
1353             % TODO:
1354             %  - read back from original data. Currently,
1355             %  col.collocate_and_store_date can already take care of this, if
1356             %  collocations exist.
1357             %  - add check that associated-datasets match
1358             %  - when merging, make sure required fields such as FIRST and
1359             %  LAST are present
1360             %
1361             % EXAMPLE
1362             %
1363             % [M, c, MM, cc, aa] = ...
1364             %       col.read([2007 1 1],[2007 1 10], 'n18', ...
1365             %                {'LAT1', 'LON1', 'LAT2', 'LON2', 'RO_ice_water_path', 'cld_reff_vis','cld_opd_vis'}, ...
1366             %                 struct('LAT1', [-30 30]), ...
1367             %                 {{@(x, y) x>y, {'LAT1', 'LON1'}}});
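For contrast with the 5-output call above, a merged read can be sketched as follows (a hypothetical usage, reusing the dataset `col`, satellite spec, and field names from the example; the `M_cols` structure maps field names to columns of `M`):

```matlab
% Merged read: at most 2 output arguments, so additionals are folded
% into M and limits/filters are permitted.
[M, c] = col.read([2007 1 1], [2007 1 10], 'n18', ...
    {'LAT1', 'LON1', 'RO_ice_water_path'}, ...
    struct('LAT1', [-30 30]));
lat = M(:, c.LAT1);   % look up a field's column(s) via the M_cols struct
```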
1368             %
1369             %
1370             
1371             narginchk(5, 8);
1372             
1373             %% prepare configuration things
1374             
1375             [limits, filters_by_name, ads] = optargs(varargin, {struct(), {}, self.associated});
1376                         
1377             rqre_datatype(date_start, @isvector);
1378             rqre_datatype(date_end, @isvector);
1379             rqre_datatype(spec, {@ischar, @iscellstr});
1380             rqre_datatype(limits, @isstruct);
1381             rqre_datatype(filters_by_name, @iscell);
1382             rqre_datatype(ads, @iscell);
1383             
1384             % try to read from cache
1385             if ~isempty(self.pcd)
1386                 cachestr = self.calc_cache_key(date_start, date_end, spec, all_fields, limits, filters_by_name, ads, nargout);
1387                 if self.pcd.has_entry(cachestr)
1388                     logtext(atmlab('OUT'), 'Reading from cache entry %s %s.%s\n', ...
1389                         class(self), self.name, cachestr)
1390                     varargout = self.pcd.get_entry(cachestr);
1391                     return
1392                 else
1393                     logtext(atmlab('OUT'), 'No cache entry found, starting reading\n');
1394                 end
1395             else
1396                 logtext(atmlab('OUT'), 'No cache object defined, starting reading\n');
1397             end
1398             
1399             % distribute fields, limits, etc. over core, additionals
1400             [fields_core, additionals, additional_fields, ~] = self.deal_fields(all_fields, ads);
1401 
1402             additionals_day = cell(size(additional_fields));
1403             addis_cols = cell(size(additional_fields));
1404             addis_cols_here = cell(size(additional_fields));
1405             addis = cell(size(additional_fields));
1406             for i = 1:length(addis)
1407                 addis{i} = struct();
1408             end
1409             
1410             merge = false;
1411             if nargout<3 && ~isempty(additionals)
1412                 merge = true;
1413                 logtext(atmlab('OUT'), ...
1414                     'CollocatedDataset.read was called with %d (<3) output arguments, additionals will be merged into the first 2 arguments, possibly reducing their size!\n', ...
1415                     nargout);
1416             
1417                 % if merging, check that required merging fields are all
1418                 % there
1419                 for j = 1:length(additionals)
1420                     required_fields = additionals{j}.get_mergefields();
1421                     given_fields = additional_fields{j};
1422                     for k = 1:length(required_fields)
1423                         required_field = required_fields{k};
1424                         if ~any(strcmp(given_fields, required_field))
1425                             logtext(atmlab('OUT'), ...
1426                                 'Additional dataset %s requires field %s, but not given. Will add unsolicited!\n', ...
1427                                 additionals{j}.name, required_field);
1428                             additional_fields{j} = [additional_fields{j} required_field];
1429                         end
1430                     end
1431                 end
1432                 
1433             elseif nargout>=3 % more than 2 output arguments
1434                 if ~isempty(fieldnames(limits)) || ~isempty(filters_by_name)
1435                     error(['atmlab:' mfilename ':NotImplemented'], ...
1436                         ['Applying limits or filters is not supported when ' ...
1437                          'additionals are separately output. Try running ' ...
1438                          'with fewer than 3 output arguments (found %d) or ' ...
1439                          'apply your own limitations afterwards. Sorry!'], ...
1440                          nargout);
1441                 end
1442             end
1443             % Not filtering duplicates, this should be done when processing
1444             % collocations initially, just before storing
1445 
1446             M = [];
1447             
1448             %% loop through all the dates
1449             
1450             dates = daterange(date_start, date_end);
1451             M_cols = struct();
1452             M_cols_merged = struct();
1453             filters_by_no = cell(size(filters_by_name));
1454             hitno = 0; % count no. of days with collocs
1455             for i = 1:size(dates, 1)
1456                 date = dates(i, :);
1457                 %% read collocations for date
1458                                                 
1459                 try
1460                     [collocations_day, M_cols_core_here, attr_core] = self.read_single_day(date, spec, fields_core);
1461 
1462                     if numel(collocations_day)==0
1463                         if isempty(fields_core)
1464                             % ugly hack, will secretly read core ANYWAY
1465                             % because I need to tell if there are
1466                             % collocations
1467                             collocations_day = self.read_single_day(date, spec, {'LAT1'});
1468                             if numel(collocations_day) == 0
1469                                 logtext(atmlab('OUT'), 'really no collocations\n');
1470                                 continue
1471                             else
1472                                 logtext(atmlab('OUT'), 'no fields asked from core\n');
1473                             end
1474                         else
1475                             logtext(atmlab('OUT'), 'no collocations\n');
1476                             continue
1477                         end
1478                     end
1479                     M_cols_core = M_cols_core_here;
1480                     %hitno = hitno + 1;
1481                     % also read for all additionals
1482                     for j = 1:length(additionals)
1483                         [additionals_day{j}, addis_cols_here{j}, attr_addi] = additionals{j}.read_single_day(date, spec, additional_fields{j});
1484                         % verify that AssociatedDataset was generated for
1485                         % the same CollocatedDataset
1486                         self.verify_addi_granule_consistency(attr_core, attr_addi, additionals{j}, date);
1487                         if numel(additionals_day{j})>0
1488                             addis_cols{j} = addis_cols_here{j};
1489                         end
1490                     end
1491                     stoplater = false;
1492                     if merge
1493                         M_cols_merged_here = M_cols_core_here;
1494                         for j = 1:length(additionals)
1495                             [collocations_day, M_cols_merged_here] = additionals{j}.merge_matrix(collocations_day, M_cols_merged_here, additionals_day{j}, addis_cols_here{j});
1496                         end
1497                         %M_cols = M_cols_here;
1498                         if numel(collocations_day)==0
1499                             logtext(atmlab('OUT'), 'At least one of the additionals had 0 elements. Upon merging, no collocations left --- continuing with next day\n');
1500                             stoplater = true;
1501                         else
1502                             M_cols_merged = M_cols_merged_here;
1503                         end
1504                         M_cols = M_cols_merged;
1505                     else
1506                         M_cols = M_cols_core;
1507                     end
1508                 catch ME
1509                     switch (ME.identifier)
1510                         case {'MATLAB:load:couldNotReadFile', ...
1511                                 'MATLAB:nonExistentField', ...
1512                                 'MATLAB:gunzip:invalidFilename',...
1513                                 'MATLAB:netcdf:open:noSuchFile', ...
1514                                 'atmlab:exec_system_cmd:shell', ...
1515                                 'atmlab:find_granule_by_datetime'}
1516                             logtext(atmlab('ERR'), 'Problem for %04d-%02d-%02d: %s\n', ...
1517                                 date(1), date(2), date(3), ME.message);
1518                             continue
1519                         otherwise
1520                             ME.rethrow();
1521                     end
1522                 end
1523                 
1524                 if stoplater
1525                     continue
1526                 else
1527                     hitno = hitno + 1;
1528                 end
1529                 
1530                 %% apply limitations
1531                 if hitno == 1
1532                     % convert limits-structure to limits-matrix. This is
1533                     % done only after the first time I find collocations,
1534                     % because only then I know for sure what sizes the
1535                     % fields have.
1536                     limmat = limstruct2limmat(limits, M_cols);
1537                     for k = 1:length(filters_by_name)
1538                         filters_by_no{k} = {...
1539                             filters_by_name{k}{1}, ...
1540                             cellfun(@(s) M_cols.(s)', filters_by_name{k}{2}, 'UniformOutput', false), ...
1541                             filters_by_name{k}{3:end}};
1542                     end
1543                     
1544                 end
1545                 
1546                 lim = collocation_restrain(collocations_day, limmat, filters_by_no);
1547                 collocations_day = collocations_day(lim, :);
1548                 
1549                 %% add to total
1550                 if isempty(M) % should mean additionals are empty, too
1551                     M = collocations_day;
1552                     addis = additionals_day;
1553                 else
1554                     L = size(M, 1);
1555                     N = size(collocations_day, 1);
1556                     M((L+1):(L+N), :) = collocations_day;
1557                     logtext(atmlab('OUT'), '%d + %d = %d collocations so far\n', L, N, L+N);
1558                     for j = 1:length(additional_fields)
1559                         N_a = size(additionals_day{j}, 1);
1560                         if ~isempty(additionals_day{j})
1561                             addis{j}((L+1):(L+N_a), :) = additionals_day{j};
1562                         end
1563                     end
1564                 end
1565             end
1566             if hitno==0
1567                 warning(['atmlab:' mfilename], ...
1568                     'No collocations found at all. Do not trust column descriptions.');
1569             end
1570             
1571             varargout{1} = M;
1572             if nargout > 1
1573                 varargout{2} = M_cols;
1574                 if nargout > 2
1575                     varargout(3:5) = {addis, addis_cols, additionals};
1576                 end
1577             end
1578             
1579             % possibly cache result
1580             if ~isempty(self.pcd)
1581                 logtext(atmlab('OUT'), 'Storing result in cache\n');
1582                 try
1583                     self.pcd.set_entry(cachestr, varargout, ...
1584                         {self.name, date_start, date_end, spec, all_fields, limits, evalc('disp(filters_by_name)')}); % OUCH! :(
1585                 catch ME
1586                     switch ME.identifier
1587                         case 'atmlab:PersistentCachedData:noSpace'
1588                             logtext(atmlab('ERR'), 'Not storing, no space: %s', ME.message);
1589                         otherwise
1590                             ME.rethrow();
1591                     end
1592                 end
1593             end
1594             
1595             %%%
1596         end
1597         
1598         function varargout = list_fields(self)
1599             % return struct with valid fields in this dataset + associated
1600             %
1601             % If called without output arguments, writes to atmlab('OUT') a
1602             % pretty-printed list of all fields that are valid to pass on
1603             % to self.read. With one output argument, return a structure
1604             % with this information. No input arguments.
1605             %
1606             % WARNING: this lists the fields as defined in the core and
1607             % associated datasets.  It is possible that some collocations
1608             % were generated with older versions of those datasets.
1609             % Therefore, it is NOT guaranteed that all those fields can be
1610             % read for all granules!
1611             
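A brief usage sketch (hypothetical; `col` stands for any CollocatedDataset instance):

```matlab
% List readable fields for the collocated dataset 'col'.
col.list_fields();        % no output args: pretty-print to atmlab('OUT')
S = col.list_fields();    % one output arg: struct, one member per dataset
fields = S.(col.name);    % cell array of field names in the core dataset
```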
1612             S.(self.name) = list_fields@HomemadeDataset(self);
1613             for i = 1:length(self.associated)
1614                 S.(self.associated{i}.name) = self.associated{i}.list_fields();
1615             end
1616             switch nargout
1617                 case 0 % write it out nicely
1618                     mems = fieldnames(S);
1619                     for i = 1:length(mems)
1620                         fprintf(atmlab('OUT'), '%s:\n', mems{i});
1621                         fprintf(atmlab('OUT'), '    ');
1622                         for k = 1:length(S.(mems{i}))
1623                             fprintf(atmlab('OUT'), '%s ', S.(mems{i}){k});
1624                         end
1625                         fprintf(atmlab('OUT'), '\n\n');
1626                     end
1627                 case 1
1628                     varargout = {S};
1629                 otherwise
1630                     error(['atmlab:' mfilename], 'Too many output arguments');
1631             end
1632         end
1633         
1634         % low-level
1635         
1636         function collocations = collocate(self, t1, lat1, long1, t2, lat2, long2)
1637             % collocate Return collocations between matrices
1638             %
1639             % collocate searches for collocations between the measurements in (t1, lat1,
1640             % long1) and (t2, lat2, long2). Latitudes and longitudes should be in degrees,
1641             % and the time can be in any unit. 'collocate' considers footprints as
1642             % points and defines a collocation according to a maximum distance (in
1643             % km) and a maximum time separation, which should be in the same unit
1644             % as t1 and t2.
1645             %
1646             % The maximum distance and time are defined by the properties
1647             % <a href="matlab:help CollocatedDataset/distance">distance</a> and <a href="matlab:help CollocatedDataset/interval">interval</a>.
1648             %
1649             % This is a low-level method that is not usually called
1650             % directly. Rather call <a href="matlab:help CollocatedDataset/collocate_granule">collocate_granule</a> or
1651             % <a href="matlab:help CollocatedDataset/collocate_and_store_date_range">collocate_and_store_date_range</a> or so.
1652             %
1653             % FORMAT collocations = cd.collocate(t1, lat1, long1, t2, lat2, long2)
1654             %
1655             % OUT collocations  Nx4 matrix. Each row corresponds to a collocation and
1656             %             contains the scanline and position indices in the 1st and 2nd dataset, respectively.
1657             %
1658             % IN  t1      Array length L1 giving the times of the scanlines.
1659             % IN  lat1    Matrix L1xW1 with latitude for L1 scanlines with W1 scans per
1660             %             line. The latitude should be in degrees.
1661             % IN  long1   Matrix L1xW1 with longitude (-180, 180), L1 scans, W1 scans/line.
1662             % IN  t2      Array of length L2 giving the times of the scanlines.
1663             % IN  lat2    Matrix L2xW2 with latitude, L2 scans, W2 scans/line.
1664             % IN  long2   Matrix L2xW2 with longitude (-180, 180), L2 scans, W2 scans/line.
1665             %
1666             % Hint: if the collocations are slow, try tweaking <a href="matlab:help
1667             % CollocatedDataset/gridsize">gridsize</a>.
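A minimal synthetic call might look like this (a sketch; `cd` is assumed to be a CollocatedDataset whose `distance` and `interval` properties are already set, and times are taken to be in seconds):

```matlab
% Two tiny synthetic swaths: primary 3 scanlines x 2 positions,
% secondary 2 scanlines x 2 positions.
t1 = [0; 60; 120];                 % scanline times, seconds
lat1 = [0 0; 1 1; 2 2];            % degrees, in [-90, 90]
long1 = [10 11; 10 11; 10 11];     % degrees, in (-180, 180)
t2 = [30; 90];
lat2 = [0.5 0.5; 1.5 1.5];
long2 = [10 11; 10 11];
% Each row of the result indexes (scanline, position) in both datasets.
collocations = cd.collocate(t1, lat1, long1, t2, lat2, long2);
```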
1668 
1669             %% data checks
1670 
1671             nrows1 = size(lat1, 1);
1672             ncols1 = size(lat1, 2);
1673             nrows2 = size(lat2, 1);
1674             ncols2 = size(lat2, 2);
1675             nel1 = numel(lat1);
1676             nel2 = numel(lat2);
1677 
1678             errid = 'atmlab:collocate';
1679             % check dimensions
1680             assert(size(long1, 1)==nrows1, errid, 'length long1 differs from length lat1');
1681             assert(size(long2, 1)==nrows2, errid, 'length long2 differs from length lat2');
1682             assert(length(t1)==nrows1, errid, 'length t1 differs from length lat1');
1683             assert(length(t2)==nrows2, errid, 'length t2 differs from length lat2');
1684             assert(size(long1, 2)==ncols1, errid, 'width long1 differs from width lat1');
1685             assert(size(long2, 2)==ncols2, errid, 'width long2 differs from width lat2');
1686 
1687             % check correct numbers
1688             assert(max(abs(long1(:)))<=180, errid, 'Invalid data in long1');
1689             assert(max(abs(lat1(:)))<=90, errid, 'Invalid data in lat1');
1690             assert(max(abs(long2(:)))<=180, errid, 'Invalid data in long2');
1691             assert(max(abs(lat2(:)))<=90, errid, 'Invalid data in lat2');
1692             if length(lat1) > 1 && length(lat2) > 1
1693                 assert(max(lat1(:))~=min(lat1(:)), errid, 'lat1 constant');
1694                 assert(max(lat2(:))~=min(lat2(:)), errid, 'lat2 constant');
1695                 assert(max(long1(:))~=min(long1(:)), errid, 'lon1 constant');
1696                 assert(max(long2(:))~=min(long2(:)), errid, 'lon2 constant');
1697             end
1698             
1699             %% make t, rows, cols at the same grid --> no need for ind2sub
1700             
1701             t1f = reshape(repmat(t1, [1 ncols1]), [1 nel1]);
1702             rows1f = reshape(repmat((1:nrows1)', [1 ncols1]), [1 nel1]);
1703             cols1f = reshape(repmat(1:ncols1, [nrows1 1]), [1 nel1]);
1704 
1705             t2f = reshape(repmat(t2, [1 ncols2]), [1 nel2]);
1706             rows2f = reshape(repmat((1:nrows2)', [1 ncols2]), [1 nel2]);
1707             cols2f = reshape(repmat(1:ncols2, [nrows2 1]), [1 nel2]);
1708 
1709             % determine earth radius
1710 
1711             earth_radius = constants('EARTH_RADIUS_MEAN')/1e3; % m -> km
1712 
1713             %% find collocations
1714 
1715             % arbitrary start for pre-alloc, but getting more if needed later
1716             collocations = zeros(numel(lat1), 4, 'uint32');
1717 
1718             % Bin the measurements into bins, where the 'data' is the index of the
1719             % measurement. We only need to consider those bins where:
1720             % - instrument 1 has any measurements in the cell
1721             % - instrument 2 has any measurements in the cell or a nearby cell
1722             % "Nearby" cell means a neighbouring cell, except near the poles, where it
1723             % can be much further away (because the cells are equirectangular)
1724             
1725             % NB! New method as per 2013-03-04!
1726             logtext(atmlab('OUT'), 'Gridding %d + %d = %d points using %s method...\n', ...
1727                 numel(lat1), numel(lat2), numel(lat1)+numel(lat2), self.binning);
1728             if isequal(self.binning, 'classical')
1729                 [grid1, lat_grid1, ~] = binning_fast(...
1730                     struct(...
1731                     'lat', lat1(:), ...
1732                     'lon', long1(:), ...
1733                     'data', uint32(1:numel(lat1)).', ...
1734                     'gridsize', self.gridsize));
1735                 
1736                 grid2 = binning_fast(...
1737                     struct(...
1738                     'lat', lat2(:), ...
1739                     'lon', long2(:), ...
1740                     'data', uint32(1:numel(lat2)).', ...
1741                     'gridsize', self.gridsize));
1742                 
1743             elseif isequal(self.binning, 'experimental')
1744                 lat_grid1 = -90:self.gridsize:90;
1745                 lon_grid1 = -180:self.gridsize:180;
1746                 % cache result because same binning happens repeatedly if
1747                 % collocating one primary granule with many secondary
1748                 % granules
1749                 grid1 = self.cache.evaluate(1, @bin_nd, {lat1(:), long1(:)}, {lat_grid1, lon_grid1});
1750                 
1751                 lat_grid2 = lat_grid1;
1752                 lon_grid2 = lon_grid1;
1753                 grid2 = self.cache.evaluate(1, @bin_nd, {lat2(:), long2(:)}, {lat_grid2, lon_grid2});
1754                 % throw away exactly-polar values (should be extremely
1755                 % rare). FIXME: should add to last gridcell instead,
1756                 % preferably inside bin_nd, not implemented
1757                 grid1 = grid1(1:end-1, 1:end-1);
1758                 grid2 = grid2(1:end-1, 1:end-1);
1759             else
1760                 error(['atmlab:' mfilename ':unknownmethod'], ...
1761                     '%s.binning must be ''experimental'' or ''classical'', got ''%s''', ...
1762                     self.name, self.binning);
1763             end
1764             
1765             n_grid_lats = size(grid1, 1);
1766             n_grid_lons = size(grid1, 2);
1767 
1768             % Check each cell where grid1 has data AND grid2 has data in
1769             % the cell itself or in at least one neighbouring cell
1770 
1771             count = 0;
1772 
1773             % the width and height of the cells as a function of latitude
1774             cell_width = 2 * pi * earth_radius * cosd(lat_grid1) / 360;
1775             cell_height = 2 * pi * earth_radius / 360;
1776 
1777             % how many cells of longitude to check for particular latitude
1778             cells_in_lat_range = ceil(self.distance ./ cell_width);
1779             
1780             % very close to the poles, cells_in_lat_range may go to
1781             % infinity, but we never need more than the number of lons.
1782             cells_in_lat_range(cells_in_lat_range>n_grid_lons) = n_grid_lons;
1783             
1784             c_lon = ceil(self.distance ./ cell_height);
1785 
1786             logtext(atmlab('OUT'), 'Searching for collocations per grid-point...\n');
1787             for i = 1:size(grid1, 1) % latitude
1788                 c_lat = cells_in_lat_range(i);
1789                 for j = 1:size(grid1, 2) % longitude
1790                     if any(grid1{i, j})
1791                         
1792                         % find indices of points in this grid-cell
1793                         in_grid1 = grid1{i, j};
1794                         
1795                         % which range of grid-cells to look for collocations?
1796                         
1797                         % for longitude (cols), the reach depends on latitude (cells_in_lat_range)
1798                         cols = j-c_lat:j+c_lat;
1799                         cols = unique(mod(cols, n_grid_lons));
1800                         cols(cols==0) = n_grid_lons;
1801 
1802                         % for latitude (rows), the reach is constant
1803 
1804                         rows = i-c_lon:i+c_lon;
1805                         % edges do not wrap: if rows < 1
1806                         rows(rows<1) = [];
1807                         rows(rows>n_grid_lats) = [];
1808                         
1809                         % find indices of points in nearby grid-cells to consider
1810                         in_grid2 = grid2(rows, cols);
1811                         % flatten and convert to an array
1812                         in_grid2 = vertcat(in_grid2{:});
1813                         if isempty(in_grid2)
1814                             continue
1815                         end           
1816                         
1817                         % find combinations of points that are near each other
1818                         for p1 = in_grid1'
1819                             index1 = rows1f(p1);
1820                             colno1 = cols1f(p1);
1821                             shorttime = in_grid2(abs(t2f(in_grid2) - t1f(p1)) < self.interval);
1822                             near = shorttime(sphdist(lat1(p1), long1(p1), ...
1823                                 lat2(shorttime), long2(shorttime), ...
1824                                 earth_radius) < self.distance);
1825 
1826                             nnear = length(near);
1827                             if nnear
1828                                 index2 = rows2f(near)';
1829                                 colno2 = cols2f(near)';
1830                                 collocations(count+1:count+nnear, :) = ...
1831                                     [repmat(index1, [nnear 1]) repmat(colno1, [nnear 1]) ...
1832                                     index2 colno2];
1833                                 count = count + nnear;
1834                             end
1835                             if count > .8*size(collocations, 1) % pre-allocate more
1836                                 collocations = [collocations; ...
1837                                     zeros(size(collocations, 1), 4, 'uint32')]; %#ok<AGROW>
1838                             end
1839                         end
1840                     end
1841                 end % for all grid columns
1842             end % for all grid rows
1843 
1844             % removing 0's
1845             collocations(collocations(:, 1)==0, :)=[];
1846         end
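The neighbour search above bins both instruments' footprints into an equirectangular lat/lon grid, widens the longitude reach towards the poles, wraps columns around the date line, and only computes spherical distances within nearby cells. A minimal, hypothetical Python sketch of the same idea (not part of atmlab; `collocate`, `sphdist`, and all parameters here are illustrative names, and the time criterion is omitted):

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # roughly constants('EARTH_RADIUS_MEAN')/1e3

def sphdist(lat1, lon1, lat2, lon2, r=EARTH_RADIUS_KM):
    """Great-circle distance in km (haversine)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def collocate(points1, points2, max_dist_km, gridsize_deg=1.0):
    """Return (i, j) index pairs closer than max_dist_km."""
    n_lons = int(360 // gridsize_deg)

    def cell(lat, lon):
        return int((lat + 90) // gridsize_deg), int((lon + 180) // gridsize_deg)

    # bin the secondary points into an equirectangular grid
    grid2 = defaultdict(list)
    for j, (lat, lon) in enumerate(points2):
        grid2[cell(lat, lon)].append(j)

    km_per_cell = 2 * math.pi * EARTH_RADIUS_KM * gridsize_deg / 360
    reach_lat = int(math.ceil(max_dist_km / km_per_cell))  # constant in latitude
    pairs = []
    for i, (lat, lon) in enumerate(points1):
        ci, cj = cell(lat, lon)
        # longitude cells shrink towards the poles, so the reach grows there
        km_per_lon_cell = km_per_cell * max(math.cos(math.radians(lat)), 1e-9)
        reach_lon = min(int(math.ceil(max_dist_km / km_per_lon_cell)), n_lons)
        # unique, wrapped longitude cells, like unique(mod(cols, n_grid_lons))
        lon_cells = {(cj + dj) % n_lons for dj in range(-reach_lon, reach_lon + 1)}
        for di in range(-reach_lat, reach_lat + 1):  # rows do not wrap
            for cjj in lon_cells:
                for j in grid2.get((ci + di, cjj), []):
                    lat2, lon2 = points2[j]
                    if sphdist(lat, lon, lat2, lon2) < max_dist_km:
                        pairs.append((i, j))
    return pairs
```

The date-line wrap means a primary point at 179.9°E still finds a secondary point at 179.9°W, which a naive column range would miss.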
1847                         
1848         %% overload parent methods
1849 
1850         function line = granule_first_line(~, varargin)
1851             line = uint32(1); % collocated datasets do not contain duplicates
1852         end
1853 
1854     end
1855     
1856     methods (Access = {?SatDataset})
1857         function [fields, additionals, additional_fields, deal_index] = deal_fields(self, all_fields, ads)
1858             % deal fields over core, additionals, etc.
1859             %
1860             % FORMAT
1861             %
1862             %   [fields, additionals, additional_fields, deal_index] = ...
1863             %       deal_fields(all_fields)
1864             %
1865             % IN
1866             %
1867             %   all_fields  cell-array  all fields to be sorted
1868             %   ads         cell-array of associated datasets that must be
1869             %               a subset of self.associated
1870             %
1871             % OUT
1872             %
1873             %   fields              cell array  fields in core
1874             %   additionals         cell array  <a href="matlab:help
1875             %                       AssociatedDataset">AssociatedDataset</a>s
1876             %   additional_fields   cell array of cell array of fields
1877             %   deal_index          vector
1878             
1879             n_f = 0;
1880             n_add = zeros(size(self.associated));
1881             fields = {};
1882             additionals = {};
1883             additional_fields = {};
1884             existing_names = {};
1885             % pre-allocate so that every field is at least an empty
1886             % cell-array
1887             %additional_fields = cell(size(self.associated));
1888             %for i = 1:length(self.associated)
1889             %    additional_fields{i} = {};
1890             %end
1891             deal_index = nan*zeros(size(all_fields));
1892             for i = 1:length(all_fields)
1893                 field = all_fields{i};
1894                 if any(strcmp(fieldnames(self.members), field))
1895                     % belongs to the core
1896                     n_f = n_f + 1;
1897                     fields{n_f} = field; %#ok<AGROW>
1898                     deal_index(i) = 0;
1899                 else
1900                     % check if it belongs to any associated
1901                     found = false;
1902 %                    additionals = cell(size(ads));
1903 %                    existing_names = cell(size(ads));
1904                     for j = 1:length(ads)
1905                         asso = ads{j};
1906                         if any(strcmp(fieldnames(asso.members), field))
1907                             n_add(j) = n_add(j) + 1;
1908                             already_exists = cellfun(@(X) any(strcmp(field, X)), additional_fields);
1909                             if any(already_exists)
1910                                 error(['atmlab:' mfilename], ...
1911                                     ['More than one additional contains field ' ...
1912                                      field '. Found in: ' ...
1913                                      additionals{already_exists}.name ' and ' ...
1914                                      asso.name '.\n' ...
1915                                      'Now I don''t know where to grab it from. ' ...
1916                                      'Please pass subset of AssociatedDatasets on to CollocatedDataset.read (' ...
1917                                      'see <a href="matlab:help CollocatedDataset/read">help</a>)']);
1918                             end
1919                             
1920                             additional_fields{j}{n_add(j)} = field;  %#ok<AGROW>
1921                             % check if this additionaldataset is already
1922                             % known, if yes, find index in existing names
1923                             %                           existing_names = cellfun(@(X) X.name, additionals, 'UniformOutput', false);
1924                             if ~any(strcmp(existing_names, asso.name))
1925                                 %                               additionals = {additionals{:} asso}; %#ok<CCAT>
1926                                 additionals{j} = asso;
1927                                 existing_names{j} = asso.name;
1928                             end
1929                             samename = strcmp(existing_names, asso.name);
1930 %                            assert(sum(samename)<=1, ['atmlab:' mfilename], ...
1931 %                                ['More than one additional contains field ' ...
1932 %                                asso.name]);
1933                             deal_index(i) = find(samename);
1934                             found = true;
1935                         end
1936                     end
1937                     if ~found
1938                         error(['atmlab:' mfilename], ...
1939                             'Field %s was not found in core or in any additional dataset', ...
1940                             field);
1941                     end
1942                 end
1943             end
1944             % don't have empty things
1945             additional_fields = additional_fields(~cellfun(@isempty, additional_fields));
1946             additionals = additionals(~cellfun(@isempty, additionals));
1947             assert(length(additionals)==length(additional_fields), ...
1948                 ['atmlab:' mfilename ':BUG'], ...
1949                 'length(additionals) == %d != length(additional_fields) == %d, bug?', ...
1950                 length(additionals), length(additional_fields));
1951 %             assert(logical(exist('fields','var')),['atmlab:' mfilename ':notImplemented'],...
1952 %                 'No fields from the core dataaset are listed in all_fields')
1953             
1954         end
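The dealing logic above assigns each requested field to the core or to exactly one associated dataset, erroring on ambiguity or absence. A hypothetical Python sketch of that contract (illustrative names only; the real method also tracks per-dataset field lists):

```python
def deal_fields(all_fields, core_fields, associated):
    """associated: list of (name, field_set) pairs.
    Returns deal_index: 0 for core, k for the k-th associated dataset."""
    deal_index = []
    for field in all_fields:
        if field in core_fields:
            deal_index.append(0)  # belongs to the core
            continue
        # which associated datasets claim this field?
        owners = [k for k, (_, flds) in enumerate(associated, start=1)
                  if field in flds]
        if len(owners) > 1:
            raise ValueError(f"more than one additional contains field {field!r}")
        if not owners:
            raise ValueError(f"field {field!r} not found in core "
                             "or in any additional dataset")
        deal_index.append(owners[0])
    return deal_index
```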
1955         
1956         function cleanup_log(self)
1957             if ~self.success
1958                 logtext(self.log, 'Collocations ended prematurely\n');
1959             end
1960             fclose(self.log);
1961         end
1962         
1963         function verify_addis(self, addis)
1964             % verify that all those additionals are actually known to me
1965             
1966             for i = 1:length(addis)
1967                 addi = addis{i};
1968                 knownnames = cellfun(@(x) x.name, self.associated, 'UniformOutput', false);
1969                 if ~any(strcmp(addi.name, knownnames))
1970                     
1971                     error(['atmlab:' mfilename ':invalid'], ...
1972                         ['Hi!  This is %s %s speaking. ' ...
1973                         ' You asked to get %s %s as an AssociatedDataset, but you can only pass' ...
1974                         ' AssociatedDatasets that were created in my honour. ' ...
1975                         ' %s %s was created for %s %s. ' ...
1976                         ' As far as I know, mine are: %s'], ...
1977                         class(self), self.name, ...
1978                         class(addi), addi.name, ...
1979                         class(addi), addi.name, ...
1980                         class(addi.parent), addi.parent.name, ...
1981                         strjoin(knownnames, ', '));
1982                 end
1983             end
1984         end
1985         
1986         function verify_addi_granule_consistency(self, attr_core, attr_addi, addi, date)
1987             % verify consistency between addi granule and core granule
1988             %
1989             % Note that this is different from verify_addis. Verify_addis
1990             % just checks the AssociatedDataset objects, whereas this one
1991             % verifies that the actual AssociatedDataset granule is
1992             % consistent with a CollocatedDataset granule
1993             if ~strcmp(attr_core.id, attr_addi.parent_id)
1994                 error(['atmlab:' mfilename ':inconsistent'], ...
1995                     ['Upon reading %s %s with %s %s for date %s, ' ...
1996                     'found mismatch. %s %s was generated for ' ...
1997                     '%s %s with id ''%s'', but %s %s has id ''%s'''], ...
1998                     class(self), self.name, ...
1999                     class(addi), addi.name, ...
2000                     datestr(datenum(date)), ...
2001                     class(addi), addi.name, ...
2002                     class(self), self.name, ...
2003                     attr_addi.parent_id, ...
2004                     class(self), self.name, ...
2005                     attr_core.id);
2006             end
2007         end
2008         
2009     end
2010     
2011     methods (Access = {?SatDataset})
2012         function add_associated(self, ad)
2013             % Add an associated dataset INTERNAL USE ONLY
2014             %
2015             % FORMAT
2016             %
2017             %   cd.add_associated(ad);
2018             
2019             self.associated = {self.associated{:} ad}; %#ok<CCAT>
2020         end
2021         
2022         %% overload parent methods
2023         
2024         function [data, attr] = read_homemade_granule(self, fullpath, fields)
2025             core_fields = {'LAT1', 'LON1', 'TIME1'};
2026             
2027             errID = ['atmlab:' mfilename ':error'];
2028             % core
2029             [data, attr] = read_homemade_granule@HomemadeDataset(self, fullpath, core_fields);
2030             
2031             [~, additionals1, additional_fields1] = self.deal_fields(fields, self.associated);
2032             info = self.find_info_from_granule(fullpath);
2033             for i = 1:length(additionals1) % this is where the AssociatedDatasets are
2034                 tmppath = additionals1{i}.find_granule_by_datetime(...
2035                     [str2double(info.year),str2double(info.month),str2double(info.day)],...
2036                     info.satname);
2037                 [tmpS, tmpA] = additionals1{i}.read_homemade_granule(tmppath, additional_fields1{i});
2038                 self.verify_addi_granule_consistency(attr, tmpA, additionals1{i}, [str2double(info.year),str2double(info.month),str2double(info.day)]);
2039                 attr.(additionals1{i}.name) = tmpA;
2040                 tmpS.(['path_' additionals1{i}.name]) = tmpS.path;
2041                 data = additionals1{i}.merge_struct(data,rmfield(tmpS,{'path','epoch'}));
2042             end
2043             assert(all(ismember([core_fields(:);fields(:)],fieldnames(data))),errID,'Some field(s) is(are) missing')
2044             
2045             % collocations have time in unixsecs, but datasets have time in
2046             % seconds since start of day, which happens to equal data.epoch
2047             data.time = data.TIME1 - data.epoch;
2048             data.lat = data.LAT1;
2049             data.lon = data.LON1;
2050             data.version = ['COL(' attr.primary_version ', ' attr.secondary_version ')'];
2051             
2052         end
2053         
2054     end
2055     
2056     methods (Access = protected)
2057         %% implement new methods
2058         
2059         function M = process(self, collocations, data1, date1, spec1, data2, date2, spec2)
2060             % Process core collocations
2061             %
2062             % This method processes the output of <a href="matlab:help collocate">collocate</a>
2063             % and converts it to a matrix that is passed on for further
2064             % processing.
2065             %
2066             % This is an internal method not normally called by the end
2067             % user.
2068             %
2069             % FORMAT
2070             %
2071             %    M = cd.process(collocs, data1, date1, spec1, ...
2072             %                            data2, date2, spec2);
2073             %
2074             % IN
2075             %
2076             %   collocs     matrix      as output by <a href="matlab:help collocate">collocate</a>
2077             %   data1       struct      as returned by primary <a href="matlab:help SatDataset/reader">reader</a>
2078             %   date1       datevec     datevec for primary
2079             %   spec1       various     spec for primary (sat, sat pair, ...)
2080             %   data2       struct      as returned by secondary <a href="matlab:help SatDataset/reader">reader</a>
2081             %   date2       datevec     datevec for secondary
2082             %   spec2       various     spec for secondary
2083             %
2084             % OUT
2085             %
2086             %   M           matrix      collocations, columns described by
2087             %                           <a href="matlab:help CollocatedDataset/cols">self.cols</a>
2088             
2089             % NOTE: time is assumed to be 1D
2090             
2091             n = size(collocations, 1);
2092             X = mat2cell(collocations, size(collocations, 1), [1 1 1 1]);
2093             
2094             c = self.cols;
2095             nfields = length(fieldnames(c));
2096             M = nan*zeros(n, nfields, self.mattype);
2097             [line1, pos1, line2, pos2] = X{:};
2098             
2099             % index for direct addressing
2100             i1 = sub2ind(size(data1.lat), line1, pos1);
2101             i2 = sub2ind(size(data2.lat), line2, pos2);
2102             
2103             M(:, c.LINE1) = line1;
2104             M(:, c.LINE2) = line2;
2105             
2106             M(:, c.POS1) = pos1;
2107             M(:, c.POS2) = pos2;
2108             
2109             M(:, c.LAT1) = data1.lat(i1);
2110             M(:, c.LON1) = data1.lon(i1);
2111             
2112             M(:, c.LAT2) = data2.lat(i2);
2113             M(:, c.LON2) = data2.lon(i2);
2114             
2115             M(:, c.START1) = self.primary.get_starttime(date1);
2116             M(:, c.START2) = self.secondary.get_starttime(date2);
2117             
2118             M(:, c.TIME1) = double(data1.epoch) + double(data1.time(line1));
2119             M(:, c.TIME2) = double(data2.epoch) + double(data2.time(line2));
2120             
2121             M(:, c.DIST) = sphdist(M(:, c.LAT1), M(:, c.LON1), ...
2122                                    M(:, c.LAT2), M(:, c.LON2), ...
2123                                    constants('EARTH_RADIUS')/1e3);
2124             
2125             M(:, c.INT) = M(:, c.TIME2) - M(:, c.TIME1);
2126 
2127             %% sort, needed for averaging stuff later
2128 
2129             M = sortrows(M, [c.START1 c.LINE1 c.POS1 c.START2 c.LINE2 c.POS2]);
2130             
2131             
2132             %% remove duplicates
2133             
2134             first1 = self.primary.granule_first_line(date1, spec1);
2135             first2 = self.secondary.granule_first_line(date2, spec2);
2136             wrongrows = (M(:, c.LINE1) < first1) | ...
2137                 (M(:, c.LINE2) < first2);
2138             logtext(atmlab('OUT'), ['Removing %d scanlines primary before %d, ', ...
2139                 'or secondary before %d\n'], ...
2140                 sum(wrongrows), first1, first2);
2141             M(wrongrows, :) = [];
2142             
2143             % add row-number after all else, here, but REDO just before
2144             % storing!
2145             if ~isempty(M)
2146                 M(:, c.INDEX) = 1:size(M, 1);
2147             end
2148         end
2149         
2150         function nm = calculate_name(self)
2151             nm = [class(self) '__1_' self.primary.name '__2_' self.secondary.name];
2152         end
2153         
2154         function s = calc_cache_key(self, date_start, date_end, spec, all_fields, limits, filters_by_name, ads, no)
2155             % calculate a unique cache-key based on inputs
2156             
2157             % first generate an array of int8
2158             if isempty(spec)
2159                 sp = uint8(0);
2160             elseif ischar(spec)
2161                 sp = uint8(spec);
2162             else
2163                 sp = cellfun(@uint8, spec, 'UniformOutput', false);
2164                 sp = [sp{:}];
2165             end
2166             af = cellfun(@uint8, all_fields, 'UniformOutput', false);
2167             af = [af{:}];
2168             ff = cellfun(@(x) uint8([func2str(x{1}) horzcat(x{2}{:})]), filters_by_name, 'UniformOutput', false);
2169             ff = [ff{:}];
2170             ads = cellfun(@(x) uint8(x.name), ads, 'UniformOutput', false);
2171             ads = [ads{:}];
2172             D = [uint8(self.name), ...
2173                  typecast(uint16(date_start), 'uint8'), ...
2174                  typecast(uint16(date_end), 'uint8'), ...
2175                  sp, af, ff, ads, ...
2176                  uint8(struct2string_compact(limits)), ...
2177                  uint8(no)];
2178                         
2179             md = java.security.MessageDigest.getInstance('md5');
2180             md.update(D);
2181             digest = md.digest;
2182             s = genvarname(sprintf('%02x', typecast(digest, 'uint8')));
2183 
2184         end
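The cache key above is built by serialising every input that influences the result to bytes, MD5-hashing them, and turning the digest into a valid identifier. A hypothetical Python sketch of the same pattern (illustrative names; the real method also folds in spec, filters, and associated-dataset names):

```python
import hashlib

def calc_cache_key(name, date_start, date_end, fields, limits):
    """Hash every input that influences the result into a stable key."""
    md = hashlib.md5()
    md.update(name.encode())
    for date in (date_start, date_end):
        for part in date:
            # like typecast(uint16(...), 'uint8') in the method above
            md.update(int(part).to_bytes(2, "little"))
    for field in fields:
        md.update(field.encode())
    # a compact, deterministic serialisation of the limits struct
    md.update(repr(sorted(limits.items())).encode())
    # prefix a letter so the key is a valid identifier, like genvarname
    return "x" + md.hexdigest()
```

The key is deterministic for equal inputs and changes whenever any input changes, which is what makes it safe to use as a cache index.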
2185          
2186     end
2187     
2188     methods (Access = private)
2189          function [addi, cc] = fill_addi(self, result, date, spec, spec1, spec2, additionals, i, addi, info_addi, cc, addisprocess, addisvalid)
2190             % fill up additional no. i.
2191             logtext(atmlab('OUT'), 'Collecting additionals for %s %s\n', class(additionals{i}), additionals{i}.name);
2192             [depdata, depcols] = self.fix_dependencies(additionals{i}, additionals, addi, cc);
2193             
2194             %logtext(atmlab('OUT'), 'Fields to process: %s\n', strjoin(vec2row(addisprocess{i}), ', '));
2195             if isequal(addisprocess{i}, 'all')
2196                 logtext(atmlab('OUT'), 'Processing all fields\n');
2197             else
2198                 logtext(atmlab('OUT'), 'Fields to process: %s\n', strjoin(addisprocess{i}(:)', ', ')); % much quicker and to avoid "Warning: Cannot convert zero-sized vector to row "
2199             end
2200             [addii, cci] = additionals{i}.process_delayed(result, spec1, spec2, depdata, cc, addisprocess{i});
2201             curstate = additionals{i}.overwrite;
2202             if ~addisvalid(i)
2203                 additionals{i}.overwrite = 1;
2204             end
2205             additionals{i}.store(date, spec, addii, info_addi, cci);
2206             additionals{i}.overwrite = curstate; %#ok<NASGU> M-Lint is simply wrong here.
2207             if isempty(addi{i})
2208                 addi{i} = addii;
2209                 cc{i} = cci;
2210             else
2211                 addi{i} = [addi{i}, addii];
2212                 cc{i} = catstruct(cc{i}, structfun(@(X)X+max(cellfun(@(X)X, struct2cell(cc{i}))), cci, 'UniformOutput', false));
2213             end
2214             
2215          end
2216     end
2217    
2218     methods (Static, Access = private)
2219         function sorted = sort_additionals(additionals)
2220             % sort additionals per dependencies
2221             %
2222             % Taking into account dependencies, sort the <a href="matlab:help AdditionalDataset">AdditionalDataset</a>s
2223             % in an appropriate order.
2224             %
2225             % FORMAT
2226             %
2227             %   sorted_addis = cd.sort_additionals(unsorted_addis)
2228             %
2229             % IN
2230             %
2231             %   cell array of <a href="matlab:help AdditionalDataset">AdditionalDataset</a>s
2232             %
2233             % OUT
2234             %
2235             %   cell array of <a href="matlab:help AdditionalDataset">AdditionalDataset</a>s
2236             oldidx = zeros(size(additionals));
2237             for k = 1:100
2238                 for i = 1:length(additionals)
2239                     if ~isempty(additionals{i}.dependencies)
2240                         additionals{i}.priority = max(cellfun(@(x) x.priority, additionals{i}.dependencies))+1;
2241                     else
2242                         additionals{i}.priority = 0;
2243                     end
2244                 end
2245                 [~, idx] = sort(cellfun(@(x) x.priority, additionals));
2246                 if isequal(idx, oldidx)
2247                     break
2248                 end
2249                 oldidx = idx;
2250             end
2251             if k==100 && ~isequal(idx, oldidx)
2252                 error(['atmlab:' mfilename], ...
2253                     'Unable to sort additionals. This might indicate a circular dependency --- or a bug');
2254             end
2255             sorted = additionals(idx);
2256         end
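The sort above repeatedly raises each dataset's priority above the maximum of its dependencies' priorities until nothing changes, so dependencies always sort first; failure to converge indicates a cycle. A hypothetical Python sketch of that fixed-point iteration (illustrative names, working on plain name strings instead of AssociatedDataset objects):

```python
def sort_by_dependencies(names, deps, max_iter=100):
    """names: list of dataset names; deps: name -> list of dependency names.
    Iterate priorities until stable, so each dataset sorts after its deps."""
    priority = dict.fromkeys(names, 0)
    for _ in range(max_iter):
        changed = False
        for name in names:
            if deps.get(name):
                p = max(priority[d] for d in deps[name]) + 1
                if p != priority[name]:
                    priority[name] = p
                    changed = True
        if not changed:  # fixed point reached: order is consistent
            return sorted(names, key=priority.__getitem__)
    raise RuntimeError("unable to sort: circular dependency?")
```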
2257         
2258         function [depdata, depcols] = fix_dependencies(additional_current, additional_all, results_so_far, varargin)
2259             % put data in cell array for dependencies
2260             %
2261             % This static method assures that requirements for additional X
2262             % (first argument) are met, given other additionals (second
2263             % argument) and results so far (third argument).
2264             %
2265             % FORMAT
2266             %
2267             %   depdata = cd.fix_dependencies(this_additional, all_additionals, result_so_far)
2268             %
2269             % IN
2270             %
2271             %   this_additional   <a href="matlab:help AdditionalDataset">AdditionalDataset</a> currently being worked on
2272             %   all_additionanls  cell array of all <a href="matlab:help AdditionalDataset">AdditionalDataset</a>s
2272             %   all_additionals   cell array of all <a href="matlab:help AdditionalDataset">AdditionalDataset</a>s
2273             %   result_so_far     cell array with results of <a href="matlab:help AdditionalDataset">AdditionalDataset</a>s
2274             %                     already processed
2275             %   cc                cell array with cols structures
2276             %                     describing results_so_far
2277             %
2278             % OUT
2279             %
2280             %   depdata           cell array with results to be passed on
2281             %                     to the <a href="matlab:help AdditionalDataset">AdditionalDataset</a>
2282             %                     under consideration
2283             %
2284             %   depcols           cell array with cols structures belonging
2285             %                     to depdata
2286             
2287             cc = optargs(varargin, {cellfun(@(X) X.cols, additional_current.dependencies, 'UniformOutput', false)});
2288             if isempty(additional_current.dependencies)
2289                 depdata = {};
2290                 depcols = struct();
2291             else
2292                 % for each dependency, find out what data to put in
2293                 % depdata
2294                 depdata = cell(size(additional_current.dependencies));
2295                 depcols = cell(size(depdata));
2296                 for di = 1:length(additional_current.dependencies)
2297                     depcy = additional_current.dependencies{di};
2298                     % dependency should match exactly one
2299                     nm_match = cellfun(@(X) strcmp(X.name, depcy.name), additional_all);
2300                     assert(sum(nm_match)==1, ['atmlab:' mfilename], ...
2301                         'Expected one name match. Got %d.', sum(nm_match));
2302                     % if the following line fails, something likely
2303                     % went wrong when sorting the additionals
2304                     % according to their dependencies
2305                     depdata{di} = results_so_far{nm_match};
2306                     depcols{di} = cc{nm_match};
2307                 end
2308             end
2309         end
2310         
2311         function [addisread, addisprocess] = meet_dependencies(additionals, addisread, addisprocess, addishas)
2312             % adapt addisread, addisprocess so that dependencies read at
2313             % least the fields they should.
2314             
2315 
2316             % FIXME: should consider the fields that are wanted
2317             for i = 1:length(additionals)
2318                 if isempty(addisprocess{i})
2319                     continue
2320                 end
2321                 %addi = additionals{i};
2322                 for j = 1:length(additionals{i}.dependencies)
2323                     met = false;
2324                     for k = 1:(i-1)
2325                         if isequal(additionals{k}.name, additionals{i}.dependencies{j}.name)
2326                             met = true;
2327                             needs = additionals{i}.fields_needed_for_dependency(addisprocess{i}, additionals{i}.dependencies{j});
2328                             % make sure it reads what it has to read
2329                             
2330                             % FIXME 2013-06-11: should I be reading the
2331                             % netcdf header here to see what is already
2332                             % present?  Changing to simply put in
2333                             % addisread{k} what it needs.  What if a
2334                             % dependency is not yet there, but still to be
2335                             % processed?  Will this work out?
2336                             
2337 %                             has_needed = ismember(needs, addishas{k});
2338 %                             proc_needed = ismember(needs, addisprocess{k});
2339 %                             if ~all(has_needed | proc_needed)
2340 %                                 error(['atmlab:' mfilename ':missing'], ...
2341 %                                     '%s %s needs fields %s from %s %s, but those seem absent', ...
2342 %                                     class(additionals{i}), additionals{i}.name, ...
2343 %                                     strjoin(vec2row(needs), ', '), ...
2344 %                                     class(additionals{k}), additionals{k}.name);
2345 %                             end
2346                             %addisread{k} = union(addisread{k}, needs(has_needed));
2347                             addisread{k} = union(addisread{k}, needs);
2348 %                             addisprocess{k} = union(addisprocess{k}, needs(~has_needed));
2349                         end
2350                     end
2351                     if ~met
2352                         error(['atmlab:' mfilename ':unsatisfied_dependency'], ...
2353                             '%s %s depends on %s %s, but the latter was not requested :(', ...
2354                             class(additionals{i}), additionals{i}.name, ...
2355                             class(additionals{i}.dependencies{j}), additionals{i}.dependencies{j}.name);
2356                     end
2357                 end
2358             end
2359         end
2360         
2361     end
2362 end

Generated on Mon 15-Sep-2014 13:31:28 by m2html © 2005