Dataset

class typhon.datasets.dataset.Dataset(**kwargs)[source]

Represents a dataset.

This is an abstract class. More specific subclasses are SingleFileDataset and MultiFileDataset.

To add a dataset, subclass one of the subclasses of Dataset, such as MultiFileDataset, and implement the abstract methods.

Dataset objects have a limited number of attributes. To limit the occurrence of bugs, dynamically setting attributes that do not already exist is restricted. Attributes can be set either by passing keyword arguments when creating the object, or by setting the appropriate field in your typhon configuration file (such as .typhonrc). The configuration section corresponds to the object name, the key to the attribute, and the value to the value assigned to the attribute. See also typhon.config.
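The two ideas in the paragraph above (refusing attributes that do not already exist, and mapping a configuration section to attribute values) can be sketched in plain Python. This is only an illustration under stated assumptions: the class RestrictedAttributes, the section name demo, and the configuration text are hypothetical, and the sketch does not reproduce how typhon itself implements either mechanism or converts configuration values.

```python
import configparser


class RestrictedAttributes:
    """Hypothetical stand-in for illustration; not typhon's Dataset."""

    # Attributes must be declared here before they can be set.
    name = None
    start_date = None
    end_date = None

    def __init__(self, **kwargs):
        # Every keyword argument becomes an attribute.
        for key, value in kwargs.items():
            setattr(self, key, value)

    def __setattr__(self, key, value):
        # Refuse attributes that the class does not already define.
        if not hasattr(type(self), key):
            raise AttributeError(f"unknown attribute: {key!r}")
        super().__setattr__(key, value)


# Hypothetical configuration text: section = object name, key = attribute.
config_text = """
[demo]
start_date = 2010-01-01
"""
parser = configparser.ConfigParser()
parser.read_string(config_text)

# Values read this way are plain strings; any type conversion is left out.
settings = dict(parser["demo"])
ds = RestrictedAttributes(name="demo", **settings)

try:
    ds.start_dat = "2011-01-01"  # misspelled attribute name
except AttributeError as exc:
    error_message = str(exc)
```

In this sketch a misspelled attribute name raises an AttributeError instead of silently creating a new attribute, which is the kind of bug the restriction is meant to catch.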

start_date

datetime.datetime or numpy.datetime64 – Starting date for dataset. May be used to search through ALL granules. WARNING! If this is set at a time t_0 before the actual first measurement t_1, then the collocation algorithm (see CollocatedDataset) will conclude that there are 0 collocations in [t_0, t_1], and will not realise if data in [t_0, t_1] are actually added later!

end_date

datetime.datetime or numpy.datetime64 – Similar to start_date, but for ending.

name

str – Name for the dataset. Used to make sure that only a single dataset object exists for any particular name. If a dataset is initiated with a pre-existing name, the existing object is returned (see __init__).

aliases

Mapping[str, str] – Aliases for fields. This dictionary can be useful if you want to programmatically loop through the same field for many different datasets, but the field is named differently in each. For example, an alias could be “ch4_profile”.

unique_fields

Container[str] – Set of fields that make any individual measurement unique. For example, the default value is {“time”, “lat”, “lon”}.

related

Mapping[str, Dataset] – Dictionary whose keys may refer to other datasets with related information, such as DMPs or flags.
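As a rough illustration of how the aliases and unique_fields attributes described above might be used, the following sketch works on plain dictionaries rather than on real Dataset objects. The native field names (CH4_VMR, methane), the direction of the mapping (alias to native name), the helper resolve, and the sample records are all assumptions made for the example.

```python
# Hypothetical alias mappings for two imaginary datasets: the shared alias
# "ch4_profile" points at a differently named native field in each one.
aliases_a = {"ch4_profile": "CH4_VMR"}
aliases_b = {"ch4_profile": "methane"}


def resolve(field, aliases):
    """Return the native field name for an alias, or the name unchanged."""
    return aliases.get(field, field)


# The same alias can be looped over for every dataset.
native_names = [resolve("ch4_profile", a) for a in (aliases_a, aliases_b)]

# Default documented above; sorted so the key order is deterministic.
unique_fields = {"time", "lat", "lon"}
key_fields = sorted(unique_fields)

# Made-up measurements; the second one duplicates the first.
records = [
    {"time": 0, "lat": 10.0, "lon": 20.0, "value": 1.5},
    {"time": 0, "lat": 10.0, "lon": 20.0, "value": 1.5},
    {"time": 1, "lat": 10.0, "lon": 20.0, "value": 1.7},
]

# Two records count as the same measurement when all unique fields agree.
seen = set()
unique_records = []
for rec in records:
    key = tuple(rec[f] for f in key_fields)
    if key not in seen:
        seen.add(key)
        unique_records.append(rec)
```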

__init__(**kwargs)[source]

Initialise a Dataset object.

All keyword arguments will be translated into attributes. Does not take positional arguments.

Note that if you create a dataset with a name that already exists, the existing object is returned, but __init__ is still called (Python does this, see https://docs.python.org/3.5/reference/datamodel.html#object.__new__).
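The behaviour noted above follows from how Python constructs objects: when __new__ returns an instance of the class, __init__ is run on it, whether or not the instance is new. The sketch below shows this with a hypothetical NamedSingleton class; it is not typhon's implementation, and the registry dictionary and the maxsize keyword are used only for illustration.

```python
class NamedSingleton:
    """Hypothetical class keeping one instance per name; not typhon code."""

    _instances = {}

    def __new__(cls, name=None, **kwargs):
        # Return the existing object if this name was used before.
        if name in cls._instances:
            return cls._instances[name]
        obj = super().__new__(cls)
        cls._instances[name] = obj
        return obj

    def __init__(self, name=None, **kwargs):
        # Python calls __init__ on whatever __new__ returned, so this
        # also runs for an object that already existed.
        self.name = name
        self.init_count = getattr(self, "init_count", 0) + 1
        self.options = kwargs


first = NamedSingleton(name="demo", maxsize=1)
second = NamedSingleton(name="demo", maxsize=2)

same_object = first is second      # the existing object was returned
init_calls = first.init_count      # __init__ ran once per call
latest_options = first.options     # the second call overwrote the first
```

One consequence in this sketch is that keyword arguments passed on the second call replace values set by the first, because __init__ runs again on the shared object.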

Methods

__init__(**kwargs) Initialise a Dataset object.
as_xarray_dataset()
combine(my_data, other_obj[, other_data, …]) Combine with data from other dataset.
find_granules([start, end, include_last_before]) Loop through all granules for indicated period.
find_granules_sorted([start, end]) Yield all granules sorted by starting time then ending time.
find_most_recent_granule_before(instant, …) Find granule covering instant.
get_additional_field(M, fld) Get additional field.
read([f, fields, pseudo_fields]) Read granule in file and do some other fixes.
read_period([start, end, onerror, fields, …]) Read all granules between start and end, in bulk.
setlocal() Set local attributes, from config or otherwise.

Attributes

aliases
concat_coor
default_orbit_filters
end_date
mandatory_fields
maxsize
my_pseudo_fields
name
read_returns
related
section
start_date
time_field
unique_fields