icollect
- FileSet.icollect(start=None, end=None, files=None, **kwargs)
Load all files between two dates, sorted by their starting time.

Does the same as collect() but works as a generator. Instead of loading all files at the same time, it loads them in chunks (the chunk size is defined by max_workers). Hence, this method consumes less memory than collect() but is slower. Simple hint: use this in for-loops; if you need all files at once, use collect() instead.

- Parameters:
  start – The same as in find().
  end – The same as in find().
  files – If you already have a list of files that you want to process, pass it here. The list can contain filenames or lists (bundles) of filenames. If this parameter is given, start and end must not be set.
  **kwargs – Additional keyword arguments that are allowed for imap(). Some might be overwritten by this method.
- Yields:
A tuple of the FileInfo object of a file and its content. The tuples are yielded sorted by file starting time.
Examples:
    ## Perfect for iterating over many files.
    for content in fileset.icollect("2018-01-01", "2018-01-02"):
        # do something with file and content...
        pass

    ## If you want to have all files at once, do not use this:
    data_list = list(fileset.icollect("2018-01-01", "2018-01-02"))

    # This version is faster:
    data_list = fileset.collect("2018-01-01", "2018-01-02")
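The difference between chunked, lazy loading and loading everything at once can be illustrated without the library itself. The following is only a minimal sketch under stated assumptions: `load_file`, `collect_all` and `icollect_chunked` are hypothetical stand-ins written for this illustration, not the actual implementation of collect() or icollect(), and the filenames are made up.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def load_file(name):
    """Hypothetical stand-in for reading one file; returns fake content."""
    return f"content of {name}"


def collect_all(names, max_workers=2):
    """Eager variant: load every file, then return the complete list."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(zip(names, pool.map(load_file, names)))


def icollect_chunked(names, max_workers=2):
    """Lazy variant: load max_workers files at a time and yield them in order."""
    names = iter(names)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while True:
            chunk = list(islice(names, max_workers))
            if not chunk:
                break
            # Only the files of the current chunk are loaded at this point.
            yield from zip(chunk, pool.map(load_file, chunk))


# Made-up filenames; sorting them stands in for sorting by starting time.
names = sorted(["2018-01-01T06.nc", "2018-01-01T00.nc", "2018-01-01T12.nc"])

for name, content in icollect_chunked(names):
    print(name, content)
```

In this sketch the generator never holds more than one chunk of results, while the eager variant builds the full list before returning. That mirrors the trade-off described above: lower memory use in for-loops, at the cost of some speed when all files are needed at once.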