Collocations

class typhon.collocations.Collocations(*args, reference=None, read_mode=None, collapser=None, **kwargs)[source]

Class for finding and storing collocations between FileSet objects

If you want to find collocations between Arrays, use collocate() instead.

__init__(*args, reference=None, read_mode=None, collapser=None, **kwargs)[source]

Initialize a Collocations object

This FileSet

Parameters:
  • *args – Positional arguments for FileSet.

  • read_mode – How the collocations are processed after they are collected: collapse (default), expand, or compact.

  • reference – If read_mode is collapse, the name of the dataset onto which the others should be collapsed. Default is the primary dataset.

  • collapser – If read_mode is collapse, a dictionary of additional collapser functions.

  • **kwargs – Keyword arguments for FileSet.
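The difference between the collapse and expand read modes, and the role of the collapser dictionary, can be illustrated with a minimal pure-Python sketch. It does not use typhon and is not the library's implementation. The pair list, the field name, the helper functions expand and collapse, and the built-in mean and number reducers are all hypothetical, chosen only to show one plausible reading of the parameters above. The compact mode is not sketched because this page does not describe it.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical collocation result: each pair links the index of one
# primary data point to the index of one secondary data point.
pairs = [(0, 0), (0, 1), (1, 2), (2, 3), (2, 4), (2, 5)]
primary = {"temperature": [280.0, 285.0, 290.0]}
secondary = {"temperature": [279.0, 281.0, 286.0, 289.0, 290.0, 294.0]}


def expand(pairs, primary, secondary, field):
    """Return one row per collocation pair; primary values repeat as needed."""
    return [(primary[field][i], secondary[field][j]) for i, j in pairs]


def collapse(pairs, primary, secondary, field, collapser=None):
    """Return one row per reference point (assumed here to be the primary).

    All secondary values matched to the same reference point are reduced
    by every collapser function. The built-in reducers below are an
    assumption for this sketch; extra ones are merged in from `collapser`.
    """
    functions = {"mean": mean, "number": len}
    functions.update(collapser or {})

    # Group the secondary values by the reference point they belong to.
    groups = defaultdict(list)
    for i, j in pairs:
        groups[i].append(secondary[field][j])

    return [
        {
            "reference": primary[field][i],
            **{name: func(values) for name, func in functions.items()},
        }
        for i, values in sorted(groups.items())
    ]


expanded = expand(pairs, primary, secondary, "temperature")
collapsed = collapse(
    pairs, primary, secondary, "temperature",
    collapser={"max": max},  # an additional, user-supplied collapser
)

print(len(expanded), "expanded rows")
for row in collapsed:
    print(row)
```

In this sketch the expanded form keeps every pair (six rows), while the collapsed form has one row per reference point (three rows), each carrying the reduced secondary values, including the extra max entry supplied through the collapser dictionary.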

Methods

__init__(*args[, reference, read_mode, ...])

Initialize a Collocations object

add_fields(original_fileset, fields, **kwargs)

align(other[, start, end, matches, ...])

Collect files from this fileset and a matching other fileset

collect([start, end, files, return_info])

Load all files between two dates sorted by their starting time

copy()

Create a so-called deep-copy of this fileset object

delete([dry_run])

Remove files in this fileset from the disk

detect(test, *args, **kwargs)

Search for anomalies in fileset

dislink(name_or_fileset)

Remove the link between this and another fileset

exclude_files(filenames)

exclude_times(periods)

find([start, end, sort, only_path, bundle, ...])

Find all files of this fileset in a given time period.

find_closest(timestamp[, filters])

Find the closest file to a timestamp

get_filename(times[, template, fill])

Generate the full path and name of a file for a time period

get_info(file_info[, retrieve_via])

Get information about a file.

get_placeholders()

Get placeholders for this FileSet.

icollect([start, end, files])

Load all files between two dates sorted by their starting time

imap(*args, **kwargs)

Apply a function on files and return the result immediately

is_excluded(file)

Checks whether a file is excluded from this FileSet.

link(other_fileset[, linker])

Link this fileset with another FileSet

load_cache(filename)

Load the information cache from a JSON file

make_dirs(filename)

map(func[, args, kwargs, files, on_content, ...])

Apply a function on files of this fileset with parallel workers

match(other[, start, end, max_interval, ...])

Find matching files between two filesets

move([target, convert, copy])

Move (or copy) files from this fileset to another location

parse_filename(filename[, template])

Parse the filename with temporal and additional regular expressions.

read(*args, **kwargs)

Read a file and apply a collapse / expand function to it

reset_cache()

Reset the information cache

save_cache(filename)

Save the information cache to a JSON file

search(filesets[, collocator])

Find all collocations between two filesets and store them in files

set_placeholders(**placeholders)

Set placeholders for this FileSet.

to_dataframe([include_times])

Create a pandas.DataFrame from this FileSet

write(data, file_info[, in_background])

Write content to a file by using the FileSet's file handler.

Attributes

default_handler

name

Get or set the fileset's name.

path

Get or set the path to the fileset's files.

time_coverage

Get or set the time coverage of the files of this fileset

year2_threshold