ContainerTask
- class caput.pipeline.tasklib.base.ContainerTask
Bases: MPILoggedTask, caput.pipeline.extensions.ContainerIOMixin
Implements a task whose inputs and outputs are Container objects.
This task implements writing of the output when requested, and handles various types of metadata associated with the container objects.
Tasks inheriting from this class should override process() and optionally setup() or process_finish(). They should not override next() or finish().
Output will be written (using write_output()) to the file self.output_name.
- Attributes:
- save : bool | list[bool], optional
Whether to save the output to disk or not. Can be provided as a list if multiple outputs are being handled. Default is False.
- attrs : dict | None, optional
A mapping of attribute names and values to set in the .attrs at the root of the output container. String values will be formatted according to the standard Python .format(…) rules, and can interpolate several other values into the string. These are:
- count: an integer giving the current iteration of the task.
- tag: a string identifier for the output derived from the container's tag attribute. If that attribute is not present, count is used instead.
- key: the name of the output key.
- task: the (unqualified) name of the task.
- input_tags: a list of the tags for each input argument for the task.
Any existing attribute in the container can be interpolated by the name of its key. The specific values above will override any attribute with the same name.
Incorrectly formatted values will cause an error to be thrown. Default is None.
- tag : str, optional
Set a format for the tag attached to the output. This is a Python format string which can interpolate the variables listed under attrs above. For example, a tag of “cat{count}” will generate catalogs with the tags “cat1”, “cat2”, etc. Default is {tag}.
- output_name : str | list[str], optional
A Python format string used to construct the filename. All variables given under attrs above can be interpolated into the filename. Can be provided as a list if multiple outputs are being handled. Valid identifiers are:
- count: an integer giving the current iteration of the task.
- tag: a string identifier for the output derived from the container's tag attribute. If that attribute is not present, count is used instead.
- key: the name of the output key.
- task: the (unqualified) name of the task.
- output_root: the value of the output root argument. This is deprecated and is only used for legacy support. The default value of output_name means the previous behaviour still works.
Default is {output_root}{tag}.h5.
- compression : bool | dict, optional
Set compression options for each dataset. Provided as a dict with the dataset names as keys and values for chunks, compression, and compression_opts. Any datasets not included in the dict (including if the dict is empty) will use the default parameters set in the dataset spec. If set to False (or anything that evaluates to False, other than an empty dict), chunks and compression will be disabled for all datasets. If no argument is provided, the default parameters set in the dataset spec are used. Note that this will modify these parameters on the container itself, so that if it is written out again downstream in the pipeline, the same parameters will be used. Default is True.
- output_root : str, optional
Pipeline settable parameter giving the first part of the output path. Deprecated in favour of specifying the output path directly in output_name.
- nan_check : bool, optional
Check the output for NaNs (and infs), logging if they are present. Default is True.
- nan_dump : bool, optional
If NaNs are found, dump the container to disk. Default is True.
- nan_skip : bool, optional
If NaNs are found, don’t pass on the output. Default is True.
- versions : dict[str, str], optional
Keys are module names (str) and values are their version strings. This is attached to output metadata. Default is {}.
- pipeline_config : dict, optional
Global pipeline configuration. This is attached to output metadata. Default is {}.
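The tag, output_name and attrs parameters all rely on standard Python str.format interpolation. The sketch below uses only the standard library to show how such templates might expand. The helpers build_context and format_attrs and all sample values are hypothetical and are not part of caput; the real task may assemble these values differently.

```python
def build_context(count, task, key, input_tags, container_attrs, output_root=""):
    """Assemble interpolation values (hypothetical helper, not caput API)."""
    # Existing container attributes can be interpolated by the name of their key.
    context = dict(container_attrs)
    # The specific values override any attribute with the same name.
    context.update(
        count=count,
        # Fall back to count when the container has no tag attribute.
        tag=container_attrs.get("tag", count),
        key=key,
        task=task,
        input_tags=input_tags,
        # output_root is documented for output_name only; it is included
        # here so a single context can serve every template in the sketch.
        output_root=output_root,
    )
    return context


def format_attrs(attrs, context):
    """Format string values only; other values pass through unchanged."""
    return {
        name: value.format(**context) if isinstance(value, str) else value
        for name, value in attrs.items()
    }


ctx = build_context(
    count=2,
    task="MakeCatalog",
    key="catalog",
    input_tags=["night1"],
    container_attrs={"tag": "night1", "instrument": "demo", "count": 99},
    output_root="out_",
)

tag_value = "cat{count}".format(**ctx)            # "cat2": count=99 is overridden
filename = "{output_root}{tag}.h5".format(**ctx)  # "out_night1.h5"
attrs_out = format_attrs(
    {"description": "{task} output {count} from {instrument}", "version": 3},
    ctx,
)
```

In this sketch a template that names an unknown key, such as "{missing}", raises KeyError from str.format, which is one way an incorrectly formatted value can surface as an error.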
- Raises:
PipelineRuntimeError
If this is used as a base class for a task that overrides self.process with variable-length or optional arguments.
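How caput detects an unsupported process signature is not described here. As a rough illustration only, the standard library's inspect module can flag methods that take variable-length or optional arguments. The function has_variable_or_optional_args and the three example classes are hypothetical, and treating **kwargs as variable-length is an assumption of this sketch.

```python
import inspect


def has_variable_or_optional_args(func):
    """Return True if func takes *args, **kwargs, or any parameter with a default."""
    for param in inspect.signature(func).parameters.values():
        if param.kind in (
            inspect.Parameter.VAR_POSITIONAL,
            inspect.Parameter.VAR_KEYWORD,
        ):
            return True
        if param.default is not inspect.Parameter.empty:
            return True
    return False


class FixedArgsTask:
    # A fixed number of required inputs: the supported shape.
    def process(self, data):
        return data


class VarArgsTask:
    # Variable-length inputs: the kind of signature the docs warn about.
    def process(self, *inputs):
        return inputs


class OptionalArgTask:
    # An optional input: also the kind of signature the docs warn about.
    def process(self, data, mask=None):
        return data


fixed_ok = not has_variable_or_optional_args(FixedArgsTask.process)
varargs_flagged = has_variable_or_optional_args(VarArgsTask.process)
optional_flagged = has_variable_or_optional_args(OptionalArgTask.process)
```

A subclass can run a check like this in its own unit tests to catch an unsupported signature before the pipeline is executed.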