ContainerPrototype

class caput.containers.ContainerPrototype(*args: tuple, **kwargs: dict)

Bases: Container

A prototype class used to design containers using axis and dataset specifications.

This class is designed to do much of the work of setting up pipeline containers. It should be subclassed, with two class variables set: _axes and _dataset_spec. See the Notes section for details.

Optional Parameters

Some combination of the following parameters must be provided when creating a new container. In particular, the container requires axis definitions for all axes in _axes, supplied either as direct keyword arguments or by copying from another container.

data_group : MemDiskGroup

A container to pass through for making a shallow copy. This is used by routines like concatenate() and generally shouldn’t be used directly. It can be given either as a keyword argument or as the first positional argument.

axes_from : Container, optional

Another container to copy axis definitions from. Must be supplied as a keyword argument.

attrs_from : Container, optional

Another container to copy dataset attributes from. Must be supplied as a keyword argument. This applies to attributes in default datasets too.

dsets_from : Container, optional

A container to copy datasets from. Any dataset which has an axis whose definition has been explicitly set (i.e. does not come from axes_from) will not be copied.

copy_from : Container, optional

Set axes_from, attrs_from and dsets_from to this instance if they are not set explicitly.

skip_datasets : bool, optional

Skip creating datasets. Instead, they will all need to be added manually with add_dataset() regardless of the entry in _dataset_spec. Default is False.

distributed : bool, optional

Should this be a distributed container? Defaults to True.

comm : Comm, optional

The MPI communicator to distribute over. Uses COMM_WORLD if not set.

allow_chunked : bool, optional

Allow the datasets to be chunked. Default is True.

kwargs : Any

Should contain entries for all other axes.
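The requirement that every axis in _axes receives a definition can be pictured with a small stand-alone sketch. It does not use caput: resolve_axes is a hypothetical helper, plain dictionaries stand in for containers, and the rule that an explicit keyword wins over a copied definition is an assumption inferred from the dsets_from description.

```python
def resolve_axes(required, axes_from=None, copy_from=None, **kwargs):
    """Collect a definition for every required axis, or raise.

    Hypothetical helper: `axes_from` and `copy_from` are plain dicts
    standing in for containers that carry axis definitions.
    """
    # copy_from only fills in axes_from when it was not given explicitly.
    if axes_from is None:
        axes_from = copy_from

    resolved = {}
    missing = []
    for name in required:
        if name in kwargs:
            # Assumption: an explicit keyword argument takes precedence.
            resolved[name] = kwargs[name]
        elif axes_from is not None and name in axes_from:
            resolved[name] = axes_from[name]
        else:
            missing.append(name)

    if missing:
        raise ValueError(f"No definition supplied for axes: {missing}")
    return resolved


source = {"freq": [400.0, 800.0], "time": [0, 1, 2]}
axes = resolve_axes(("freq", "time"), axes_from=source, time=[10, 11])
```

Here freq is copied from source while time comes from the keyword argument. Leaving out an axis that source does not define raises ValueError.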

Notes

Inheritance from other ContainerPrototype subclasses works as expected: datasets defined in parent classes are inherited, and are overridden where they are redefined in the derived class.

The variable _axes should be a tuple containing the names of axes that datasets in this container will use.

The variable _dataset_spec should define the datasets. It’s a dictionary with the names of the datasets as keys. The value for each key should be another dictionary. In that sub-dictionary, the key axes is mandatory and should be a list of the axes the dataset has (these should correspond to entries in _axes), as is the key dtype, which should be a datatype understood by numpy. Other possible keys are:

  • initialise : if set to True the dataset will be created as the container is initialised.

  • distributed : if True the dataset is distributed, if False it is not, and if not set it follows the container’s own distributed setting.

  • distributed_axis : the axis to distribute over. Should be a name given in the axes entry.
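As a rough illustration of the layout described above, the sketch below writes out _axes and _dataset_spec on a plain stand-in class and checks them against the stated rules. The class name, dataset names, and the helpers check_spec and is_distributed are hypothetical. A real container would subclass ContainerPrototype, and caput's own validation may differ.

```python
class VisibilityLike:
    """Stand-in showing the two class variables; not a real container."""

    _axes = ("freq", "time")

    _dataset_spec = {
        "vis": {
            "axes": ["freq", "time"],
            "dtype": "complex64",
            "initialise": True,
            "distributed": True,
            "distributed_axis": "freq",
        },
        "weight": {
            "axes": ["freq", "time"],
            "dtype": "float32",
            "initialise": False,
            # "distributed" is left unset, so it follows the container.
        },
    }


def check_spec(axes, spec):
    """Return a list of problems found in a dataset specification."""
    problems = []
    for name, entry in spec.items():
        # Both "axes" and "dtype" are mandatory for every dataset.
        for key in ("axes", "dtype"):
            if key not in entry:
                problems.append(f"{name}: missing mandatory key '{key}'")
        dset_axes = entry.get("axes", [])
        unknown = [ax for ax in dset_axes if ax not in axes]
        if unknown:
            problems.append(f"{name}: axes not in _axes: {unknown}")
        dist_axis = entry.get("distributed_axis")
        if dist_axis is not None and dist_axis not in dset_axes:
            problems.append(f"{name}: distributed_axis '{dist_axis}' not in axes")
    return problems


def is_distributed(entry, container_distributed):
    """Apply the three-way rule for the 'distributed' key."""
    value = entry.get("distributed")
    return container_distributed if value is None else bool(value)


problems = check_spec(VisibilityLike._axes, VisibilityLike._dataset_spec)
```

For this specification problems is empty. The weight dataset is distributed only when the container itself is, since its distributed key is not set.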

property axes: tuple[str, Ellipsis]

The set of axes for this container including any defined on the instance.

property dataset_spec: dict[str, dict]

Return a copy of the fully resolved dataset specification as a dictionary.
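One way to picture "fully resolved", given the inheritance behaviour described in the Notes, is a merge of _dataset_spec over the class hierarchy with derived classes overriding their parents. The sketch below is a minimal stand-alone model of that idea, not caput's implementation. The function resolved_dataset_spec and the classes Base and Derived are hypothetical.

```python
import copy


def resolved_dataset_spec(cls):
    """Merge _dataset_spec from base classes first, so subclasses override."""
    merged = {}
    for klass in reversed(cls.__mro__):
        # Look only at what this class defines itself, not what it inherits.
        merged.update(vars(klass).get("_dataset_spec", {}))
    # Deep copy so callers cannot mutate the class-level dictionaries.
    return copy.deepcopy(merged)


class Base:
    _dataset_spec = {
        "vis": {"axes": ["freq", "time"], "dtype": "complex64"},
        "weight": {"axes": ["freq", "time"], "dtype": "float32"},
    }


class Derived(Base):
    _dataset_spec = {
        # Redefines "weight" from Base and adds a new dataset.
        "weight": {"axes": ["freq", "time"], "dtype": "float64"},
        "flag": {"axes": ["time"], "dtype": "bool"},
    }


spec = resolved_dataset_spec(Derived)
```

In this model spec contains vis from Base, the redefined weight from Derived, and the new flag entry. Because the result is a copy, editing it leaves the class definitions untouched.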

property datasets: caput.memdata.ro_dict[str, caput.memdata.MemDataset]

A read-only view of the datasets in this container.

Do not try to add a new dataset by adding keys to this property. Use create_dataset instead.

Returns:
datasets : ro_dict

Entries are MemDataset datasets.
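The general idea of a read-only view can be shown with the standard library's types.MappingProxyType. This is only an analogy: ro_dict is caput's own class and its exact behaviour is not reproduced here. The names _store, view, and try_assign are hypothetical.

```python
from types import MappingProxyType

# The owner keeps the real dictionary private and hands out a view of it.
_store = {"vis": [0.0, 1.0]}
view = MappingProxyType(_store)


def try_assign(mapping, key, value):
    """Return True if item assignment succeeded, False if it was refused."""
    try:
        mapping[key] = value
    except TypeError:
        return False
    return True


# Assignment through the view is refused...
added = try_assign(view, "weight", [1.0, 1.0])

# ...but entries added by the owner show up in the view immediately.
_store["weight"] = [1.0, 1.0]
```

The design point is the same as in the text above: readers get a live view, while new entries go through the owner's dedicated method.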

Methods

add_dataset(→ caput.memdata.MemDataset)

Add a new, empty dataset to the container.