deep_group_copy

caput.memdata.deep_group_copy(g1, g2, selections=None, convert_dataset_strings=False, convert_attribute_strings=True, file_format=fileformats.HDF5, skip_distributed=False, postprocess=None, shallow=False, shared=None)

Copy full data tree from one group to another.

Copies from g1 to g2:

  • The default behaviour creates a deep copy of each dataset.

  • If g2 is on disk, the behaviour is the same as making a deep copy. In this case, both shallow and shared are ignored.

  • Otherwise, when shallow is False, datasets not listed in shared are deep copied, and any dataset listed in shared points to the existing object in g1 storage.

  • When shallow is True and g2 is in memory, all datasets are shared and shared is ignored.
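
The sharing rules above can be pictured with a toy model. The sketch below is not caput's implementation: it assumes groups are plain nested dicts, datasets are Python lists, and dataset names are slash-separated paths such as "foo/bar". `toy_group_copy` is a hypothetical helper introduced only for illustration, and it covers the memory-to-memory case only.

```python
import copy


def toy_group_copy(g1, g2, shallow=False, shared=None, _prefix=""):
    """Toy stand-in for the memory-to-memory copy rules (not caput code).

    Groups are nested dicts, datasets are lists, and dataset names are
    slash-separated paths such as "foo/bar" (an assumption of this sketch).
    """
    shared = set(shared or ())
    for name, entry in g1.items():
        path = f"{_prefix}{name}"
        if isinstance(entry, dict):
            # Sub-group: recurse into a fresh destination group.
            g2[name] = {}
            toy_group_copy(entry, g2[name], shallow, shared, path + "/")
        elif shallow or path in shared:
            # Shared dataset: the destination points at the same object.
            g2[name] = entry
        else:
            # Default: an independent deep copy.
            g2[name] = copy.deepcopy(entry)
    return g2


g1 = {"foo": {"bar": [0, 1, 2], "baz": [3, 4, 5]}}
g2 = toy_group_copy(g1, {}, shared=["foo/bar"])

g1["foo"]["bar"][0] = 99  # visible through g2: "foo/bar" is shared
g1["foo"]["baz"][0] = 99  # not visible through g2: "foo/baz" was deep copied

print(g2["foo"]["bar"])  # [99, 1, 2]
print(g2["foo"]["baz"])  # [3, 4, 5]
```

In this model, a later write to a shared dataset in the source shows up in the destination, while a deep-copied dataset is unaffected.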

An axis downselection can be specified by supplying the selections parameter. For example, to select the first two elements of g1["foo"]["bar"], do:

>>> g1 = MemGroup()
>>> foo = g1.create_group("foo")
>>> ds = foo.create_dataset(name="bar", data=np.arange(3))
>>> g2 = MemGroup()
>>> deep_group_copy(g1, g2, selections={"foo/bar": slice(2)})
>>> [int(x) for x in g2["foo"]["bar"]]
[0, 1]

Axis downselections cannot be applied to shared datasets.

Parameters:
g1 : GroupLike

Deep copy from this group.

g2 : GroupLike

Deep copy to this group.

selections : dict | None, optional

If not None, this should mirror a subset of the hierarchical structure of g1, with each value describing the axis selection for the corresponding group entry as a valid numpy index. Selections cannot be applied to shared datasets.

convert_attribute_strings : bool, optional

Convert string attributes (or lists/arrays of them) to ensure that they are unicode.

convert_dataset_strings : bool, optional

Convert strings within datasets to ensure that they are unicode.

file_format : FileFormat

File format to use. Default is fileformats.HDF5.

skip_distributed : bool, optional

If True, skip the write for any distributed dataset and return a list of the names of all datasets that were skipped. If False (default), raise a ValueError if any distributed dataset is encountered.

postprocess : callable | None, optional

A function which is called on each node, with the source and destination entries, and can modify either.
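
A postprocess callback might look like the sketch below. It assumes the callback receives the source entry first and the destination entry second, as the description suggests, and that entries expose an h5py-like `name` and `attrs` mapping. `tag_origin` and the stand-in entries are hypothetical, used here so the example runs without caput.

```python
from types import SimpleNamespace


def tag_origin(src_entry, dst_entry):
    """Hypothetical postprocess callback: record where each node came from.

    Assumes (source, destination) argument order and h5py-like `name` and
    `attrs` on both entries; neither is confirmed by the text above.
    """
    dst_entry.attrs["copied_from"] = src_entry.name


# Stand-in entries so the callback can be exercised in isolation.
src = SimpleNamespace(name="/foo/bar", attrs={})
dst = SimpleNamespace(name="/foo/bar", attrs={})
tag_origin(src, dst)
print(dst.attrs)  # {'copied_from': '/foo/bar'}
```

With caput available, such a function would be passed as deep_group_copy(g1, g2, postprocess=tag_origin).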

shallow : bool, optional

Explicitly share all datasets. This only alters behaviour when copying from memory to memory. If False, any dataset listed in shared is shared rather than copied. Default is False.

shared : Sequence | None, optional

Sequence (list, set, generator) of datasets to share, used when shallow is False. Shared datasets just point to the existing object in g1 storage. Axis selections cannot be applied to shared datasets. Ignored if shallow is True, since in that case all datasets are shared.

Returns:
distributed_dataset_names : list[str] | None

Names of the distributed datasets that were skipped, if skip_distributed is True. Otherwise None is returned.