MemDatasetDistributed

class caput.memdata.MemDatasetDistributed(shape, dtype, axis=0, comm=None, chunks=None, compression=None, compression_opts=None, **kwargs)

Bases: MemDataset

Parallel, in-memory implementation of h5py.Dataset.

Inherits from MemDataset. Encapsulates an MPIArray mocked up to look like an h5py dataset. Like h5py datasets, it supports numpy-style slicing, but because it is not actually a numpy array, many operations (e.g. ufuncs) won't work.

Parameters:
shape : tuple[int, …]

Shape of array to initialise. This is the global shape.

dtype : dtype

Type of array to create.

axis : int, optional

Index of axis to distribute the array over. Default is 0.

comm : MPI.Comm | None

MPI communicator to distribute over. If None, use MPI.COMM_WORLD.

chunks : tuple[int, …] | None

Chunk sizes. If None, the dataset is not chunked. Default is None.

compression : str | int | None

Name or identifier of HDF5 or Zarr compression filter.

compression_opts : dict | None

Compression options for the dataset. See the HDF5 and Zarr documentation for compression filters.

**kwargs : Any

Arbitrary keyword arguments passed to the MemDataset constructor.
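As an illustration of the constructor parameters above, the following is a minimal sketch rather than an official recipe. The helper names `describe_layout` and `build_example` are hypothetical; only the constructor signature and the property names are taken from this page. `build_example` needs caput and mpi4py and is normally run under an MPI launcher, so the sketch also uses a plain stand-in object to show the helper without MPI.

```python
from types import SimpleNamespace


def describe_layout(dset):
    """Collect the layout properties documented for MemDatasetDistributed.

    Works on any object exposing these four attributes (hypothetical helper).
    """
    return {
        "global_shape": tuple(dset.global_shape),
        "local_shape": tuple(dset.local_shape),
        "local_offset": tuple(dset.local_offset),
        "distributed_axis": dset.distributed_axis,
    }


def build_example():
    """Construct a small distributed dataset (requires caput and mpi4py)."""
    import numpy as np
    from caput import memdata

    # shape is the global shape; axis=0 is the distributed axis;
    # comm is left as None, which the docs say selects MPI.COMM_WORLD.
    return memdata.MemDatasetDistributed(shape=(8, 4), dtype=np.float64, axis=0)


# Stand-in with made-up values, mimicking one rank out of two.
stand_in = SimpleNamespace(
    global_shape=(8, 4),
    local_shape=(4, 4),
    local_offset=(0, 0),
    distributed_axis=0,
)
print(describe_layout(stand_in))
```

With caput installed, `describe_layout(build_example())` can be called on each rank to compare the local layouts.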

property chunks

Access the chunk shape of the dataset.

property comm

Reference to the MPI communicator.

property common

Assert that this is not a common dataset.

property compression

Access compression information.

property compression_opts

Access compression options.

property data

Access the underlying data array.

property distributed: bool

Assert that this is a distributed dataset.

property distributed_axis

The index of the axis over which this dataset is distributed.

property dtype

The numpy data type of the dataset.

property global_shape

Global shape of the distributed dataset.

The shape of the whole array that is distributed between multiple nodes.

property local_data

Access the underlying local data as a numpy array.

property local_offset

Access the local offset of the array on this rank.

property local_shape

Local shape of the distributed dataset.

The shape of the part of the distributed array that is allocated to this node.

property shape

Access the global shape of the array.
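The relationship between global_shape, local_shape and local_offset can be sketched in plain Python. This assumes a near-equal block partition in which the lower ranks receive any remainder; that scheme is an assumption made for illustration, and the partition caput actually uses may differ. The functions `split_axis` and `local_layout` are hypothetical names, not part of caput.

```python
def split_axis(n, size, rank):
    """Return (length, start) of the block of an axis of length n owned by rank.

    Assumed scheme: near-equal blocks, with the first (n % size) ranks
    getting one extra element each.
    """
    base, rem = divmod(n, size)
    length = base + (1 if rank < rem else 0)
    start = rank * base + min(rank, rem)
    return length, start


def local_layout(global_shape, axis, size, rank):
    """Return (local_shape, local_offset) for one rank."""
    length, start = split_axis(global_shape[axis], size, rank)

    # Only the distributed axis is shortened; other axes keep their full length.
    local_shape = list(global_shape)
    local_shape[axis] = length

    # The offset is non-zero only along the distributed axis.
    local_offset = [0] * len(global_shape)
    local_offset[axis] = start

    return tuple(local_shape), tuple(local_offset)


# A (10, 4) array distributed over axis 0 across 3 ranks.
for rank in range(3):
    print(rank, local_layout((10, 4), axis=0, size=3, rank=rank))
```

Under this assumed scheme the local lengths along the distributed axis always sum to the global length, and each rank's offset equals the sum of the lengths held by the ranks before it.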

Methods

from_mpi_array(data[, chunks, compression, ...])

Initialise from an MPIArray.

redistribute(axis)

Change the axis that the dataset is distributed over.
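To make the effect of changing the distributed axis concrete, here is a single-process simulation using nested lists, where each entry of `blocks` stands for the data held by one rank. It is only a conceptual sketch: it gathers the full array and splits it again, whereas a real MPI implementation would exchange data between ranks. All function names are hypothetical, and the near-equal block partition is an assumption.

```python
def block_bounds(n, size):
    """Return a (start, stop) pair for each of `size` near-equal blocks."""
    base, rem = divmod(n, size)
    bounds, start = [], 0
    for rank in range(size):
        stop = start + base + (1 if rank < rem else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def distribute(rows, axis, size):
    """Split a 2D nested list into per-rank blocks along axis 0 or 1."""
    n = len(rows) if axis == 0 else len(rows[0])
    blocks = []
    for start, stop in block_bounds(n, size):
        if axis == 0:
            blocks.append([row[:] for row in rows[start:stop]])
        else:
            blocks.append([row[start:stop] for row in rows])
    return blocks


def gather(blocks, axis):
    """Reassemble the global 2D list from per-rank blocks."""
    if axis == 0:
        return [row for block in blocks for row in block]
    nrows = len(blocks[0])
    return [sum((block[i] for block in blocks), []) for i in range(nrows)]


def redistribute(blocks, old_axis, new_axis):
    """Simulated redistribution: gather everything, then split on new_axis."""
    return distribute(gather(blocks, old_axis), new_axis, len(blocks))


full = [[r * 4 + c for c in range(4)] for r in range(3)]  # global shape (3, 4)
by_rows = distribute(full, axis=0, size=2)
by_cols = redistribute(by_rows, old_axis=0, new_axis=1)

for rank, block in enumerate(by_cols):
    print(rank, (len(block), len(block[0])))
```

The global contents are unchanged by the simulated redistribution; only the split of the data between ranks differs.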