split_dataset package

Submodules

split_dataset.blocks module

class split_dataset.blocks.BlockIterator(blocks, slices=True)[source]

Bases: object

class split_dataset.blocks.Blocks(shape_full: Tuple, shape_block: Optional[Tuple] = None, dim_split: Optional[int] = None, blocks_number: Optional[int] = None, padding: Union[int, Tuple] = 0, crop: Optional[Tuple] = None)[source]

Bases: object

Blocks have two indexing systems:
  • linear: a single integer identifying each block (from 0 to n_blocks - 1), in the order the blocks are enumerated.

  • cartesian: gives the position of the block in the general block tiling.
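
As a minimal sketch of the two systems (the shapes below are illustrative and chosen so that the stack tiles exactly into 2x2x3 blocks; expected values follow the method docstrings further down):

    from split_dataset.blocks import Blocks

    # Split a 20x20x30 stack into 10x10x10 blocks, i.e. a 2x2x3 tiling.
    blocks = Blocks(shape_full=(20, 20, 30), shape_block=(10, 10, 10))

    print(blocks.n_blocks)                        # expected: 12
    print(blocks.linear_to_cartesian(0))          # expected: (0, 0, 0), first block
    print(blocks.cartesian_to_linear((1, 1, 2)))  # expected: 11, last block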

block_containing_coords(coords)[source]

Find the linear index of a block containing the given coordinates

Parameters

coords – a tuple of the coordinates

Returns

linear index of the block containing the given coordinates (int)
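
For example, continuing the hypothetical 20x20x30 / 10x10x10 split from above:

    from split_dataset.blocks import Blocks

    blocks = Blocks(shape_full=(20, 20, 30), shape_block=(10, 10, 10))

    # Coordinates are passed as a tuple with one value per dimension.
    i_block = blocks.block_containing_coords((5, 12, 25))
    print(i_block)  # linear index of the block containing the point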

static block_to_slices(block)[source]
blocks_to_take(start_take, end_take)[source]

Find which blocks to take to cover the given range.

Parameters

  • start_take – starting points in the N dims (tuple)

  • end_take – ending points in the N dims (tuple)

Returns

tuple of tuples with the extremes of the blocks to take in the N dims; starting index of the data in the first block; ending index of the data in the last block.
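
A usage sketch, assuming the three values described above are returned together as a tuple:

    from split_dataset.blocks import Blocks

    blocks = Blocks(shape_full=(20, 20, 30), shape_block=(10, 10, 10))

    # Which blocks are needed to cover the region from (5, 5, 5) to (15, 15, 25)?
    block_extremes, start_in_first, end_in_last = blocks.blocks_to_take(
        (5, 5, 5), (15, 15, 25)
    )
    print(block_extremes)  # extremes of the blocks to take in each dimension
    print(start_in_first)  # where the requested data starts in the first block
    print(end_in_last)     # where the requested data ends in the last block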

cartesian_to_linear(ca_idx)[source]

Convert a block cartesian index into a linear index. Example: in a 3D stack split into 2x2x3 blocks:

bs.cartesian_to_linear((0, 0, 0)) = 0   # first block
bs.cartesian_to_linear((1, 1, 2)) = 11  # last block

Parameters

ca_idx – block cartesian index (tuple of ints)

Returns

block linear index (int)

centres()[source]
property crop
drop_dim(dim_to_drop)[source]

Return a new Blocks object with a dimension dropped, useful for getting spatial blocks from spatio-temporal blocks.

Parameters

dim_to_drop – dimension to be dropped (int)

Returns

new Blocks object
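
For instance, to obtain purely spatial blocks from a hypothetical spatio-temporal (t, z, y, x) split, one might drop the time dimension:

    from split_dataset.blocks import Blocks

    # Spatio-temporal split: (time, z, y, x); shapes are placeholders.
    st_blocks = Blocks(shape_full=(100, 20, 20, 30), shape_block=(10, 10, 10, 10))

    # Drop the time axis (dimension 0) to get a spatial-only Blocks object.
    spatial_blocks = st_blocks.drop_dim(0)
    print(spatial_blocks.n_dims)  # one dimension fewer than st_blocks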

linear_to_cartesian(lin_idx)[source]

Convert a block linear index into a cartesian index. Example: in a 3D stack split into 2x2x3 blocks:

bs.linear_to_cartesian(0) = (0, 0, 0)   # first block
bs.linear_to_cartesian(11) = (1, 1, 2)  # last block

Parameters

lin_idx – block linear index (int)

Returns

block cartesian index (tuple of ints)
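
The two conversions are inverses of each other; a short sketch with the 2x2x3 example from the docstrings, assuming cartesian indices are passed as tuples:

    from split_dataset.blocks import Blocks

    blocks = Blocks(shape_full=(20, 20, 30), shape_block=(10, 10, 10))  # 2x2x3 tiling

    assert blocks.linear_to_cartesian(0) == (0, 0, 0)   # first block
    assert blocks.cartesian_to_linear((1, 1, 2)) == 11  # last block

    # Round trip: linear -> cartesian -> linear.
    for i in range(blocks.n_blocks):
        assert blocks.cartesian_to_linear(blocks.linear_to_cartesian(i)) == i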

property n_blocks
property n_dims
neighbour_blocks(i_block, dims=None)[source]

Return the neighbouring blocks across the given dimensions.

Parameters

  • i_block – index of the block whose neighbours are searched for

  • dims – dimensions across which to look for neighbours

Returns

the neighbouring blocks
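
A usage sketch; the form in which the neighbours are returned is not specified above, so the example only prints the result, and the dims value shown is an assumption about its expected format:

    from split_dataset.blocks import Blocks

    blocks = Blocks(shape_full=(20, 20, 30), shape_block=(10, 10, 10))

    # Neighbours of block 0 across all dimensions.
    print(blocks.neighbour_blocks(0))

    # Neighbours of block 0 restricted to the last dimension (assumed format).
    print(blocks.neighbour_blocks(0, dims=(2,)))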

property padding
serialize()[source]

Return a dictionary with a complete description of the Blocks object, e.g. to save its structure as a JSON file.

Returns

dictionary describing the block structure
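
Since the description is a plain dictionary, it can be written out with the standard json module, e.g.:

    import json

    from split_dataset.blocks import Blocks

    blocks = Blocks(shape_full=(20, 20, 30), shape_block=(10, 10, 10))

    # Save the block structure so it can be inspected or reloaded later.
    with open("blocks.json", "w") as f:
        json.dump(blocks.serialize(), f, indent=2)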

property shape_block
property shape_full
slices(as_tuples=False)[source]
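
A sketch of the typical pattern: iterate over the per-block slices to process a NumPy array one chunk at a time. This assumes each element yielded with as_tuples=False can be used directly for NumPy indexing:

    import numpy as np

    from split_dataset.blocks import Blocks

    stack = np.random.rand(20, 20, 30)
    blocks = Blocks(shape_full=stack.shape, shape_block=(10, 10, 10))

    # Process the stack block by block.
    for block_slices in blocks.slices():
        chunk = stack[block_slices]
        print(chunk.shape)
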
update_block_structure()[source]

Update the Blocks structure, e.g. when the block shape or padding is changed.

update_stack_dims()[source]

Update the stack dimensions and cropping when shape_full or the crop is changed.

split_dataset.split_dataset module

class split_dataset.split_dataset.EmptySplitDataset(root, name, *args, resolution=None, **kwargs)[source]

Bases: split_dataset.blocks.Blocks

Class to initialize an empty split dataset whose metadata has to be saved after its blocks are filled.

finalize()[source]
save_block_data(n, data, verbose=False)[source]

Optional method to save data into a block. It is often not used, as data is usually saved directly in the parallelized function; it might be worth centralizing the saving here.

Parameters

  • n – index of the block to save into

  • data – data to be poured into the block

  • verbose –
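
A hedged sketch of the overall workflow, based only on the signatures above; the path and resolution are placeholders, and forwarding the Blocks constructor keywords through **kwargs is an assumption:

    import numpy as np

    from split_dataset.split_dataset import EmptySplitDataset

    # Create an empty split dataset on disk (path and shapes are placeholders;
    # shape keywords are assumed to be forwarded to the underlying Blocks).
    new_ds = EmptySplitDataset(
        "/tmp/datasets",
        "example",
        shape_full=(20, 20, 30),
        shape_block=(10, 10, 10),
        resolution=(1.0, 1.0, 1.0),
    )

    # Fill every block with (here, random) data, then write out the metadata.
    for i in range(new_ds.n_blocks):
        new_ds.save_block_data(i, np.random.rand(10, 10, 10))
    new_ds.finalize()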

class split_dataset.split_dataset.SplitDataset(root, prefix=None)[source]

Bases: split_dataset.blocks.Blocks

Manages datasets split over multiple HDF5 files across arbitrary dimensions. To do so, it uses the Blocks class functionality and defines each block as a file.

apply_crop(crop)[source]

Take out the data selected by the given crop.

as_dask()[source]

Create a Dask array from the split dataset.

Returns

Dask array
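
A usage sketch, assuming a split dataset has already been written to the folder below (the path is a placeholder):

    from split_dataset.split_dataset import SplitDataset

    ds = SplitDataset("/tmp/datasets/example")

    # Lazily assemble all blocks into a single Dask array for out-of-core work.
    darr = ds.as_dask()
    print(darr.shape)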

property data_key

Supports a smooth migration away from the stack_ND key in favour of the plain stack key.

split_dataset.split_dataset.save_to_split_dataset(data, root_name, block_size=None, crop=None, padding=0, prefix='', compression='blosc')[source]

Function to save a block of data into a split_dataset.
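
A minimal sketch based on the signature above, assuming data is the full array to be split into blocks of block_size (the path is a placeholder):

    import numpy as np

    from split_dataset.split_dataset import save_to_split_dataset

    data = np.random.rand(20, 20, 30)

    # Write the array to disk as a split dataset of 10x10x10 blocks.
    save_to_split_dataset(
        data,
        root_name="/tmp/datasets/example_saved",
        block_size=(10, 10, 10),
        padding=0,
        compression="blosc",
    )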

Module contents

Top-level package for Split Dataset.