split_dataset package¶
Submodules¶
split_dataset.blocks module¶
-
class
split_dataset.blocks.Blocks(shape_full: Tuple, shape_block: Optional[Tuple] = None, dim_split: Optional[int] = None, blocks_number: Optional[int] = None, padding: Union[int, Tuple] = 0, crop: Optional[Tuple] = None)[source]¶ Bases:
object
Blocks have two indexing systems:
linear: a single integer identifying the block (see cartesian_to_linear);
cartesian: gives the position of the block in the general block tiling.
-
block_containing_coords(coords)[source]¶ Find the linear index of a block containing the given coordinates
- Parameters
coords – a tuple of the coordinates
- Returns
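The lookup described above can be pictured with a minimal sketch. It is not the library's implementation: the helper name `block_containing_coords_sketch` is hypothetical, and it assumes zero padding, no crop, blocks tiled from the origin, and a row-major linear ordering (last dimension varying fastest), none of which is stated in this reference.

```python
from typing import Tuple


def block_containing_coords_sketch(
    coords: Tuple[int, ...],
    shape_full: Tuple[int, ...],
    shape_block: Tuple[int, ...],
) -> int:
    """Linear index of the block containing `coords` (hypothetical helper).

    Assumptions: zero padding, no crop, blocks tiled from the origin,
    row-major linear ordering (last dimension varies fastest).
    """
    # Number of blocks per dimension; ceil division covers a partial last block.
    n_per_dim = [-(-full // block) for full, block in zip(shape_full, shape_block)]

    lin_idx = 0
    for c, full, block, n in zip(coords, shape_full, shape_block, n_per_dim):
        if not 0 <= c < full:
            raise ValueError(f"coordinate {c} outside dimension of size {full}")
        # c // block is the cartesian block index along this dimension.
        lin_idx = lin_idx * n + c // block
    return lin_idx


# A (4, 4, 6) stack in (2, 2, 2) blocks gives a 2x2x3 block tiling.
first = block_containing_coords_sketch((0, 0, 0), (4, 4, 6), (2, 2, 2))
last = block_containing_coords_sketch((3, 3, 5), (4, 4, 6), (2, 2, 2))
```

With padding or cropping the block boundaries shift, so the plain floor division used here would no longer be sufficient.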
-
blocks_to_take(start_take, end_take)[source]¶ Find which blocks to take to cover the given range.
- Parameters
start_take – starting points in the N dims (tuple)
end_take – ending points in the N dims (tuple)
- Returns
tuple of tuples with the extremes of blocks to take in N dims; starting index of data in the first block; ending index of data in the last block.
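One way to picture the three returned values is the sketch below. It is only an illustration under stated assumptions: the helper name `blocks_to_take_sketch` is hypothetical, padding and crop are ignored, the end point is treated as exclusive, and the block extremes are returned as (first, one-past-last) pairs. The real method may use different conventions.

```python
from typing import Tuple


def blocks_to_take_sketch(
    start_take: Tuple[int, ...],
    end_take: Tuple[int, ...],
    shape_block: Tuple[int, ...],
):
    """Blocks covering [start_take, end_take) per dimension (hypothetical helper).

    Assumptions: zero padding, no crop, exclusive end point, end > start.
    """
    block_range = []
    start_in_first = []
    end_in_last = []
    for start, end, block in zip(start_take, end_take, shape_block):
        first = start // block          # first block touched by the range
        last = (end - 1) // block + 1   # one past the last block touched
        block_range.append((first, last))
        # Offsets of the requested data inside the first and last blocks.
        start_in_first.append(start - first * block)
        end_in_last.append(end - (last - 1) * block)
    return tuple(block_range), tuple(start_in_first), tuple(end_in_last)


# Range 3..9 along a dimension of 4-wide blocks touches blocks 0, 1 and 2.
result = blocks_to_take_sketch((3, 0), (9, 4), (4, 4))
```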
-
cartesian_to_linear(ca_idx)[source]¶ Convert a block cartesian index into a linear index. Example: in a 3D stack split in 2x2x3 blocks,
self.cartesian_to_linear(0,0,0) = 0 # first block
bs.cartesian_to_linear(1,1,2) = 11 # last block
- Parameters
ca_idx – block cartesian index (tuple of ints)
- Returns
block linear index (int)
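The conversion can be sketched in a few lines of plain Python. The helper `cartesian_to_linear_sketch` is hypothetical, and the row-major ordering (last dimension varying fastest) is an assumption: the documented example only pins down the first and last blocks, which any ordering would satisfy.

```python
from typing import Sequence


def cartesian_to_linear_sketch(
    ca_idx: Sequence[int], n_blocks_per_dim: Sequence[int]
) -> int:
    """Row-major linearisation of a block cartesian index (hypothetical helper)."""
    lin_idx = 0
    for idx, n in zip(ca_idx, n_blocks_per_dim):
        if not 0 <= idx < n:
            raise IndexError(f"block index {idx} out of range for {n} blocks")
        # Shift what has been accumulated so far, then add this dimension's index.
        lin_idx = lin_idx * n + idx
    return lin_idx


# The 2x2x3 tiling from the example above.
first_block = cartesian_to_linear_sketch((0, 0, 0), (2, 2, 3))
last_block = cartesian_to_linear_sketch((1, 1, 2), (2, 2, 3))
```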
-
property
crop¶
-
drop_dim(dim_to_drop)[source]¶ Return a new BlockSplitter object with a dimension dropped, useful for getting spatial from spatio-temporal blocks.
- Parameters
dim_to_drop – dimension to be dropped (int)
- Returns
new BlockSplitter object
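As a rough picture of what dropping a dimension means for the shape tuples, the sketch below removes one axis from a shape. The helper `drop_dim_sketch` and the example shapes are hypothetical; the real method returns a new object rather than bare tuples, and how it treats padding and crop is not documented here.

```python
from typing import Tuple


def drop_dim_sketch(shape: Tuple[int, ...], dim_to_drop: int) -> Tuple[int, ...]:
    """Return `shape` without the axis `dim_to_drop` (hypothetical helper)."""
    if not 0 <= dim_to_drop < len(shape):
        raise IndexError(f"dimension {dim_to_drop} out of range for {len(shape)} dims")
    return tuple(s for i, s in enumerate(shape) if i != dim_to_drop)


# Spatio-temporal (t, z, y, x) shapes reduced to spatial (z, y, x) shapes.
spatial_full = drop_dim_sketch((100, 20, 512, 512), 0)
spatial_block = drop_dim_sketch((100, 5, 512, 512), 0)
```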
-
linear_to_cartesian(lin_idx)[source]¶ Convert a block linear index into a cartesian index. Example: in a 3D stack split in 2x2x3 blocks,
self.linear_to_cartesian(0) = (0,0,0) # first block
bs.linear_to_cartesian(11) = (1,1,2) # last block
- Parameters
lin_idx – block linear index (int)
- Returns
block cartesian index (tuple of ints)
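The inverse conversion can be sketched with repeated `divmod`, peeling off one dimension at a time. The helper `linear_to_cartesian_sketch` is hypothetical and assumes a row-major ordering (last dimension varying fastest), which this reference does not state explicitly.

```python
from typing import Sequence, Tuple


def linear_to_cartesian_sketch(
    lin_idx: int, n_blocks_per_dim: Sequence[int]
) -> Tuple[int, ...]:
    """Row-major inverse of the block linearisation (hypothetical helper)."""
    total = 1
    for n in n_blocks_per_dim:
        total *= n
    if not 0 <= lin_idx < total:
        raise IndexError(f"linear index {lin_idx} out of range for {total} blocks")

    ca_idx = []
    # The last dimension varies fastest, so peel indices off from the end.
    for n in reversed(n_blocks_per_dim):
        lin_idx, idx = divmod(lin_idx, n)
        ca_idx.append(idx)
    return tuple(reversed(ca_idx))


# The 2x2x3 tiling from the example above.
first_block = linear_to_cartesian_sketch(0, (2, 2, 3))
last_block = linear_to_cartesian_sketch(11, (2, 2, 3))
```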
-
property
n_blocks¶
-
property
n_dims¶
-
neighbour_blocks(i_block, dims=None)[source]¶ Return neighbouring blocks across given dimensions.
-
property
padding¶
-
serialize()[source]¶ Return a dictionary with a complete description of the BlockSplitter, e.g. to save its structure as a JSON file.
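To illustrate the idea of a JSON-friendly description, the sketch below builds a plain dictionary and round-trips it through `json`. The function `serialize_sketch` and its dictionary keys are assumptions borrowed from the property names on this page; the keys actually produced by serialize() are not documented here.

```python
import json


def serialize_sketch(shape_full, shape_block, padding=0, crop=None) -> dict:
    """Plain-dict description of a block tiling (hypothetical helper and keys)."""
    return {
        # Tuples become lists so the JSON round trip compares equal.
        "shape_full": list(shape_full),
        "shape_block": list(shape_block),
        "padding": padding,
        "crop": crop,
    }


description = serialize_sketch((4, 4, 6), (2, 2, 2))
text = json.dumps(description)   # string that could be written to a .json file
restored = json.loads(text)      # the same description read back
```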
-
property
shape_block¶
-
property
shape_full¶
split_dataset.split_dataset module¶
-
class
split_dataset.split_dataset.EmptySplitDataset(root, name, *args, resolution=None, **kwargs)[source]¶ Bases:
split_dataset.blocks.Blocks
Class to initialize an empty dataset for which we have to save metadata after filling its blocks.
-
save_block_data(n, data, verbose=False)[source]¶ Optional method to save data in a block. Often we don’t use it, as we directly save data in the parallelized function. Might be good to find ways of centralizing saving here?
- Parameters
n – number of the block we are saving in
data – data to be poured in the block
-
-
class
split_dataset.split_dataset.SplitDataset(root, prefix=None)[source]¶ Bases:
split_dataset.blocks.Blocks
Manages datasets split over multiple h5 files across arbitrary dimensions. To do so, it uses the BlockSplitter class functions and defines blocks as files.
-
as_dask()[source]¶ Create a Dask array from this SplitDataset object.
- Returns
Dask array
-
property
data_key¶ Supports a smooth migration towards removing the stack_ND key in favour of only stack.
-
Module contents¶
Top-level package for Split Dataset.