NestedDataset

class torch_brain.dataset.NestedDataset(datasets, transform=None)

Bases: torch_brain.dataset.dataset.Dataset

Dataset that composes multiple Dataset instances under a single interface.

Each child dataset is namespaced by a string prefix (its dataset name). Exposed recording_ids therefore take the form "<dataset_name>/<recording_id>". The nested dataset behaves like a regular Dataset, dispatching all operations to the appropriate child dataset based on this prefix.

Instances of NestedDataset can themselves be nested inside other NestedDataset objects, allowing for arbitrary-depth hierarchies of datasets while preserving the same prefix-based naming convention.

See Namespacing for how Data attributes are namespaced.
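The prefix-based dispatch described above can be sketched with toy stand-in classes. This is a minimal illustration and not the torch_brain implementation: ToyDataset and ToyNested are hypothetical names, and the recordings are plain strings. It assumes that only the first "/" separates the dataset name from the remainder, which is what lets nested hierarchies resolve one level at a time.

```python
class ToyDataset:
    """Stand-in for a leaf dataset: maps recording ids to payloads."""

    def __init__(self, recordings):
        self._recordings = dict(recordings)

    def get_recording(self, recording_id):
        return self._recordings[recording_id]


class ToyNested:
    """Stand-in for prefix-based dispatch; not the torch_brain implementation."""

    def __init__(self, datasets):
        self.datasets = dict(datasets)

    def get_recording(self, recording_id):
        # Split only on the first "/" so deeper prefixes survive for inner levels.
        name, sep, rest = recording_id.partition("/")
        if not sep:
            raise ValueError(f"missing dataset prefix in {recording_id!r}")
        return self.datasets[name].get_recording(rest)


# A nested container placed inside another nested container.
inner = ToyNested({"monkey_a": ToyDataset({"session_1": "data-a1"})})
outer = ToyNested({"lab_x": inner, "lab_y": ToyDataset({"session_9": "data-y9"})})

two_level = outer.get_recording("lab_x/monkey_a/session_1")
one_level = outer.get_recording("lab_y/session_9")
```

Each level consumes exactly one prefix, so the outer container never needs to know how deep the hierarchy below it is.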

Parameters:
  • datasets (Union[Iterable[Dataset], Mapping[str, Dataset]]) – Either a mapping from dataset name to Dataset instance, or a list/tuple of Dataset instances. When a list/tuple is given, dataset names are inferred from the class names of the datasets. In this case, duplicate class names are not allowed.

  • transform (Optional[Callable]) – Optional transform that is applied to samples in __getitem__().
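To illustrate how names might be inferred when a list or tuple is passed, here is a hedged sketch. The helper infer_dataset_names and the two dataset classes are hypothetical, and raising ValueError on duplicates is an assumption; the documentation above only states that duplicate class names are not allowed.

```python
from collections import Counter


def infer_dataset_names(datasets):
    """Map each dataset to its class name, rejecting duplicate class names."""
    names = [type(ds).__name__ for ds in datasets]
    duplicates = sorted(name for name, count in Counter(names).items() if count > 1)
    if duplicates:
        # The exception type is an assumption, not taken from the library.
        raise ValueError(f"duplicate dataset class names: {duplicates}")
    return dict(zip(names, datasets))


class SpikesDataset:  # hypothetical dataset class
    pass


class CalciumDataset:  # hypothetical dataset class
    pass


named = infer_dataset_names([SpikesDataset(), CalciumDataset()])
names = list(named)
```

When two children share a class, passing a mapping with explicit names avoids the collision.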

property datasets: dict[str, Dataset]

The underlying mapping from dataset name to Dataset.

get_recording(recording_id, _namespace='')

Return a full Data recording from the appropriate child dataset.

Parameters:
  • recording_id (str) – Recording identifier of the form "<dataset_name>/<recording_id>".

  • _namespace (str) – Internal namespace string propagated to child datasets. End users normally do not need to pass this explicitly.

Returns:

The selected Data object.

Return type:

Data

Raises:

ValueError – If the recording_id does not contain a dataset prefix.

__getitem__(index)

Return a sample specified by a DatasetIndex.

The index.recording_id must include a dataset prefix of the form "<dataset_name>/<recording_id>". The index is rewritten to strip the prefix and forwarded to the selected child dataset. If a transform was provided at construction time, it is applied to the resulting sample before returning it.

Parameters:

index (DatasetIndex) – DatasetIndex containing the full nested recording_id.

Returns:

The sampled Data object from the correct sub-dataset.

Return type:

Data

Raises:

ValueError – If index.recording_id does not contain a dataset prefix.
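The rewrite-and-forward behavior of __getitem__() can be sketched as follows. ToyIndex is a hypothetical stand-in for DatasetIndex; its start and end fields are assumptions made for illustration, as are the helper names strip_prefix and get_item. Child datasets are modeled as plain callables.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyIndex:
    """Hypothetical stand-in for DatasetIndex; start/end fields are assumptions."""

    recording_id: str
    start: float
    end: float


def strip_prefix(index):
    """Return (dataset_name, copy of index with the prefix removed)."""
    name, sep, rest = index.recording_id.partition("/")
    if not sep:
        raise ValueError(f"missing dataset prefix in {index.recording_id!r}")
    return name, replace(index, recording_id=rest)


def get_item(children, index, transform=None):
    name, child_index = strip_prefix(index)
    sample = children[name](child_index)  # each child here is a plain callable
    # The transform, if any, runs on the sample returned by the child.
    return transform(sample) if transform is not None else sample


children = {"lab_x": lambda idx: (idx.recording_id, idx.start, idx.end)}
sample = get_item(children, ToyIndex("lab_x/session_1", 0.0, 1.0))
tagged = get_item(
    children,
    ToyIndex("lab_x/session_1", 0.0, 1.0),
    transform=lambda s: ("seen", s),
)
```

The child only ever sees the un-prefixed recording id, so it needs no knowledge of where it sits in the hierarchy.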

get_sampling_intervals(*args, **kwargs)

Return sampling intervals for all recordings across child datasets.

Any positional and keyword arguments are forwarded to the underlying datasets’ get_sampling_intervals() methods. Keys in the returned dictionary are prefixed with the corresponding dataset name so that they match the nested "<dataset_name>/<recording_id>" convention.

Returns:

Mapping from each nested recording id to its Interval, covering all child datasets.

Return type:

dict[str, Interval]
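The key prefixing can be sketched as a simple merge of per-dataset dictionaries. This is an illustration under assumptions: merge_prefixed is a hypothetical helper, and tuples stand in for Interval objects.

```python
def merge_prefixed(per_dataset):
    """Flatten {dataset_name: {recording_id: value}} into {"name/recording_id": value}."""
    merged = {}
    for name, intervals in per_dataset.items():
        for recording_id, interval in intervals.items():
            merged[f"{name}/{recording_id}"] = interval
    return merged


# Tuples stand in for Interval objects purely for illustration.
per_dataset = {
    "lab_x": {"session_1": (0.0, 10.0)},
    "lab_y": {"session_1": (5.0, 8.0)},
}
merged = merge_prefixed(per_dataset)
```

Because each key carries its dataset name, two children can hold recordings with the same id without colliding in the merged dictionary.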