torch_brain.dataset
Overview
Base classes to ease creation of PyTorch datasets for your data.
The Dataset class is the base class that all datasets inherit from. It handles opening and accessing a single dataset.
The NestedDataset class is for opening and accessing multiple datasets through a unified interface.
Mixin classes are provided to add modality-specific functionality to the Dataset classes.
Dataset
torch_brain’s Dataset class (and its subclasses) allows you to sample time slices of your data.
This is a major deviation from the standard torch.utils.data.Dataset, which is indexed by integers.
To support access to arbitrary time slices, our Dataset class is indexed by three values:
The recording id from which you want the slice,
Start time of the slice, and
End time of the slice
These are put into a DatasetIndex object, which is then used to index the Dataset.
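The idea of indexing by a recording id and a time window, rather than by an integer, can be illustrated with a small self-contained sketch. This is not torch_brain's implementation: ToyIndex and ToyTimeSliceDataset are hypothetical stand-ins, the field names recording_id, start, and end are assumptions, and the half-open interval convention is a choice made only for this example.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyIndex:
    """Hypothetical stand-in for DatasetIndex; the field names are assumptions."""
    recording_id: str
    start: float  # slice start time, in seconds
    end: float    # slice end time, in seconds


class ToyTimeSliceDataset:
    """Toy dataset storing (timestamps, values) per recording, sorted by time."""

    def __init__(self, recordings):
        # recordings: dict mapping recording id -> (timestamps, values)
        self.recordings = recordings

    def __getitem__(self, index):
        timestamps, values = self.recordings[index.recording_id]
        # Keep samples with start <= t < end (half-open, by choice of this sketch).
        lo = bisect_left(timestamps, index.start)
        hi = bisect_left(timestamps, index.end)
        return timestamps[lo:hi], values[lo:hi]


dataset = ToyTimeSliceDataset({
    "rec_a": ([0.0, 0.5, 1.0, 1.5, 2.0], [10, 11, 12, 13, 14]),
})

# Index with an object instead of an integer.
times, values = dataset[ToyIndex("rec_a", start=0.5, end=1.5)]
```

Because the index carries its own time window, the same recording can yield slices of any length and offset without the dataset having to enumerate them in advance.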
Since different machine learning applications require different ways of sampling, we provide a collection of
samplers which are responsible for creating these DatasetIndex objects.
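As a rough illustration of what a sampler has to produce, the sketch below yields one index per fixed-length, non-overlapping window in each recording. It does not use torch_brain's samplers: ToyIndex, fixed_window_indices, and the intervals argument are hypothetical names introduced only for this example.

```python
from typing import Dict, Iterator, NamedTuple, Tuple


class ToyIndex(NamedTuple):
    """Hypothetical stand-in for DatasetIndex; the field names are assumptions."""
    recording_id: str
    start: float
    end: float


def fixed_window_indices(
    intervals: Dict[str, Tuple[float, float]], window: float
) -> Iterator[ToyIndex]:
    """Yield back-to-back windows of length `window` for every recording.

    `intervals` maps a recording id to its (start, end) time span. A trailing
    remainder shorter than `window` is dropped.
    """
    for recording_id, (t0, t1) in intervals.items():
        n_windows = int((t1 - t0) // window)
        for k in range(n_windows):
            start = t0 + k * window
            yield ToyIndex(recording_id, start, start + window)


indices = list(
    fixed_window_indices({"rec_a": (0.0, 2.5), "rec_b": (1.0, 2.0)}, window=1.0)
)
```

A training-time sampler might instead draw window start times at random; either way the sampler's output is a sequence of index objects, and the dataset only has to resolve each one to data.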
NestedDataset
The Dataset class is designed to operate on a single dataset. However, many modern ML methods perform
training over multiple datasets. For this, we provide NestedDataset, which allows users to open and index into
multiple datasets.
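One way to picture a unified interface over several datasets is a wrapper that routes each request to the right child dataset. The sketch below is a toy under stated assumptions, not NestedDataset itself: ToyDataset, ToyNestedDataset, the get_slice method, and the "dataset_name/recording_id" naming scheme are all hypothetical.

```python
class ToyDataset:
    """Toy single dataset: recording id -> list of (time, value) samples."""

    def __init__(self, recordings):
        self.recordings = recordings

    def get_slice(self, recording_id, start, end):
        # Return samples with start <= time < end.
        return [
            (t, v) for t, v in self.recordings[recording_id] if start <= t < end
        ]


class ToyNestedDataset:
    """Toy wrapper that exposes several datasets through one interface.

    Assumption of this sketch: recordings are addressed as
    "<dataset_name>/<recording_id>".
    """

    def __init__(self, datasets):
        self.datasets = datasets  # dataset name -> ToyDataset

    def get_slice(self, recording_id, start, end):
        # Split off the dataset name, then delegate to that child dataset.
        dataset_name, _, inner_id = recording_id.partition("/")
        return self.datasets[dataset_name].get_slice(inner_id, start, end)


nested = ToyNestedDataset({
    "monkey": ToyDataset({"rec_1": [(0.0, 1), (0.5, 2), (1.0, 3)]}),
    "human": ToyDataset({"rec_1": [(0.0, 7), (2.0, 8)]}),
})

monkey_slice = nested.get_slice("monkey/rec_1", 0.0, 1.0)
human_slice = nested.get_slice("human/rec_1", 1.0, 3.0)
```

Prefixing the recording id with the dataset name keeps ids unique even when two datasets reuse the same recording name, as "rec_1" does here.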