torch_brain.datasets#

Base classes for creating PyTorch datasets

Module Overview#

This module contains base classes to ease creation of PyTorch datasets for your data.

  • The Dataset class is the base class that all datasets inherit from. It handles opening and accessing a single dataset.

  • The NestedDataset class is for opening and accessing multiple datasets through a unified interface.

  • Mixin classes are provided to add modality-specific functionality to Dataset subclasses.

Dataset#

torch_brain’s Dataset class (and its subclasses) lets you sample time slices of your data. This is a major departure from the standard torch.utils.data.Dataset, which is indexed by integers. To support access to arbitrary time slices, our Dataset class is indexed by three things:

  1. The recording id from which you want the slice,

  2. The start time of the slice, and

  3. The end time of the slice.

These are put into a DatasetIndex object, which is then used to index the Dataset. Since different machine learning applications require different ways of sampling, we provide a collection of samplers which are responsible for creating these DatasetIndex objects.
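The indexing scheme described above can be illustrated with a minimal, self-contained sketch. It does not use torch_brain's actual API: ToyIndex, ToyTimeSliceDataset, and sequential_windows are hypothetical stand-ins for DatasetIndex, Dataset, and a sampler, and the half-open interval [start, end) is an assumption made for this example only.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyIndex:
    """Hypothetical stand-in for DatasetIndex: (recording id, start, end)."""
    recording_id: str
    start: float
    end: float


class ToyTimeSliceDataset:
    """Hypothetical dataset indexed by a ToyIndex instead of an integer."""

    def __init__(self, recordings):
        # recordings: recording id -> sorted list of event timestamps (seconds)
        self.recordings = recordings

    def __getitem__(self, index):
        timestamps = self.recordings[index.recording_id]
        # Assumption for this sketch: keep events in the half-open
        # interval [start, end).
        lo = bisect_left(timestamps, index.start)
        hi = bisect_left(timestamps, index.end)
        return timestamps[lo:hi]


def sequential_windows(recording_id, duration, window):
    """Toy sampler: yield back-to-back, non-overlapping windows."""
    start = 0.0
    while start + window <= duration:
        yield ToyIndex(recording_id, start, start + window)
        start += window


dataset = ToyTimeSliceDataset({"session_a": [0.1, 0.4, 1.2, 1.9, 2.5, 3.7]})
indices = list(sequential_windows("session_a", duration=4.0, window=2.0))
samples = [dataset[i] for i in indices]
```

Here the sampler decides which time slices are visited, while the dataset only knows how to return the data for one slice. Swapping in a different sampling strategy (for example, random windows) would not require changing the dataset.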

NestedDataset#

The Dataset class is designed to operate on a single dataset. However, many modern ML methods train on multiple datasets at once. For this, we provide NestedDataset, which lets users open and index multiple datasets through a single interface.
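The idea of composing several datasets behind one interface can be sketched as follows. This is an illustration only, not torch_brain's implementation: ToyIndex, ToyDataset, and ToyNestedDataset are hypothetical, and addressing a recording as "<dataset name>/<recording name>" is an assumption chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyIndex:
    """Hypothetical index: (recording id, start, end)."""
    recording_id: str  # assumed form here: "<dataset name>/<recording name>"
    start: float
    end: float


class ToyDataset:
    """Hypothetical single dataset holding timestamps per recording."""

    def __init__(self, recordings):
        self.recordings = recordings  # recording name -> list of timestamps

    def __getitem__(self, index):
        values = self.recordings[index.recording_id]
        return [t for t in values if index.start <= t < index.end]


class ToyNestedDataset:
    """Hypothetical wrapper that routes each index to one child dataset."""

    def __init__(self, datasets):
        self.datasets = datasets  # dataset name -> ToyDataset

    def __getitem__(self, index):
        # Split off the dataset-name prefix, then delegate to that child.
        name, recording = index.recording_id.split("/", 1)
        child_index = ToyIndex(recording, index.start, index.end)
        return self.datasets[name][child_index]


nested = ToyNestedDataset({
    "reach": ToyDataset({"s1": [0.2, 0.8, 1.5]}),
    "sleep": ToyDataset({"n1": [10.0, 12.5]}),
})
reach_slice = nested[ToyIndex("reach/s1", 0.0, 1.0)]
sleep_slice = nested[ToyIndex("sleep/n1", 12.0, 13.0)]
```

Because the wrapper accepts the same kind of index as a single dataset, a sampler can draw slices from any of the child datasets without knowing how many there are.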

Base Classes#

Dataset

PyTorch Dataset for loading time-slices of neural data recordings from HDF5 files.

DatasetIndex

Index for accessing a specific time interval of a recording within a Dataset.

NestedDataset

Dataset that composes multiple Dataset instances under a single interface.

OpenNeuroDataset

Base class for OpenNeuro datasets.

Mixins#

SpikingDatasetMixin

Mixin class for torch_brain.datasets.Dataset subclasses containing spiking data.

CalciumImagingDatasetMixin

Mixin class for torch_brain.datasets.Dataset subclasses containing calcium imaging data.

MultiChannelDatasetMixin

Mixin class for torch_brain.datasets.Dataset subclasses containing multi-channel recordings (e.g., EEG, ECoG, EMG, sEEG).
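As a rough illustration of how a mixin adds modality-specific helpers to a dataset subclass, consider the sketch below. All names (ToyDataset, ToySpikingMixin, ToySpikingDataset, unit_ids) are hypothetical and are not methods of the real mixins listed above; the data layout is likewise assumed for the example.

```python
class ToyDataset:
    """Hypothetical base class: stores a data dictionary per recording."""

    def __init__(self, recordings):
        self.recordings = recordings


class ToySpikingMixin:
    """Hypothetical mixin: adds a helper that only makes sense for spikes."""

    def unit_ids(self, recording_id):
        # Assumed layout: each spike is a (timestamp, unit id) pair.
        spikes = self.recordings[recording_id]["spikes"]
        return sorted({unit for _, unit in spikes})


class ToySpikingDataset(ToySpikingMixin, ToyDataset):
    """Concrete dataset: base behavior plus the spiking helper."""


ds = ToySpikingDataset(
    {"s1": {"spikes": [(0.1, "u2"), (0.3, "u1"), (0.9, "u2")]}}
)
units = ds.unit_ids("s1")
```

The mixin carries no state of its own and relies on attributes set up by the base class, which keeps modality-specific code separate from the generic loading logic.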

Electrophysiology Datasets#

PerichMillerPopulation2018

Motor cortex (M1 and PMd) spiking activity and reaching kinematics from four macaques performing center-out and random target reaching tasks.

PeiPandarinathNLB2021

Curated spiking neural activity datasets from the Neural Latents Benchmark 2021 (NLB'21).

FlintSlutzkyAccurate2012

Motor cortex (M1) spiking activity and reaching kinematics from one monkey performing center-out reaching tasks.

ChurchlandShenoyNeural2012

Motor cortex (M1 and PMd) spiking activity and reaching kinematics from two monkeys performing center-out reaching tasks with the right hand.

OdohertySabesNonhuman2017

Motor cortex (M1 and S1) spiking activity and reaching kinematics from two monkeys performing random target reaching tasks with the right hand.

VollanMoserAlternating2025

Neuropixels recordings from MEC and hippocampus in rats during spatial navigation and sleep.

ShiraziHBNR1DS005505

Shirazi HBN Resting State 1 (HBN-R1) EEG Dataset (OpenNeuro DS005505).

Calcium Imaging Datasets#

AllenVisualCodingOphys2016

Two-photon calcium imaging of mouse visual cortex from the Allen Brain Observatory Visual Coding dataset, recorded during presentation of visual stimuli.

iEEG Datasets#

Neuroprobe2025

Neuroprobe 2025 iEEG benchmark dataset.

KochiVisualNamingDS006914

Kochi Visual Naming iEEG Dataset (OpenNeuro DS006914).

EEG Datasets#

KlinzingSleepDS005555

Klinzing Sleep EEG Dataset (OpenNeuro DS005555).

PSG Datasets#

KempSleepEDF2013

Sleep-EDF Database Expanded, containing 197 whole-night polysomnographic sleep recordings.