The implementation of "MBrain: A Multi-channel Self-Supervised Learning Framework for Brain Signals".

ilikevegetable/MBrain

MBrain: A Multi-channel Self-Supervised Learning Framework for Brain Signals (KDD'23)

Donghong Cai*, Junru Chen*, Yang Yang, Teng Liu, Yafeng Li

Zhejiang University, Nuozhu Technology Co., Ltd.

MBrain is a general multi-channel self-supervised learning framework that unifies representation learning for EEG and SEEG brain signal data.

Due to the requirements of our partner hospital and company, we are currently unable to release our private SEEG data or the corresponding data pipeline (including the dataloader).

Data

We use the Temple University Hospital EEG Seizure Corpus (TUSZ) v1.5.2 as our EEG dataset for the seizure detection experiments; it is publicly available here. We used the code from eeg-gnn-ssl to read the EEG data, and sampled from the complete dataset to generate the self-supervised training set and the downstream task datasets. Run the following commands to clone eeg-gnn-ssl:

cd BrainSignalSSL_EEG/data
git clone https://github.com/tsy935/eeg-gnn-ssl.git
mv eeg_sampling.ipynb eeg-gnn-ssl

Then, refer to the code in ./BrainSignalSSL_EEG/data/eeg-gnn-ssl/eeg_sampling.ipynb to generate the dataset for this experiment.

For the emotion recognition experiments reported in the appendix, we use the SJTU Emotion EEG Dataset (SEED), which is publicly available here. After downloading the dataset, refer to the code in ./BrainSignalSSL_EEG_Emo/data/eeg_emotion_sampling.ipynb to process the raw data and generate the self-supervised training set and the downstream task datasets.

Experiments

Self-Supervised Learning

To pretrain MBrain on the EEG dataset for seizure detection, run:

python ./BrainSignalSSL_EEG/ssl_train.py --data_dir <path_to_your_data> --save_dir <path_to_save_ssl_model>

where <path_to_your_data> is the path to your generated EEG dataset and <path_to_save_ssl_model> is the path where the MBrain checkpoints are saved.

Similarly, to pretrain MBrain on the dataset for emotion recognition, run:

python ./BrainSignalSSL_EEG_Emo/ssl_train.py --data_dir <path_to_your_data> --save_dir <path_to_save_ssl_model>
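
The two pretraining commands differ only in the script path, so they can be assembled programmatically. The sketch below is a hypothetical helper: build_pretrain_command and the task labels "seizure" and "emotion" are not part of the repository, and the example paths are placeholders. It uses only the script paths and the --data_dir / --save_dir flags shown above, and returns an argument list that could be passed to subprocess.run.

```python
import sys

# Script paths as given in this README; the dictionary keys are
# hypothetical labels chosen for this sketch.
PRETRAIN_SCRIPTS = {
    "seizure": "./BrainSignalSSL_EEG/ssl_train.py",
    "emotion": "./BrainSignalSSL_EEG_Emo/ssl_train.py",
}


def build_pretrain_command(task, data_dir, save_dir, python=sys.executable):
    """Return the argv list for one of the pretraining commands above."""
    if task not in PRETRAIN_SCRIPTS:
        raise ValueError(
            f"unknown task {task!r}; expected one of {sorted(PRETRAIN_SCRIPTS)}"
        )
    return [
        python,
        PRETRAIN_SCRIPTS[task],
        "--data_dir", str(data_dir),
        "--save_dir", str(save_dir),
    ]


if __name__ == "__main__":
    # Placeholder paths. The command is only printed here; pass the list to
    # subprocess.run(cmd, check=True) to actually launch pretraining.
    cmd = build_pretrain_command("seizure", "data/tusz", "checkpoints/ssl")
    print(" ".join(cmd))
```

Returning a list rather than a single string avoids shell quoting problems when a path contains spaces.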

Downstream Task

After self-supervised pre-training, run the seizure detection downstream task with:

python ./BrainSignalSSL_EEG/downstream_repeat.py --ssl_dir <path_to_ssl_checkpoint> --data_dir <path_to_your_data> --save_dir <path_to_save_downstream_model>

where <path_to_ssl_checkpoint> is the path to the pre-trained MBrain checkpoint, <path_to_your_data> is the path to your generated EEG dataset, and <path_to_save_downstream_model> is the path where the downstream model checkpoints are saved.

This script repeats the downstream experiment five times: the downstream model is trained and tested 5 times, each time with a different random seed.
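
The repeat protocol can be pictured with a small sketch. It assumes a hypothetical run_once(seed) callable that trains and evaluates one downstream model and returns a single score. The seed values, the function names, and the mean / standard deviation summary are illustrative assumptions, not taken from downstream_repeat.py.

```python
import random
import statistics


def repeat_experiment(run_once, seeds=(0, 1, 2, 3, 4)):
    """Call run_once(seed) for each seed and summarise the scores."""
    scores = [run_once(seed) for seed in seeds]
    return {
        "scores": scores,
        "mean": statistics.mean(scores),
        # Sample standard deviation across the repeated runs.
        "std": statistics.stdev(scores) if len(scores) > 1 else 0.0,
    }


def dummy_run(seed):
    """Stand-in for training plus testing; returns a fake, seed-dependent score."""
    rng = random.Random(seed)  # local generator, so each run is reproducible
    return 0.8 + 0.05 * rng.random()


if __name__ == "__main__":
    summary = repeat_experiment(dummy_run)
    print(f"mean={summary['mean']:.4f} std={summary['std']:.4f}")
```

Reporting the spread alongside the mean shows how sensitive the downstream result is to the random seed.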

Similarly, after self-supervised pre-training, run the emotion recognition downstream task with:

python ./BrainSignalSSL_EEG_Emo/downstream_repeat.py --ssl_dir <path_to_ssl_checkpoint> --data_dir <path_to_your_data> --save_dir <path_to_save_downstream_model>

About SEEG

As mentioned in our paper, seizure detection on SEEG is significantly more challenging than on EEG: the number of channels and the epileptic patterns vary across patients, and epileptic activity must be predicted at the channel level (unlike EEG, where a prediction is made for an entire segment). Moreover, the data processing workflow and the dataloader design for the Domain Generalization Experiment are considerably more complex. However, due to the requirements of our partner hospital and company, we are currently unable to open-source this part of the dataset and code. If you are also working on self-supervised training with varying numbers of channels and channel-level downstream prediction tasks, we believe the code in ./BrainSignalSSL_SEEG will still provide valuable insights. Lastly, we remain committed to open-sourcing the SEEG data and its associated pipeline.
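
One common way to batch recordings with different channel counts is to pad each recording to the largest channel count and carry a mask, so that channel-level losses and predictions can ignore the padding. The sketch below illustrates that general idea with plain Python lists. pad_channels and its arguments are hypothetical, and this is not necessarily how ./BrainSignalSSL_SEEG handles the problem.

```python
def pad_channels(batch, pad_value=0.0):
    """Pad recordings to a common channel count.

    batch: list of recordings, each a list of channels, each channel a
           list of samples (all channels share one length).
    Returns (padded, mask), where mask[i][c] is True for real channels
    and False for padding channels.
    """
    if not batch or any(not rec for rec in batch):
        raise ValueError("every recording needs at least one channel")
    n_samples = len(batch[0][0])
    max_channels = max(len(rec) for rec in batch)
    padded, mask = [], []
    for rec in batch:
        if any(len(ch) != n_samples for ch in rec):
            raise ValueError("all channels must have the same number of samples")
        missing = max_channels - len(rec)
        # Copy the real channels, then append constant-valued dummy channels.
        dummy = [[pad_value] * n_samples for _ in range(missing)]
        padded.append([list(ch) for ch in rec] + dummy)
        mask.append([True] * len(rec) + [False] * missing)
    return padded, mask


if __name__ == "__main__":
    patient_a = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]  # 2 channels
    patient_b = [[1.0, 1.1, 1.2]]                   # 1 channel
    padded, mask = pad_channels([patient_a, patient_b])
    print(mask)
```

In a deep learning framework, the mask would typically be converted to a tensor and applied to the per-channel loss so that padded channels contribute nothing.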

Citation

Please consider citing the following paper when using this code for your application.

@inproceedings{10.1145/3580305.3599426,
  title = {MBrain: A Multi-channel Self-Supervised Learning Framework for Brain Signals},
  author = {Cai, Donghong and Chen, Junru and Yang, Yang and Liu, Teng and Li, Yafeng},
  booktitle = {Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
  publisher = {Association for Computing Machinery},
  pages = {130--141},
  year = {2023},
  isbn = {9798400701030},
  url = {https://doi.org/10.1145/3580305.3599426},
  doi = {10.1145/3580305.3599426},
  numpages = {12},
  series = {KDD '23}
}
