NEST: Neural EEG
Sequence Transducer
A transformer-based approach for decoding EEG brain signals into natural language, evaluated on the ZuCo benchmark.
We present NEST (Neural EEG Sequence Transducer), a deep learning framework for decoding EEG brain signals into natural language text. NEST uses word-level EEG frequency features (105 channels × 8 bands = 840 dimensions per word) extracted from the ZuCo corpus, projected through a 6-layer transformer encoder and decoded by a fine-tuned BART language model with cross-attention.
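As a concrete illustration of the input layout described above, each word's features flatten 105 channels × 8 frequency bands into a single 840-dimensional vector. This is a minimal sketch; the function name and random data are illustrative, not the project's code:

```python
import numpy as np

N_CHANNELS, N_BANDS = 105, 8  # ZuCo: 105 EEG channels, 8 frequency bands

def word_feature_vector(band_power: np.ndarray) -> np.ndarray:
    """Flatten a per-word (channels, bands) band-power array into the
    840-dim input vector fed to the transformer encoder."""
    assert band_power.shape == (N_CHANNELS, N_BANDS)
    return band_power.reshape(-1)  # shape (840,)

rng = np.random.default_rng(0)
vec = word_feature_vector(rng.standard_normal((N_CHANNELS, N_BANDS)))
print(vec.shape)  # (840,)
```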
We evaluate under a rigorous subject-independent protocol: training on 8 subjects and testing on 2 held-out subjects (ZMG, ZPH). This setup prevents subject-level data leakage and provides an unbiased estimate of cross-subject generalization. Benchmark results are pending completion of full training runs.
We demonstrate that preprocessed EEG frequency features aligned to word-reading events contain structured information sufficient for text reconstruction from non-invasive recordings, opening new possibilities for assistive brain-computer interfaces and cognitive neuroscience research.
Benchmark Performance
Evaluated on ZuCo Task 1 (Normal Reading) with held-out subjects.
Comparison with Prior Work
Research Methods
A rigorous three-stage pipeline from data collection to model evaluation.
Data: ZuCo Dataset
12 subjects reading 400+ natural English sentences from Wikipedia and movie reviews. 105 EEG channels sampled at 500 Hz. Word-level timing annotations.
Preprocessing Pipeline
Bandpass filter 0.5–100 Hz, 50 Hz notch, ICA artifact removal. Word-aligned epoch extraction (−200 ms to +800 ms). Z-score normalization per subject per channel.
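The filtering, epoching, and normalization steps above can be sketched with SciPy. This is a hedged approximation under stated assumptions: the ICA artifact-removal step is omitted, filter orders and the notch Q factor are illustrative choices, and the actual pipeline may differ:

```python
import numpy as np
from scipy.signal import butter, iirnotch, filtfilt

FS = 500  # ZuCo sampling rate in Hz

def preprocess(raw: np.ndarray) -> np.ndarray:
    """raw: (channels, samples). Bandpass 0.5-100 Hz, 50 Hz notch,
    then per-channel z-score. (ICA artifact removal omitted here.)"""
    b, a = butter(4, [0.5, 100], btype="bandpass", fs=FS)
    x = filtfilt(b, a, raw, axis=-1)
    bn, an = iirnotch(50, Q=30, fs=FS)          # 50 Hz line-noise notch
    x = filtfilt(bn, an, x, axis=-1)
    # Z-score each channel (per-subject normalization would loop over subjects)
    return (x - x.mean(-1, keepdims=True)) / (x.std(-1, keepdims=True) + 1e-8)

def epoch(signal: np.ndarray, onset_s: float) -> np.ndarray:
    """Extract a -200 ms to +800 ms window around a word onset (1.0 s = 500 samples)."""
    start = int((onset_s - 0.2) * FS)
    return signal[:, start : start + FS]
```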
Model Architecture
End-to-end transformer with 6-layer EEG encoder (d=512, 8 heads), cross-attention bridge, and 6-layer autoregressive text decoder with 50K vocabulary.
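The architecture described above can be sketched with generic PyTorch modules. This is a hedged stand-in: the actual system decodes with a fine-tuned BART model, whereas this sketch substitutes a plain `nn.TransformerDecoder`; all class and variable names are illustrative:

```python
import torch
import torch.nn as nn

class NESTSketch(nn.Module):
    """Sketch of the described stack: linear projection of 840-dim word
    features, 6-layer encoder (d=512, 8 heads), and a 6-layer autoregressive
    decoder with cross-attention over the EEG states and a 50K vocabulary."""
    def __init__(self, vocab_size: int = 50_000, d: int = 512):
        super().__init__()
        self.proj = nn.Linear(840, d)
        enc_layer = nn.TransformerEncoderLayer(d, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=6)
        self.embed = nn.Embedding(vocab_size, d)
        dec_layer = nn.TransformerDecoderLayer(d, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=6)
        self.lm_head = nn.Linear(d, vocab_size)

    def forward(self, eeg: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        # eeg: (batch, n_words, 840); tokens: (batch, seq_len)
        memory = self.encoder(self.proj(eeg))            # EEG states for cross-attention
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        h = self.decoder(self.embed(tokens), memory, tgt_mask=mask)
        return self.lm_head(h)                           # (batch, seq_len, vocab)
```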
Training Procedure
Adam optimizer, learning rate 1e-4 with cosine schedule. Teacher forcing during training. Beam search (width=5) at inference. 80/10/10 train/val/test split.
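A single training step under the recipe above can be sketched as follows. Names are illustrative, the beam-search inference step is omitted, and `model` stands for any EEG-to-text module with a `(eeg, tokens)` forward signature:

```python
import torch
import torch.nn as nn

def make_optimizer(model: nn.Module, total_steps: int):
    """Adam at 1e-4 with a cosine learning-rate schedule, as described."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=total_steps)
    return opt, sched

def train_step(model, opt, sched, eeg, tokens, pad_id=0):
    # Teacher forcing: the decoder sees gold tokens[:, :-1] and is trained
    # to predict the next token tokens[:, 1:].
    logits = model(eeg, tokens[:, :-1])
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        tokens[:, 1:].reshape(-1),
        ignore_index=pad_id,
    )
    opt.zero_grad()
    loss.backward()
    opt.step()
    sched.step()
    return loss.item()
```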
Evaluation Metrics
Word Error Rate (WER), BLEU-1/2/3/4, character-level accuracy, and ROUGE-L. All reported on held-out test subjects unseen during training.
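Word Error Rate, the primary metric listed above, is the word-level edit distance divided by the reference length. A generic single-row Levenshtein implementation (not the project's evaluation code) looks like this:

```python
def wer(ref, hyp):
    """Word Error Rate: edit distance between word lists ref and hyp,
    normalized by the reference length."""
    # d[j] holds the edit distance between the first i ref words
    # and the first j hyp words, updated row by row.
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            prev, d[j] = d[j], min(d[j] + 1,        # deletion
                                   d[j - 1] + 1,    # insertion
                                   prev + (r != h)) # substitution / match
    return d[-1] / len(ref)

print(wer("the cat sat".split(), "the cat sat".split()))  # 0.0
```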
Ablation Studies
Encoder depth, attention heads, training data size, preprocessing choices, and cross-attention vs. simple projection. Full ablation table in Appendix B.
Cite This Work
Read the Full Paper
The full paper includes detailed ablation studies, attention visualizations, and cross-subject generalization analysis.