Title: Towards a neurolinguistic processing transformer using EEG tokenization for smart brain-computer interaction
Abstract:
Transformers, one of the most remarkable machine learning techniques, are widely used across artificial intelligence fields owing to their ability to handle long-term dependencies. In this study, our objective is to develop an electroencephalogram (EEG) transformer that classifies actual speech, speech imagery, and rest states using only EEG signals, and to perform part-of-speech tagging of the decoded sentences. EEG signals were acquired from 64 channels placed according to the international 10-20 system and pre-processed to remove artifacts such as eye blinks and head movements. The EEG transformer tokenizes the signals and randomly masks EEG tokens, which then undergo token embedding and position embedding. Token embedding maps input tokens into a high-dimensional vector space so that each token is represented as a continuous vector; this transformation helps capture semantic relationships between tokens and enables the model to learn token similarity. Position embedding encodes the position of each token in the input sequence, allowing the transformer to understand sequence order and learn temporal dependencies. The pre-training framework jointly performs discriminative and generative self-supervised learning. The discriminative model is trained to predict the ID of masked EEG tokens (e.g., actual speech task (S), speech imagery task (I), or rest state (None)) from context, and predicts the part-of-speech tags of masked EEG tokens by referencing the actual and imagined speech tasks. The generative model fills in the missing data at the masked positions, complementing the incomplete information and thereby aiding the training of the discriminative model.
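The tokenization, random masking, and token-plus-position embedding pipeline described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the window length, embedding dimension, mask ratio, and the use of a linear projection with sinusoidal position embeddings are all assumptions for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: 64 EEG channels, fixed-length windows as tokens.
n_channels, n_samples = 64, 1000
window = 100                 # samples per EEG token (assumption)
d_model = 32                 # embedding dimension (assumption)
mask_ratio = 0.15            # fraction of tokens masked, BERT-style (assumption)

eeg = rng.standard_normal((n_channels, n_samples))  # stand-in EEG recording

# 1) Tokenize: split the multichannel signal into fixed-length windows.
n_tokens = n_samples // window
tokens = eeg[:, :n_tokens * window].reshape(n_channels, n_tokens, window)
tokens = tokens.transpose(1, 0, 2).reshape(n_tokens, -1)  # (tokens, channels*window)

# 2) Randomly mask a subset of tokens (zeroed here for illustration).
mask = rng.random(n_tokens) < mask_ratio
masked_tokens = tokens.copy()
masked_tokens[mask] = 0.0

# 3) Token embedding: project each token into a d_model-dim vector space.
W_embed = rng.standard_normal((tokens.shape[1], d_model)) * 0.02
token_emb = masked_tokens @ W_embed            # (n_tokens, d_model)

# 4) Sinusoidal position embedding, as in the original transformer.
pos = np.arange(n_tokens)[:, None]
i = np.arange(d_model)[None, :]
angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
pos_emb = np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

# The transformer encoder receives token and position information summed.
x = token_emb + pos_emb                        # (n_tokens, d_model)
```

The sum in the last step is what lets the encoder see both what each token is and where it sits in the sequence, which is the temporal-dependency mechanism the abstract refers to.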
Through this framework, part-of-speech tagging of EEG speech segments is performed, allowing actual speech, imagined speech, and rest states to be distinguished. To the best of our knowledge, this is the first token-based self-supervised learning framework for speech imagery decoding from EEG. The tagging results can be used to improve the efficiency and accuracy of high-level semantic decoding of human speech intention.
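The joint objective sketched below illustrates how discrimination and generation can be trained together at masked positions. It is a toy sketch under stated assumptions, not the paper's method: the encoder is replaced by random features, the discriminative head is a softmax over the three task IDs (S, I, None), the generative head is a mean-squared-error reconstruction, and the two losses are simply summed.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: each EEG token carries a task-ID label,
# 0 = actual speech (S), 1 = speech imagery (I), 2 = rest (None).
n_tokens, d_model, n_classes = 10, 32, 3
labels = rng.integers(0, n_classes, size=n_tokens)
mask = rng.random(n_tokens) < 0.3
mask[0] = True               # ensure at least one masked position in this demo

# Stand-in for contextual features produced by the transformer encoder.
h = rng.standard_normal((n_tokens, d_model))

# Discriminative head: logits over the three task IDs.
W_cls = rng.standard_normal((d_model, n_classes)) * 0.02
logits = h @ W_cls

def cross_entropy(logits, labels):
    """Mean negative log-likelihood of the true labels."""
    z = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# Discriminative loss only at masked positions, as in masked modelling.
disc_loss = cross_entropy(logits[mask], labels[mask])

# Generative head: reconstruct the masked token content (MSE stand-in).
targets = rng.standard_normal((n_tokens, d_model))   # stand-in EEG tokens
recon = h                                            # identity stand-in
gen_loss = ((recon[mask] - targets[mask]) ** 2).mean()

# Joint pre-training objective: generation complements discrimination.
total_loss = disc_loss + gen_loss
```

Restricting both losses to masked positions is the key design choice: the generative term supplies a training signal for filling in missing EEG content, which in turn gives the discriminative term richer context for predicting token IDs and part-of-speech tags.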
Audience Takeaway Notes:
- Introduction of an intuitive speech imagery decoding strategy
- Novel possibilities for understanding and interpreting EEG tokenization
- Necessity of bridging the gap between neural activity and semantic understanding