Title: Towards brain-computer interface based communication tool utilizing EEG signals integrated with large language model
Abstract:
Brain-computer interface (BCI) is a research field that involves analyzing and classifying signals from brain activity using machine learning or deep learning and applying the results to various applications. Speech imagery decoding involves analyzing brain waves induced by the user’s imagined speech, without the user actually speaking, and applying them to communication tools. In traditional speech imagery decoding, brain activity related to a limited set of tasks was collected and classified. However, recent efforts aim to combine brain activity collected by methods such as functional magnetic resonance imaging (fMRI) and electrocorticography (ECoG) with a large language model (LLM) for sentence-level decoding. In this study, we aim to classify speech imagery brainwave data collected through electroencephalography (EEG) to determine the category of meaning conveyed by these data. The classification results are then combined with a language model to generate sentences based only on brain waves. EEG data are collected as follows: participants freely generate sentences related to a given keyword (e.g., for meals and cooking: “What should I eat for dinner?”). They then imagine speaking those sentences while the corresponding EEG is recorded. The collected EEG data undergo preprocessing, such as noise removal and selection of temporal lobe channels. They are then classified using a deep learning classifier such as EEGNet to determine which keyword corresponds to the imagined speech. The extracted keyword is then presented as a prompt to an LLM such as a generative pre-trained transformer (GPT) for sentence generation. Combining EEG signals with an LLM offers a cost-effective alternative to fMRI or ECoG, making the approach applicable to an intuitive BCI communication tool. This combination has not yet been explored in the speech imagery decoding paradigm.
To achieve more accurate and specific keyword extraction from EEG, an effective deep learning architecture should be designed. The extracted keywords are then used to tune the LLM to generate sentences that closely align with the speaker’s intent, allowing for a higher level of speech imagery decoding.
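The pipeline described in the abstract (channel selection, noise removal, keyword classification, prompt construction) can be outlined in a few lines of Python. This is only a minimal sketch under stated assumptions, not the authors' implementation: the channel names, the keyword list, the prompt wording, and every function name are hypothetical, mean subtraction stands in for real noise removal, and the classifier is passed in as a plain callable where a trained model such as EEGNet would sit.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical electrode names; the abstract only says "temporal lobe channels".
TEMPORAL_CHANNELS = ("T7", "T8", "TP7", "TP8", "FT7", "FT8")

# Hypothetical keyword set; the abstract gives only "meals and cooking" as an example.
KEYWORDS = ("meals and cooking", "weather", "health")

# One trial: channel name -> list of samples.
Trial = Dict[str, List[float]]


def select_channels(trial: Trial, wanted: Sequence[str] = TEMPORAL_CHANNELS) -> Trial:
    """Keep only the channels of interest that are present in the recording."""
    return {name: samples for name, samples in trial.items() if name in wanted}


def remove_dc_offset(trial: Trial) -> Trial:
    """Subtract each channel's mean; a crude stand-in for real noise removal."""
    cleaned: Trial = {}
    for name, samples in trial.items():
        mean = sum(samples) / len(samples) if samples else 0.0
        cleaned[name] = [s - mean for s in samples]
    return cleaned


def build_prompt(keyword: str) -> str:
    """Turn a decoded keyword into a text prompt for a language model."""
    return f"Write one short first-person sentence a person might say about: {keyword}."


def decode_trial(trial: Trial, classify: Callable[[Trial], int]) -> str:
    """Preprocess one trial, classify it into a keyword, and return the LLM prompt.

    `classify` is a placeholder for a trained classifier (e.g. EEGNet) that
    maps preprocessed EEG to an index into KEYWORDS.
    """
    features = remove_dc_offset(select_channels(trial))
    keyword = KEYWORDS[classify(features)]
    return build_prompt(keyword)


if __name__ == "__main__":
    # Toy trial with one temporal and one non-temporal channel.
    toy_trial = {"T7": [1.0, 3.0], "Cz": [5.0, 5.0]}
    # Dummy classifier that always predicts the first keyword.
    print(decode_trial(toy_trial, lambda features: 0))
```

The returned prompt string would then be sent to whichever LLM is used. Keeping the classifier as an injected callable separates the EEG decoding step from the sentence generation step, so either can be replaced independently.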
Audience Take Away Notes:
- Introduction of a new speech imagery decoding paradigm in BCI
- Benefits of high-level speech imagery decoding
- Necessity of integrating an LLM into BCI communication