7 - Neuromagnetic Patterns Of Imagined and Overt Speech
Thursday, May 1, 2025
1:00 PM - 2:00 PM CST
Location: Stars at Night Ballroom 2-3 Foyer
Disclosure(s):
Keerthana Stanley: No financial relationships to disclose
Abstract: Understanding the neural patterns underlying speech is crucial for improving the efficacy of speech-assistance technology, such as speech brain-computer interfaces (speech-BCIs). This study analyzes neuromagnetic signals recorded via magnetoencephalography (MEG) from healthy adult English speakers during imagined and overt speech tasks. Brainwave activity across several frequency bands (e.g., delta) was compared between the two speech tasks. Additionally, a support vector machine (SVM) classifier was applied to distinguish between the two tasks at the sample level. Results indicate that the differences between imagined and overt speech are significant in the delta band (p < 0.05). The machine learning classification yielded an accuracy of 91.88%, further indicating that the two tasks are distinguishable.
Description: Introduction
Speech production is a physiologically and neurologically complex process, allowing humans to transform otherwise random sounds into a highly effective means of communication (Dash et al., 2020). There are different modalities of speech: overt speech (speaking ‘out loud’, where the speaker both articulates and produces acoustical output), silent articulated speech (the speaker articulates words but without any acoustical output), and imagined speech (the speaker has no articulation or acoustical output, but rather, imagines words or phrases). Neurodegenerative conditions such as amyotrophic lateral sclerosis (ALS) can result in ‘locked-in syndrome,’ in which a patient’s cognitive functions are fully intact but motor function is completely lost. This means that while articulated speech is no longer possible for these patients, they retain the capacity for complex language in the form of imagined speech. These patients rely on assistive speech technology such as a brain-computer interface (BCI) to continue communicating with others. However, while speech-BCI technology has made immense progress over the years, much of the current research has focused on decoding overt or silently articulated speech, rather than fully imagined speech (Willett et al., 2023). Therefore, a deeper understanding of imagined speech and how it differs from overt speech is crucial. This study investigates these differences in healthy English speakers using magnetoencephalography (MEG), a non-invasive neuroimaging technique, to explore the cognitive activity underlying these two speech modalities.
Methods
Six healthy English-speaking participants (4 males and 2 females; ages 33 to 68 years) took part in this study. Data was acquired at Dell Children’s Medical Center, Austin, TX, using an Elekta Neuromag Triux MEG system, equipped with 204 planar gradiometer sensors and 102 magnetometer sensors. The MEG machine was housed in a magnetically shielded room (MSR) to attenuate external interference and ensure accurate signal recordings.
Each trial of the experiment was composed of four task segments: baseline, perception, imagined speech, and overt speech. The protocol was as follows. Participants sat comfortably within the MSR while MEG data was recorded, facing a blank screen. Initially, no stimulus was presented, allowing for baseline recordings of brain activity. Next, a phrase (e.g., How are you?) appeared on the screen and participants were given time to silently view the phrase (the perception segment). Following this, the phrase was replaced by fixation crosses (+), and participants were instructed to imagine how they would say the phrase (the imagined speech segment). Finally, the fixation crosses disappeared, and participants were instructed to verbally articulate the phrase out loud (the overt speech segment).
Of the six speakers, four completed a protocol in which they repeated five unique phrases 100 times each. Each trial lasted approximately 4.5 seconds (0.5 seconds for baseline, 1 second for perception, 1 second for imagined speech, and 2 seconds for overt speech), resulting in 500 total trials per participant (Dash et al., 2020). The remaining two participants followed a protocol in which they produced 400 unique, phonetically balanced sentences (Kalikow et al., 1977). Each trial lasted about 16 seconds (4 seconds each for baseline, perception, imagined speech, and overt speech), yielding 400 total trials per participant.
The neuromagnetic signal recordings from the MEG gradiometer sensors during both imagined and articulated speech tasks were filtered and epoched using MNE-Python. The preliminary analysis was focused on using average band power as a measure of brain activity across the two speech tasks. The frequency bands analyzed were as follows: delta band (1 - 4 Hz), theta band (4 - 8 Hz), alpha band (8 - 13 Hz), beta band (13 - 30 Hz), gamma band (30 - 61 Hz), and high gamma band (61 - 119 Hz).
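The band-power measure described above can be illustrated with a minimal sketch. The actual pipeline used MNE-Python and FieldTrip on MEG recordings; this example instead uses SciPy's Welch periodogram on a synthetic one-channel signal, with the same band boundaries as the study. The signal, sampling rate, and helper function are illustrative assumptions, not the study's code.

```python
import numpy as np
from scipy.signal import welch

# Frequency bands from the analysis (Hz)
BANDS = {
    "delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
    "beta": (13, 30), "gamma": (30, 61), "high_gamma": (61, 119),
}

def band_powers(signal, sfreq):
    """Average spectral power within each band (Welch's method)."""
    freqs, psd = welch(signal, fs=sfreq, nperseg=int(sfreq))
    return {name: psd[(freqs >= lo) & (freqs < hi)].mean()
            for name, (lo, hi) in BANDS.items()}

# Synthetic 2-second "epoch" at 1000 Hz dominated by a 3 Hz (delta) component
sfreq = 1000
t = np.arange(0, 2, 1 / sfreq)
rng = np.random.default_rng(0)
epoch = np.sin(2 * np.pi * 3 * t) + 0.1 * rng.standard_normal(t.size)

powers = band_powers(epoch, sfreq)
print(max(powers, key=powers.get))  # delta dominates, as constructed
```

In the study, this computation would be repeated per gradiometer sensor and per task segment before averaging across trials.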
Using MATLAB and the FieldTrip toolbox (Oostenveld et al., 2011), the power for each frequency band was extracted and averaged across trials. The average band power over time, for both imagined and overt speech, was then visualized on a topographic map of the MEG sensors. Additionally, the difference between the overt and imagined speech band powers was calculated and presented on a separate topographic map.
Machine learning classification methods were applied to determine how distinguishable the two modalities (i.e., speech tasks) were based solely on the magnetic signals. A support vector machine (SVM) classification model was trained on data from both imagined and overt speech, and then evaluated on its ability to classify held-out imagined and overt speech trials. A 10-fold cross-validation strategy was used, in which the dataset was divided into 10 partitions: one partition was used for testing while the remaining nine were used for training. This process was repeated across all folds, and the final accuracy was computed as the average across the 10 folds.
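The cross-validation procedure can be sketched as follows. This is not the study's code: the features, class separation, and use of scikit-learn are assumptions for illustration, with synthetic data standing in for per-sample MEG features.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for per-sample MEG features: two classes
# ("imagined" vs. "overt") with shifted means.
rng = np.random.default_rng(42)
n_per_class, n_features = 200, 20
X = np.vstack([
    rng.standard_normal((n_per_class, n_features)),        # "imagined"
    rng.standard_normal((n_per_class, n_features)) + 0.8,  # "overt"
])
y = np.array([0] * n_per_class + [1] * n_per_class)

# SVM with feature standardization; accuracy averaged over 10 folds,
# mirroring the 10-fold cross-validation described above.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
scores = cross_val_score(clf, X, y, cv=10)
accuracy = scores.mean()
print(f"10-fold CV accuracy: {accuracy:.3f}")
```

Averaging over folds, rather than reporting a single train/test split, gives a less optimistic and more stable accuracy estimate.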
The significance of the band power differences between imagined and articulated speech was calculated using the Wilcoxon signed-rank test. After applying the Bonferroni multiple-comparison correction, significance for band power differences was set at p < 0.05.
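The per-sensor statistical test can be sketched as below. The data are synthetic and the sensor count is taken from the system description (204 gradiometers); the Bonferroni correction is expressed here as dividing the alpha level by the number of tests, which is equivalent to multiplying the p-values.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(1)
n_trials, n_sensors = 100, 204  # paired band-power values per sensor
alpha = 0.05

# Synthetic paired band powers: overt power shifted upward at every sensor
imagined = rng.random((n_trials, n_sensors))
overt = imagined + 0.3 + 0.05 * rng.standard_normal((n_trials, n_sensors))

# One Wilcoxon signed-rank test per sensor, Bonferroni-corrected threshold
p_values = np.array([wilcoxon(overt[:, s], imagined[:, s]).pvalue
                     for s in range(n_sensors)])
corrected_alpha = alpha / n_sensors
n_significant = int((p_values < corrected_alpha).sum())
print(f"{n_significant} of {n_sensors} sensors significant")
```

The Wilcoxon signed-rank test is a sensible choice here because it handles paired observations (imagined vs. overt power from the same trials) without assuming normality of the band-power differences.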
Results and Discussion
The preliminary results from four participants show significant differences in average brainwave band power between imagined and articulated speech. The delta band consistently showed the most significant differences between the two modalities (p < 0.05 across all 204 gradiometers), particularly in three participants. Additionally, preliminary results in one participant revealed that the SVM model achieved a classification accuracy of 91.88% in distinguishing between imagined and overt speech, consistent with the band power findings. These results are promising for the improvement of speech-BCI technology, as they suggest that the characteristics of imagined speech can potentially be quantified and classified.
Further data analysis, including band power and machine learning classification for additional participants, is ongoing, with more data collection planned. We anticipate completing the data analysis before the conference.
Supporting Research: Reference 1: Dash, D., Ferrari, P., & Wang, J. (2020). Decoding imagined and spoken phrases from non-invasive neural (MEG) signals. Frontiers in Neuroscience, 14. https://doi.org/10.3389/fnins.2020.00290
Supporting Research: Reference 2: Kalikow, D. N., Stevens, K. N., & Elliott, L. L. (1977). Development of a test of speech intelligibility in noise using sentence materials with controlled word predictability. The Journal of the Acoustical Society of America, 61(5), 1337–1351. https://doi.org/10.1121/1.381436
(This is the source of the 400 phonetically balanced sentences used in the second protocol.)
Supporting Research: Reference 3: Oostenveld, R., Fries, P., Maris, E., & Schoffelen, J.-M. (2011). FieldTrip: Open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Computational Intelligence and Neuroscience, 2011, 1–9. https://doi.org/10.1155/2011/156869
(This is the source for the FieldTrip toolbox used to create the topographic maps of the data.)
Supporting Research: Reference 4: Willett, F. R., Kunz, E. M., Fan, C., Avansino, D. T., Wilson, G. H., Choi, E. Y., Kamdar, F., Glasser, M. F., Hochberg, L. R., Druckmann, S., Shenoy, K. V., & Henderson, J. M. (2023). A high-performance speech neuroprosthesis. Nature, 620(7976), 1031–1036. https://doi.org/10.1038/s41586-023-06377-x
Learning Objectives:
As a result of this presentation, the participant will be able to... Explain the gap in current speech-BCI research
As a result of this presentation, the participant will be able to... Explain the cognitive processes underlying overt and imagined speech
As a result of this presentation, the participant will be able to... Identify the relevance of neuroimaging research (i.e., magnetoencephalography) in the improvement of speech-BCI technology