Introduction

Soma is creating personalized timelines for brain health, empowering individuals to enhance their cognitive performance or manage the challenges of brain disorders. Our technology is built on state-of-the-art multimodal AI digital biomarkers, starting with the human voice.

Multimodal Audio Data for Brain Health

Every day, individuals generate a large amount of “audiomic” information through basic communication [1]. This data contains rich biomarkers of health and wellness. Audiomic data consists of three main channels [2]:

Language: word choice, grammar, and semantic content. Health example: changes in vocabulary complexity could indicate cognitive decline associated with dementia.
Speech: rhythm, rate, and fluency of verbal expression. Health example: slurred speech patterns may signal stroke symptoms.
Voice: acoustic properties such as pitch, volume, and timbre. Health example: changes in vocal tremor can indicate early signs of Parkinson's disease.
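
To make these channels concrete, the sketch below computes one illustrative feature per channel from a recording and its transcript: a type-token ratio for language, words per minute for speech, and pitch statistics for voice. It is a minimal illustration using the open-source librosa library; the feature choices and the audiomic_features helper are our own simplifications, not Soma's pipeline.

```python
import numpy as np
import librosa  # open-source audio analysis library


def audiomic_features(wav_path: str, transcript: str) -> dict:
    """Compute one illustrative feature per audiomic channel."""
    y, sr = librosa.load(wav_path, sr=None)
    duration_s = len(y) / sr

    # Language channel: vocabulary richness as a type-token ratio.
    tokens = transcript.lower().split()
    type_token_ratio = len(set(tokens)) / max(len(tokens), 1)

    # Speech channel: speaking rate in words per minute.
    words_per_minute = 60.0 * len(tokens) / max(duration_s, 1e-6)

    # Voice channel: pitch level and variability from a YIN f0 track.
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)
    pitch_mean_hz = float(np.mean(f0))
    pitch_cv = float(np.std(f0) / pitch_mean_hz)  # coefficient of variation

    return {
        "type_token_ratio": type_token_ratio,
        "words_per_minute": words_per_minute,
        "pitch_mean_hz": pitch_mean_hz,
        "pitch_cv": pitch_cv,
    }
```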

Scientific Foundation

Through brief assessments, often involving simple exercises such as reading short sentences, audiomic data has enabled AI-powered screening across multiple health domains.

Moreover, an extensive scientific literature supports the use of “audiomic” AI models in neurological care, including diagnosing neurodegenerative disorders, monitoring stroke rehabilitation progress, and tracking ALS progression [8-10]. Many of these AI models have been built within only one of the main categories of audio data: voice, speech, or language. For example, a manuscript in Scientific Reports introduced a new AI model that diagnosed Parkinson's disease by detecting changes in voice stability while the patient sustained an “ah” sound for an extended period [11]. In the speech domain, a recent publication in PLOS ONE trained an AI model on recorded telephone conversations to identify risk factors for dementia [12]. AI methods have also been used extensively to diagnose mental health conditions, including a pre-trained model that detects depression from the sound of the voice and a multimodal approach for evaluating schizophrenia [13, 14].
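
As a rough illustration of the voice-stability idea behind models like the one in [11], the sketch below computes a jitter-like statistic from a sustained “ah” recording: the average frame-to-frame change in fundamental frequency relative to its mean. The sustained_vowel_stability helper is an assumption for illustration, not the published model.

```python
import numpy as np
import librosa


def sustained_vowel_stability(wav_path: str) -> dict:
    """Estimate phonation stability from a sustained "ah" recording."""
    y, sr = librosa.load(wav_path, sr=None)
    # Track the fundamental frequency across the sustained phonation.
    f0 = librosa.yin(y, fmin=75, fmax=300, sr=sr)
    # Jitter-like statistic: mean absolute frame-to-frame f0 change,
    # normalized by mean f0. Higher values suggest less stable voicing.
    relative_jitter = float(np.mean(np.abs(np.diff(f0))) / np.mean(f0))
    return {"mean_f0_hz": float(np.mean(f0)), "relative_jitter": relative_jitter}
```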

Soma Technology

Traditional AI models have been limited in their ability to process and understand longitudinal health data captured through voice. Advanced AI technologies, particularly large language models, now enable Soma to combine language, speech, and voice data into patient-specific generative AI systems for brain health that are built up over time.
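
As a hedged sketch of what “built up over time” could look like in practice, the snippet below turns dated diary entries into a text history and assembles a patient-specific prompt for a large language model. The DiaryEntry schema and the prompt wording are hypothetical, not Soma's actual architecture.

```python
from dataclasses import dataclass


@dataclass
class DiaryEntry:
    date: str                # ISO date of the recording, e.g. "2024-05-01"
    transcript: str          # what the user said
    words_per_minute: float  # speech-channel feature
    pitch_cv: float          # voice-channel feature (pitch variability)


def build_longitudinal_prompt(entries: list[DiaryEntry]) -> str:
    """Render dated diary entries as a text history for a language model."""
    history = "\n".join(
        f"{e.date}: {e.words_per_minute:.0f} wpm, "
        f"pitch variability {e.pitch_cv:.3f}. Notes: {e.transcript[:80]}"
        for e in sorted(entries, key=lambda e: e.date)
    )
    return (
        "You are a brain-health assistant. Given this patient's audio-diary "
        "history, describe longitudinal trends in speech and voice:\n" + history
    )
```
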

This novel technology is centered around the concept of an audio diary. Soma users answer brief questions about their health via voice recordings and complete short voice/speech tasks like reading sentences or saying vowel sounds for 30-seconds. With this rich multimodal information, which has been shown to significantly outperform traditional data collection methods, Soma AI can perform a variety of tasks, including: