Voice-Evoked Color Prediction Using Deep Neural Networks in Sound–Color Synesthesia
Author | Affiliation
---|---
Bartulienė, Raminta |
Date | Volume | Issue | Start Page | End Page
---|---|---|---|---
2025-05-19 | 15 | 5 | 1 | 22
Article No. 520
This article belongs to the Special Issue Perceptual Learning and Cortical Plasticity
Background/Objectives: Synesthesia is an unusual neurological condition in which stimulation of one sensory modality automatically triggers a sensation in a second, unstimulated modality. In this study, we investigated a case of sound–color synesthesia in a female subject with impaired vision. After confirming the synesthesia, we aimed to determine which sound features played a key role in the subject’s sound perception and evoked colors. Methods: We applied deep neural networks, with binary logistic regression as a benchmark, to classify the blue and pink synesthetically voice-evoked color classes using 136 voice features extracted from eight study participants’ voice recordings. Results: The minimum Redundancy Maximum Relevance (mRMR) algorithm was applied to select the 20 most relevant voice features. A recognition accuracy of 0.81 was achieved with only five features, and the best results were obtained using the seventeen most informative features. The deep neural network classified previously unseen voice recordings with 0.84 accuracy, 0.81 specificity, 0.86 sensitivity, and F1-scores of 0.85 and 0.81 for the blue and pink classes, respectively. The machine learning algorithms revealed that voice parameters such as Mel-frequency cepstral coefficients, chroma vectors, and sound energy play the most significant role. Conclusions: Our results suggest that the pitch, tone, and energy of a person’s voice affect the color perceived.
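The mRMR selection step named in the abstract can be sketched as follows. This is an illustrative reimplementation on synthetic data, not the authors' code: it substitutes absolute Pearson correlation for the mutual-information criterion (a common proxy for continuous features), and the three toy "voice features" and the class labels are invented for the demonstration. The greedy rule is the one mRMR uses: at each step, add the candidate feature that maximizes relevance to the label minus mean redundancy with the features already selected.

```python
import numpy as np

def mrmr_select(X, y, k):
    """Greedy minimum-Redundancy Maximum-Relevance feature selection.

    Relevance and redundancy are approximated with absolute Pearson
    correlations rather than mutual information (an assumption of this
    sketch, adequate for roughly monotone feature-label relationships).
    """
    n_features = X.shape[1]
    # Relevance: |corr| between each feature and the binary label.
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    selected = [int(np.argmax(relevance))]  # start from the most relevant feature
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            # Redundancy: mean |corr| with the already-selected features.
            redundancy = np.mean(
                [abs(np.corrcoef(X[:, j], X[:, s])[0, 1]) for s in selected]
            )
            score = relevance[j] - redundancy  # mRMR difference criterion
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Synthetic demo: one informative feature, one near-duplicate of it, one noise column.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200).astype(float)          # 0/1 stand-ins for the two color classes
informative = y + 0.3 * rng.normal(size=200)
X = np.column_stack([
    informative,                                   # strongly class-related
    informative + 0.05 * rng.normal(size=200),     # redundant near-copy
    rng.normal(size=200),                          # pure noise
])
print(mrmr_select(X, y, 2))  # the redundant near-copy is passed over in favor of a less redundant feature
```

Note the design point this makes concrete: plain relevance ranking would pick the informative feature and its near-copy, while the redundancy penalty steers the second pick elsewhere, which is why mRMR can reach the reported 0.81 accuracy with only five of the 136 features.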