Assessment of the Performance of an AI-Driven SpeechEnhancer Algorithm for Speech Enhancement Following Laryngeal Oncosurgery

Pribuišis, Kipras; Maskeliūnas, Rytis; Ulozaitė - Stanienė, Nora; Padervinskis, Evaldas; Damaševičius, Robertas; Blažauskas, Tomas; Ulozas, Virgilijus

Use this URL to cite the publication: https://hdl.handle.net/20.500.12512/252613

Type of publication

Article in Web of Science and Scopus database (S1)

Author(s)

Author	Affiliation
Pribuišis, Kipras	Clinic of Ear, Nose and Throat Diseases (U523400)
Maskeliūnas, Rytis	Kaunas University of Technology
Ulozaitė - Stanienė, Nora	Clinic of Ear, Nose and Throat Diseases (U523400)
Padervinskis, Evaldas	Clinic of Ear, Nose and Throat Diseases (U523400)
Damaševičius, Robertas	Kaunas University of Technology
Blažauskas, Tomas	Kaunas University of Technology
Ulozas, Virgilijus	Clinic of Ear, Nose and Throat Diseases (U523400)

Title

Assessment of the Performance of an AI-Driven SpeechEnhancer Algorithm for Speech Enhancement Following Laryngeal Oncosurgery

Publisher (trusted)

Elsevier BV

Is Referenced by

Science Citation Index Expanded (Web of Science)

Scopus

PubMed

Date Issued

Date	Volume	Issue	Start Page	End Page
2025-05-21	00	00	1	13

Is part of

Journal of Voice

Version

Original

Description

In press

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.jvoice.2025.04.026.

Field of Science

Medicine (...

Informatics engineeri...

OECD Classification

Medical and Health sc...

Keywords (en)

Laryngeal cancer

Speech synthesis

Artificial intelligen...

Abstract (en)

Objective. The present study aimed to evaluate the performance of an AI-driven SpeechEnhancer algorithm for speech synthesis following laryngeal oncosurgery.

Methods. The original and synthesized speech samples from 77 patients after laryngeal oncosurgery were evaluated in this study. A panel of four experts conducted the auditory-perceptual speech evaluation using the IINFVo and the Similarity Mean Opinion Score (SMOS) scales. The acoustic analysis of speech samples was performed using the Average Voicing Evidence (AVE), Proportion of Voiced Frames (PVF), Proportion of Voiced Speech Frames (PVS) and Acoustic Substitution Voicing Index (ASVI) measures.

Results. The synthesized speech samples outperformed the original speech in both acoustic and auditory-perceptual evaluation. The mean total IINFVo scores were statistically significantly higher (P < 0.05) for the synthesized speech samples [IINFVo = 5.59 (SD = 0.83)] than for the original speech samples [IINFVo = 4.18 (SD = 1.11)]. The mean SMOS score of 2.42 (SD = 1.19) indicated a modest level of similarity between the synthesized and original speech samples. A statistically significant (P < 0.05) improvement in the acoustic AVE, PVF, and PVS parameters was observed in the synthesized speech samples. The quality of the synthesized speech [ASVI = 19.22 (SD = 7.44)] statistically significantly (P = 0.001) surpassed that of the original substitution voicing speech [ASVI = 9.39 (SD = 4.34)].

Conclusion. The AI-driven "SpeechEnhancer" algorithm is a promising tool for speech rehabilitation after laryngeal oncosurgery. It demonstrates the potential for use in clinical settings by healthcare professionals and patients following laryngeal carcinoma surgery.