Improving the Intelligibility of Dysarthric Speech
Contact
Members
Hemanth Venkateswara
Hemanth Venkateswara
Speech is a complicated process with many potential breakpoints. Speech that is less intelligible due to a neurological disorder is referred to as dysarthric speech. Individuals with dysarthric speech generally find it difficult to communicate with unfamiliar communication partners and are often not understood by automatic speech recognition (ASR) systems. The speech of individuals with dysarthria is highly variable: it may be slurred; have a nasal, strained, or hoarse vocal quality; and vary in tempo, rhythm, or volume. This wide range of symptoms, together with datasets that are an order of magnitude smaller than standard speech corpora, makes recognizing and understanding dysarthric speech a challenging problem. Most research in dysarthric speech recognition has focused on building robust systems to recognize dysarthric speech. We are taking a new perspective by treating reduced intelligibility as a speech enhancement problem.
To improve the intelligibility of speech, we are using reinforcement learning, adversarial learning, and audiovisual methods. By enhancing the intelligibility of speech, we aim to make individuals with dysarthric speech more readily understood by both humans and machines. We are also addressing the lack of data by creating more reduced-intelligibility speech data. With more data, we hope to build more robust systems for intelligibility enhancement.
Funding Sources
National Science Foundation Graduate Research Fellowship
Fulton Undergraduate Research Initiative