Abstract
In the clinical practice of dysphonia, the effects of treatment are traditionally monitored by a sequence of auditory-perceptual assessments aimed at measuring vocal quality for the patient. Alternatively, acoustic measurement of vocal quality promises to automate perceptual assessments while keeping the assessments accurate and non-invasive. However, acoustic measures of vocal quality need to be further developed in both functional and technical terms. On the one hand, many of them are susceptible to non-dysphonic perturbations from articulatory movements in continuous speech, while on the other, their accuracy in approximating the generally nonlinear mapping from observation to vocal quality is limited by their use of a linear model. This paper presents an acoustic measure of vocal strain, a specific vocal quality that typically co-occurs with the development of vocal-fold nodules in vocal hyper-function. Vocal strain merits acoustic measurement more than other vocal qualities because its perceptual assessment typically exhibits a lower intra- and inter-rater reliability than the assessment of other vocal qualities. Based on an assumed correlation between vocal strain and the degree of periodicity in vocal-fold vibrations, this paper presents an acoustic measure in which a nonlinear regression model is used to predict the strain from some periodicity features extracted from a glottal airflow estimate. When tested on a set of listener-rated utterances composed mostly of continuous speech, the proposed glottal measure outperformed a direct-analysis measure in producing strain assessments which are consistent with perceptual ratings.
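As a rough illustration of what a periodicity feature could look like, the sketch below computes the peak normalized autocorrelation of a signal frame over a range of candidate pitch lags. It is not the paper's feature set or its regression model: the function name `periodicity`, the lag range, the sampling rate, and the synthetic test signals are all assumptions made for illustration, and in the setting described above the frame would come from a glottal airflow estimate rather than from a synthetic tone.

```python
import math
import random


def periodicity(frame, min_lag, max_lag):
    """Peak normalized autocorrelation over lags in [min_lag, max_lag].

    Values near 1 suggest a strongly periodic frame; values near 0
    suggest an aperiodic one. (Hypothetical helper, not from the paper.)
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [v - mean for v in frame]  # remove the DC offset
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        a = x[:n - lag]   # frame without its last `lag` samples
        b = x[lag:]       # frame shifted by `lag` samples
        num = sum(p * q for p, q in zip(a, b))
        den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
        if den > 0:
            best = max(best, num / den)
    return best


# Assumed test signals: a 100 Hz tone and white noise, both at 8 kHz.
fs, f0, n = 8000, 100, 800
tone = [math.sin(2 * math.pi * f0 * t / fs) for t in range(n)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Lags 40..160 correspond to fundamental frequencies of 200 Hz down to 50 Hz.
p_tone = periodicity(tone, 40, 160)
p_noise = periodicity(noise, 40, 160)
print(f"tone: {p_tone:.3f}, noise: {p_noise:.3f}")
```

A feature of this kind would be one input among several to a regression model; the choice of model and of the remaining features is left open here.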
| Original language | English |
|---|---|
| Article number | 9291477 |
| Pages (from-to) | 563-574 |
| Number of pages | 12 |
| Journal | IEEE/ACM Transactions on Audio Speech and Language Processing |
| Volume | 29 |
| DOIs | 10.1109/TASLP.2020.3044168 |
| Publication status | Published - 11 Dec 2020 |
Bibliographical note
Funding Information: Manuscript received January 18, 2020; revised August 14, 2020 and November 2, 2020; accepted December 1, 2020. Date of publication December 11, 2020; date of current version January 6, 2021. This work was supported in part by the Icelandic Centre for Research under Grant 152705-051. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Isabel Barbancho. (Corresponding author: Yu-Ren Chien.) The authors are with the Center for Analysis and Design of Intelligent Agents, Reykjavik University, 101 Reykjavík, Iceland (e-mail: [email protected]; [email protected]). Digital Object Identifier 10.1109/TASLP.2020.3044168 Publisher Copyright: © 2014 IEEE.
Other keywords
- Vocal strain
- glottal airflow estimation
- glottal inverse filtering
- objective assessment