American Journal of Speech-Language Pathology | Research Article | 1 Aug 1997

The Voice Handicap Index (VHI)

Development and Validation

    To date, no instruments exist to quantify the psychosocial consequences of voice disorders. The aim of the present investigation was the development of a statistically robust Voice Handicap Index (VHI). An 85-item version of this instrument was administered to 65 consecutive patients seen in the Voice Clinic at Henry Ford Hospital. The data were subjected to measures of internal consistency reliability and the initial 85-item version was reduced to a 30-item final version. This final version was administered to 63 consecutive patients on two occasions in an attempt to assess test-retest stability, which proved to be strong. The findings of the latter analysis demonstrated that a change between two administrations of 18 points represents a significant shift in psychosocial function.

    Measuring the severity of a voice disorder is difficult. Methods have ranged from subjective measures of voice disorder severity including perceptual judgments (e.g., grading of voice quality as mild, moderate, or severe) to objective measures of voice characteristics (e.g., videostroboscopic findings and physiological measures of voice compared to normative data). Although these methods can yield valuable data, they do not provide insight into why patients with similar voice disorders experience differing levels of handicap and disability. For example, a retired man who lives alone and has few social contacts may view his voice disorder (resulting from unilateral vocal fold paralysis) as less handicapping than would a salesperson who has daily contact with customers and two small children at home.

    The terms “disability” and “handicap” have specific definitions. For example, the World Health Organization (1980) defines disability as “a restriction or lack of ability manifested in the performance of daily tasks.” In this connection, voice disability would be the inability to produce a high pitch or to speak loudly. Handicap is defined as, “a social, economic, or environmental disadvantage resulting from an impairment or disability.” Thus, a voice handicap might occur when a person changes jobs because he cannot give presentations as required in his present position due to vocal fatigue.

    Self-perceived disability/handicap measures are ubiquitous in the field of audiology. Several such measures have been developed to evaluate the communicative and psychosocial impact of hearing loss, dizziness, and tinnitus (Jacobson & Newman, 1990; Newman, Jacobson, & Spitzer, 1995; Newman, Jacobson, Weinstein, & Hug, 1990; Newman, Weinstein, Jacobson, & Hug, 1991; Ventry & Weinstein, 1982). These measures have been found to be useful for quantifying functional outcome following medical, surgical, or rehabilitative interventions.

    There are few standardized methods for assessing the psychosocial consequences of voice disorders. Llewellyn-Thomas et al. (1984) developed a linear analogue scale that was an attempt to quantify self-assessment of voice quality and daily functioning for patients with laryngeal cancer. Thirty-four patients who were undergoing radiation treatment rated the severity of their voice symptoms and their ability to communicate in various situations along a 10-cm line. One end of each item’s linear scale was anchored by a statement that indicated no impairment and the other end anchored by a statement that indicated severe impairment. For example, the item “How satisfactory has your ability to use your voice in work/leisure/social-related activities been over the past week?” was anchored by “Entirely satisfactory, i.e., able to carry out all speech-related activities without apparent effort” and “Absolutely unsatisfactory, i.e., all speech-related activities are impossible.” Test-retest reliability for the symptom and function scales ranged from r = 0.56 to r = 0.93. Although this scale was designed for use with a select group of patients, it represents the first attempt to produce a statistically validated instrument for assessment of the functional impact of alteration in voice quality.

    Smith et al. (1994) designed a questionnaire to elicit information from patients regarding the functional impact of voice disorders in various aspects of their lives, the effects of vocal symptoms specifically on employment, symptoms, risk factors, and family history. Data were collected from 113 patients. In an initial analysis, work-related effects for patients with voice disorders were apparent, as were effects on social interaction reported by older patients. This was the first study to evaluate the impact of voice disorders on quality of life dimensions and provided direction and impetus for further study.

    Although there has been acknowledgment in the literature that voice disorders can have a devastating impact on daily functioning and quality of life, there are few instruments that have been developed specifically to address this issue. Accordingly, the purpose of the present investigation was to develop a psychometrically robust voice disability/handicap inventory that could be used with patients exhibiting a variety of voice disorders.

    Investigation 1: Scale Development



    Sixty-five consecutive adult patients seen in the Voice Clinic at Henry Ford Hospital completed the preliminary version of the Voice Handicap Index (VHI) (mean age = 52.3; SD = 16.28; 25 males and 40 females). The subjects were diagnosed with a broad range of voice disorders (Table 1). They were classified into six groups based on diagnoses made jointly by otolaryngologists and speech-language pathologists. Subjects in the mass lesions group had diagnoses such as vocal nodules, vocal polyps, and vocal cysts. Subjects in the neurogenic group had diagnoses such as vocal fold paralysis and spasmodic dysphonia. Subjects in the musculoskeletal tension group had normal-appearing larynges but significant laryngeal area muscle tension without any clear psychological overlay. Subjects in the inflammatory group had acute erythema of the vocal folds. Subjects in the atypical group had normal-appearing larynges and a clear psychogenic etiology with sudden onset for their dysphonias.

    TABLE 1. Diagnoses for patients participating in the development of the VHI.

    Diagnosis                  Number of Patients (%)
    Mass lesions               21 (32)
    Neurogenic                 17 (26)
    Laryngectomy               17 (26)
    Musculoskeletal tension    5 (8)
    Inflammatory               3 (5)
    Atypical                   2 (3)


    An initial pool of 85 items was developed empirically from case history interviews with patients with voice disorders seen in the Voice Clinic over the past 7 years. Patients seen in the Clinic have been diagnosed with a variety of voice disorders, including benign mass lesions, vocal fold paralysis, spasmodic dysphonia, papilloma, musculoskeletal tension disorders, and atypical (“psychogenic”) voice disorders. Some items were created by rewording a previous item so that the meaning was similar. This was done to ensure that items were as clear as possible in the final version. Items were grouped a priori into three content domains representing functional (25 items), emotional (31 items), and physical (29 items) aspects of voice disorders. The 85 items comprising this preliminary version of the VHI were selected from patients’ reports to ensure that the scale had both content and face validity. The functional subscale included statements that described the impact of a person’s voice disorder on his or her daily activities. An example of a probe item from the functional content domain was, “My voice problem causes me to miss work.” The emotional subscale consisted of statements representing a patient’s affective responses to a voice disorder. An example of a probe item from the emotional content domain was, “I feel annoyed when people ask me to repeat.” Items comprising the physical subscale were statements representing self-perceptions of laryngeal discomfort and voice output characteristics (e.g., voice pitched too low or high). An example of a probe item from the physical content domain was, “I feel as though I have to strain to produce voice.”

    Subjects were asked to read each item and circle one of five responses comprising an equal-appearing five-point scale. The scale had the words “never” and “always” anchoring each end and the words “almost never”, “sometimes”, and “almost always” appearing in between. An “always” response was scored 4 points, a “never” response was scored 0, and the remaining options were scored between 1 and 3 points.

    The instructions to the patients were as follows: “These are statements that many people have used to describe their voices and the effects of their voices on their lives. Circle the response that indicates how frequently you have the same experience.”


    The internal consistency reliability of the preliminary version of the VHI was evaluated using Cronbach’s alpha coefficient. Items contained within a scale that have high item-total correlations contribute to the scale’s overall reliability and are more representative of scale content than items with low item-total correlations. The item total correlations ranged from r = 0.17 to r = 0.86. Nunnally (1978) has suggested that Cronbach’s alpha coefficient should be at least r = 0.50 for a single item to demonstrate acceptable internal consistency. Accordingly, the preliminary version was reduced from 85 items to 57 items by eliminating all items with item-total correlations of r < 0.60. We retained an additional four items with item-total correlation coefficients below r = 0.60 because they were judged by the authors to have high face validity. These items were: “The sound of my voice varies throughout the day,” “My voice sounds creaky and dry,” “My voice problem causes me to lose income,” and “I run out of air when I talk.”
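
    The two statistics used in this step can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names are hypothetical, the item-total correlation is computed in its "corrected" form (each item against the sum of the remaining items), which is an assumption because the text does not say which form was used, and the input is any table of 0 to 4 item scores with one row per respondent.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def variance(values):
    """Sample variance (n - 1 denominator)."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** 2 for v in values) / (n - 1)


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def item_total_correlations(rows):
    """Correlate each item with the total of the remaining items."""
    k = len(rows[0])
    return [
        pearson([r[i] for r in rows], [sum(r) - r[i] for r in rows])
        for i in range(k)
    ]


def items_below(rows, threshold=0.60):
    """Indices of items whose item-total correlation falls below threshold."""
    correlations = item_total_correlations(rows)
    return [i for i, r in enumerate(correlations) if r < threshold]
```

    With real data, `items_below(rows)` would list the candidates for removal under the r < 0.60 criterion; items judged to have high face validity could then be kept by hand, as was done for the four items named above.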

    Fifteen more items were eliminated because the observed frequencies of “positive” (i.e., score of “2” to “4”) or “negative” (i.e., score of “0” or “1”) scores differed substantially between men and women. Responses to these items appeared to reflect a dependency on the sex of the patient; therefore, they were removed.

    An additional 16 items were eliminated because they were answered “never” by 50% or more of the subjects or because the item content was found to be redundant (that is, the subject matter was present in similarly worded questions with stronger item-total correlations).

    Through these procedures, the original 85-item preliminary version of the VHI was reduced to a 30-item (120-point total) final version (see Appendix). The final version consisted of a 10-item functional subscale, a 10-item emotional subscale, and a 10-item physical subscale.
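
    The item counts reported for the reduction steps can be checked with simple arithmetic, and the "never" criterion lends itself to a short sketch. The code below is illustrative only: the function names are hypothetical, a score of 0 is taken to mean "never", and the redundancy judgment (which was made by the authors on item content) is not modeled.

```python
def never_rate(rows, item):
    """Fraction of respondents who scored the given item 0 ("never")."""
    return sum(1 for r in rows if r[item] == 0) / len(rows)


def floor_items(rows, cutoff=0.50):
    """Indices of items answered "never" by at least `cutoff` of respondents."""
    k = len(rows[0])
    return [i for i in range(k) if never_rate(rows, i) >= cutoff]


# Bookkeeping for the counts given in the text.
after_item_total = 57 + 4                 # 57 retained, plus 4 kept for face validity
after_sex_screen = after_item_total - 15  # items whose responses depended on sex
final_items = after_sex_screen - 16       # "never" >= 50% or redundant content
max_total = final_items * 4               # each item is scored 0 to 4
```

    The counts work out to 61, 46, and 30 items at the three stages, and 30 items scored 0 to 4 give the 120-point maximum.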

    Investigation 2: Test-Retest Reliability



    Sixty-three of the adult patients who participated in the first investigation served as subjects for this investigation (mean age = 49 years, SD = 18 years; 25 males, 38 females). Only two of the original subjects were lost to follow-up, so the diagnosis mix remained roughly the same.


    The final version of the VHI was administered to subjects on two occasions in a pencil-and-paper format. The amount of time between administrations ranged from 6 to 71 days (M = 29.3 days, SD = 29.26 days). During this interval, subjects did not undergo any intervening medical, surgical, or behavioral treatment. The subjects were given the same instructions as in Investigation 1.


    A Pearson product-moment correlation coefficient was used to determine the test-retest stability of the VHI subscales and total score. Test-retest stability was strong for the functional (r = 0.84), emotional (r = 0.92), and physical (r = 0.86) subscales and for the total score (r = 0.92). From this data set, the 95% confidence intervals (critical difference scores) were derived for the functional, emotional, and physical subscales (8 points each) and for the VHI total scale score (18 points). Thus, a shift in the total score of 18 points or greater is required to be confident, at the 95% level, that the change is not due to unexplained variability inherent in the VHI.
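
    The text does not state how the critical difference scores were derived. One common textbook approach, shown here purely as an assumption, starts from the standard error of measurement, SEM = SD × sqrt(1 − r), and takes the 95% critical difference between two administrations as 1.96 × sqrt(2) × SEM. The function names and the standard deviation passed in are hypothetical; the article does not report the SD needed to reproduce the 18-point value.

```python
from math import sqrt


def standard_error_of_measurement(sd, r):
    """SEM from a score standard deviation and a test-retest correlation."""
    return sd * sqrt(1.0 - r)


def critical_difference(sd, r, z=1.96):
    """Smallest change between two administrations that exceeds
    measurement error at the confidence level implied by z."""
    return z * sqrt(2.0) * standard_error_of_measurement(sd, r)


def exceeds_critical_difference(first, second, threshold=18):
    """True if two total scores differ by at least the reported
    18-point critical difference."""
    return abs(second - first) >= threshold
```

    In clinical use only the last function is needed: a patient scoring 50 before treatment and 30 afterward shows a 20-point shift, which meets the 18-point criterion, whereas a shift from 50 to 40 does not.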

    Cronbach’s alpha coefficient was calculated for the item-total correlation of the final version of the VHI. The resulting alpha coefficient (r = 0.95) represented little change from that calculated for the 85-item version of the VHI (r = 0.97).

    Finally, the magnitude of the relationship between the subscales was assessed with data collected from the first administration of the VHI. (The results are shown in Table 2.) The relationships between the functional, physical, and emotional subscales of the VHI were moderate to strong, with Pearson product-moment correlations ranging from r = 0.70 to r = 0.79.

    TABLE 2. Correlation matrix for total score and subscale scores for the final version of the Voice Handicap Index (VHI).

    Subscale      Functional    Physical    Emotional    Total
    Functional    *             0.70        0.79         0.91
    Physical      *             *           0.72         0.88
    Emotional     *             *           *            0.93
    Total         *             *           *            *

    Investigation 3: Relationship of VHI Score to Voice Disorder Severity



    Subjects used for this investigation were the same as those in Investigation 2 (Table 3).

    TABLE 3. Demographics of 63 subjects participating in Investigation 3.

    Characteristic             Mild (n = 13)    Moderate (n = 27)    Severe (n = 23)
    Age in years, mean (SD)    43.31 (4.70)     55.59 (3.26)         43 (3.53)
    Sex (male/female)          6/7              12/15                9/14


    In addition to completing the VHI, subjects were asked to self-rate the severity of their voice disorder on a 0 to 3 point scale, with “0” representing a self-perception of voice as normal, “1” representing mildly impaired voice, “2” representing moderately impaired voice, and “3” representing severely impaired voice. Instructions to the subjects were as follows: “Please rate, in your opinion, how severe your voice problem is on this scale.” No specific instruction was given to the patient regarding the meaning of “severity.” It was predicted that self-perceived voice handicap would increase systematically with the degree of self-perceived voice abnormality.


    Subjects rating their voices as “normal” or “mild” were grouped together. Mean VHI total and subscale scores are shown in Table 4 as a function of severity groupings. A Pearson product-moment correlation coefficient was used to compare subjects’ VHI scores and judgments of severity. Results indicated a moderate relationship between the two patient self-assessment measures (r = 0.60).

    TABLE 4. Mean values (SD) for VHI subscale and total scale scores as a function of self-perceived voice severity.

    Scale         Mild            Moderate        Severe
    Functional    10.07 (1.99)    12.41 (1.38)    18.30 (1.50)
    Physical      15.54 (1.97)    18.63 (1.37)    22.78 (1.48)
    Emotional     8.08 (2.31)     13.33 (1.61)    20.30 (1.74)
    Total         33.69 (5.60)    44.37 (3.88)    61.39 (4.21)
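
    The grouping behind Table 4 can be sketched as below. The sketch assumes each record is a pair of the 0 to 3 self-rating and a VHI total score; the names and the sample records in the usage note are hypothetical and are not study data. Ratings of 0 ("normal") and 1 ("mild") are pooled, as described in the text.

```python
from collections import defaultdict
from statistics import mean

# Ratings of "normal" (0) and "mild" (1) are pooled into one group.
SEVERITY_GROUP = {0: "mild", 1: "mild", 2: "moderate", 3: "severe"}


def mean_total_by_severity(records):
    """records: iterable of (severity_rating, vhi_total) pairs.
    Returns the mean VHI total score for each severity group."""
    groups = defaultdict(list)
    for rating, total in records:
        groups[SEVERITY_GROUP[rating]].append(total)
    return {label: mean(values) for label, values in groups.items()}
```

    For example, `mean_total_by_severity([(0, 20), (1, 40), (2, 44), (3, 60), (3, 62)])` pools the first two records into the mild group and averages the two severe records separately.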


    The aim of the present investigation was to develop a psychometrically validated tool for measuring the psychosocial handicapping effects of voice disorders. We found the VHI to demonstrate strong internal consistency reliability and test-retest stability. A 95% confidence interval of 18 points was established that gives users of this scale assurance that changes in total scores between administrations are not due to inherent variability in the VHI. The VHI was developed using a diverse sample of patients with voice disorders, representing the breadth of pathology in most clinical settings. This was intentional, as we wanted to create a scale that could be generalized to other clinics and would have widespread application.

    Construct validity was not fully evaluated in this study, although the relationship between patient self-perceived severity and VHI scores was determined to be moderately strong. This aspect of instrument design is crucial to establishing the overall validity of any scale. Since there are no comparable scales to cross-validate construct validity for the VHI, one approach might be to administer the VHI to a group of persons without voice disorders. In this way, the presence of handicap and disability as represented by VHI scores can be confirmed.

    Several interesting observations were made during the course of this investigation. Patients mentioned frequently that they were unaware of the degree of severity of their voice problems until completing the VHI. Thus, measurement of handicap can have significant implications for the educational component of the treatment process. A crucial element in a patient’s ability to change behavior is motivation. When patients understand the implications of their voice problems in the context of daily living and functioning, they may be more likely to work toward changing factors that contribute to the development of their dysphonias.

    The VHI has several potential uses in the clinical practice of speech-language pathology. In the most basic application, the VHI can be used to assess the patient’s judgment about the relative impact of his or her voice disorder upon daily activities. In several instances, we have been surprised at how patients have scored on the VHI relative to our judgments of the severity of their voice disorders. The VHI can be of use in evaluating the effectiveness of specific voice treatment techniques such as Vocal Function Exercises or the Accent Method (Stemple, Glaze, & Gerdeman, 1995; Kotby, 1995). Data obtained from the VHI also can be used as a continuing quality measure for accreditation processes (e.g., Joint Commission on Accreditation of Healthcare Organizations). Finally, the VHI can be useful as a component of measuring functional outcomes in behavioral, medical, and surgical treatments of voice disorders. We already use well-established physiologic and perceptual measures for this purpose, and the addition of a patient self-assessment measure will strengthen our conclusions about the effectiveness and efficiency of various interventions for voice disorders.


    • Jacobson, G. P., & Newman, C. W. (1990). The development of the Dizziness Handicap Inventory (DHI). Archives of Otolaryngology–Head and Neck Surgery, 116, 424–427.
    • Kotby, M. N. (1995). The accent method of voice therapy. San Diego: Singular Publishing Group.
    • Llewellyn-Thomas, H. A., Sutherland, H. J., Hogg, S. A., Ciampi, A., Harwood, A., Keane, T., Till, J. E., & Boyd, N. F. (1984). Linear analogue self-assessment of voice quality in laryngeal cancer. Journal of Chronic Diseases, 37, 917–924.
    • Newman, C. W., Jacobson, G. P., & Spitzer, J. (1995). Development of the Tinnitus Handicap Inventory. Archives of Otolaryngology–Head and Neck Surgery, 122, 143–148.
    • Newman, C. W., Jacobson, G. P., Weinstein, B. E., & Hug, G. A. (1990). The hearing handicap inventory for adults: Psychometric adequacy and audiometric correlates. Ear and Hearing, 11, 430–433.
    • Newman, C. W., Weinstein, B. E., Jacobson, G. P., & Hug, G. A. (1991). Test-retest reliability of the hearing handicap inventory for adults. Ear and Hearing, 12, 355–357.
    • Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.
    • Smith, E., Nichols, S., Lemke, J., Verdolini, K., Gray, S. D., Barkmeier, J., Dove, H., & Hoffman, H. (1994). Effects of voice disorders on patient lifestyle: Preliminary results. NCVS Status and Progress Report, 4, 237–248.
    • Stemple, J. C., Glaze, L. E., & Gerdeman, B. K. (1995). Clinical voice pathology (2nd ed.). San Diego: Singular.
    • Ventry, I., & Weinstein, B. (1982). The Hearing Handicap Inventory for the Elderly: A new tool. Ear and Hearing, 3, 128–134.
    • World Health Organization. (1980). International classification of impairments, disabilities, and handicaps. Geneva: World Health Organization.

    Appendix: Voice Handicap Index (VHI), Henry Ford Hospital

    Instructions: These are statements that many people have used to describe their voices and the effects of their voices on their lives. Circle the response that indicates how frequently you have the same experience.

    • F1. My voice makes it difficult for people to hear me.

    • P2. I run out of air when I talk.

    • F3. People have difficulty understanding me in a noisy room.

    • P4. The sound of my voice varies throughout the day.

    • F5. My family has difficulty hearing me when I call them throughout the house.

    • F6. I use the phone less often than I would like.

    • E7. I’m tense when talking with others because of my voice.

    • F8. I tend to avoid groups of people because of my voice.

    • E9. People seem irritated with my voice.

    • P10. People ask, “What’s wrong with your voice?”

    • F11. I speak with friends, neighbors, or relatives less often because of my voice.

    • F12. People ask me to repeat myself when speaking face-to-face.

    • P13. My voice sounds creaky and dry.

    • P14. I feel as though I have to strain to produce voice.

    • E15. I find other people don’t understand my voice problem.

    • F16. My voice difficulties restrict my personal and social life.

    • P17. The clarity of my voice is unpredictable.

    • P18. I try to change my voice to sound different.

    • F19. I feel left out of conversations because of my voice.

    • P20. I use a great deal of effort to speak.

    • P21. My voice is worse in the evening.

    • F22. My voice problem causes me to lose income.

    • E23. My voice problem upsets me.

    • E24. I am less outgoing because of my voice problem.

    • E25. My voice makes me feel handicapped.

    • P26. My voice “gives out” on me in the middle of speaking.

    • E27. I feel annoyed when people ask me to repeat.

    • E28. I feel embarrassed when people ask me to repeat.

    • E29. My voice makes me feel incompetent.

    • E30. I’m ashamed of my voice problem.

    Note. The letter preceding each item number corresponds to the subscale (E = emotional subscale, F = functional subscale, P = physical subscale).
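
    Scoring the 30-item form can be sketched as follows. The item-to-subscale assignment is taken from the letter prefixes in the list above. The point values for "never" (0) and "always" (4) are as stated in the text; assigning 1, 2, and 3 to "almost never", "sometimes", and "almost always" in that order is an assumption based on the order of the response scale. The function and variable names are hypothetical.

```python
# Points for each response label (middle values assumed from scale order).
RESPONSE_POINTS = {
    "never": 0,
    "almost never": 1,
    "sometimes": 2,
    "almost always": 3,
    "always": 4,
}

# Item numbers per subscale, from the letter prefixes in the Appendix.
SUBSCALE_ITEMS = {
    "functional": [1, 3, 5, 6, 8, 11, 12, 16, 19, 22],
    "physical": [2, 4, 10, 13, 14, 17, 18, 20, 21, 26],
    "emotional": [7, 9, 15, 23, 24, 25, 27, 28, 29, 30],
}


def score_vhi(responses):
    """responses: dict mapping item number (1 to 30) to a response label.
    Returns subscale scores (0 to 40 each) and the total (0 to 120)."""
    missing = [
        n for items in SUBSCALE_ITEMS.values() for n in items
        if n not in responses
    ]
    if missing:
        raise ValueError(f"missing items: {sorted(missing)}")
    scores = {
        name: sum(RESPONSE_POINTS[responses[n]] for n in items)
        for name, items in SUBSCALE_ITEMS.items()
    }
    scores["total"] = sum(scores.values())
    return scores
```

    A form answered "always" throughout yields 40 points on each subscale and a total of 120; a form answered "never" throughout yields 0. Incomplete forms raise an error rather than being scored, which is a design choice of this sketch and not a rule stated in the article.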

    Author Notes

    Contact author: Barbara H. Jacobson, PhD, Division of Speech-Language Sciences and Disorders, Department of Neurology, Henry Ford Hospital, 2799 W. Grand Blvd, Detroit, Michigan 48202.
