Tactile Emotions: Multimodal Affective Captioning with Haptics Improves Narrative Engagement for d/Deaf and Hard-of-Hearing Viewers

This paper explores a multimodal approach for translating emotional cues present in speech, designed with Deaf and Hard-of-Hearing (DHH) individuals in mind. Prior work has focused on visual cues applied to captions, successfully conveying whether a speaker's words have a negative or positive tone (valence), but with mixed results regarding the intensity (arousal) of these emotions. We propose a novel method using haptic feedback to communicate a speaker's arousal levels through vibrations on a wrist-worn device. In a formative study with 16 DHH participants, we tested six haptic patterns and found that participants preferred single per-word vibrations at 75 Hz to encode arousal. In a follow-up study with 27 DHH participants, this pattern was paired with visual cues, and narrative engagement with audio-visual content was measured. Results indicate that combining haptics with visuals significantly increased engagement compared to both a conventional captioning baseline and a visuals-only affective captioning style.

2025 · Caluã de Lacerda Pataca et al. · Rochester Institute of Technology, Computing and Information Sciences · Vibrotactile Feedback & Skin Stimulation; Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration) · CHI
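As a concrete illustration of the preferred pattern above, the sketch below generates one 75 Hz vibration pulse per caption word, with pulse amplitude driven by that word's arousal score. This is a minimal sketch, not the paper's actuator driver: the sample rate, pulse duration, gap length, and the [0, 1] arousal scale are all assumptions made for illustration.

```python
import numpy as np

SAMPLE_RATE = 8000  # Hz; assumed actuator driver rate, not from the paper
CARRIER_HZ = 75     # vibration frequency participants preferred

def word_pulse(arousal: float, duration_s: float = 0.15) -> np.ndarray:
    """One vibration pulse for a single word.

    `arousal` in [0, 1] scales pulse amplitude; the 75 Hz carrier matches
    the study's preferred pattern. The duration is an illustrative guess.
    """
    t = np.arange(0, duration_s, 1 / SAMPLE_RATE)
    envelope = np.hanning(t.size)  # soft attack/release to avoid clicks
    return arousal * envelope * np.sin(2 * np.pi * CARRIER_HZ * t)

def caption_track(word_arousals, gap_s: float = 0.05) -> np.ndarray:
    """Concatenate one pulse per word, separated by short silent gaps."""
    gap = np.zeros(int(gap_s * SAMPLE_RATE))
    pulses = [np.concatenate([word_pulse(a), gap]) for a in word_arousals]
    return np.concatenate(pulses)

# e.g., a four-word sentence with rising arousal
track = caption_track([0.2, 0.4, 0.7, 1.0])
```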
Caption Royale: Exploring the Design Space of Affective Captions from the Perspective of Deaf and Hard-of-Hearing Individuals

Affective captions employ visual typographic modulations to convey a speaker's emotions, improving speech accessibility for Deaf and Hard-of-Hearing (DHH) individuals. However, the most effective visual modulations for expressing emotions remain uncertain. Bridging this gap, we ran three studies with 39 DHH participants, exploring the design space of affective captions, whose parameters include text color, boldness, and size. Study 1 assessed preferences for nine of these styles, each conveying either valence or arousal separately. Study 2 combined Study 1's top-performing styles and measured preferences for captions depicting both valence and arousal simultaneously. Participants outlined readability, minimal distraction, intuitiveness, and emotional clarity as key factors behind their choices. In Study 3, these factors and an emotion-recognition task were used to compare how Study 2's winning styles performed against a non-styled baseline. Based on our findings, we present the two best-performing styles as design recommendations for applications employing affective captions.

2024 · Caluã de Lacerda Pataca et al. · Rochester Institute of Technology · Voice Accessibility; Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration); Universal & Inclusive Design · CHI
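To make the design space concrete, here is a minimal sketch of how valence and arousal might be mapped onto the typographic parameters the studies explore. The specific mappings (hue for valence, weight and size for arousal) and value ranges are illustrative stand-ins, not the styles the paper recommends.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t

def caption_style(valence: float, arousal: float) -> dict:
    """Map valence in [-1, 1] and arousal in [0, 1] to CSS-like properties."""
    # Valence -> hue: cool blue (negative) through to warm orange (positive).
    hue = lerp(220, 30, (valence + 1) / 2)
    # Arousal -> weight and size: calmer words render lighter and smaller.
    weight = int(lerp(400, 800, arousal))
    size_px = lerp(16, 24, arousal)
    return {
        "color": f"hsl({hue:.0f}, 70%, 45%)",
        "font-weight": str(weight),
        "font-size": f"{size_px:.0f}px",
    }

# e.g., a negatively valenced, highly aroused word (angry speech)
print(caption_style(valence=-0.8, arousal=0.9))
```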
Visualization of Speech Prosody and Emotion in Captions: Accessibility for Deaf and Hard-of-Hearing Users

Speech is expressive in ways that caption text does not capture, with emotion and emphasis information left unconveyed. We interviewed eight Deaf and Hard-of-Hearing (DHH) individuals to understand whether and how captions' inexpressiveness affects them in online meetings with hearing peers. Automatically captioned speech, we found, lacks affective depth, lending it a hard-to-parse ambiguity and a general dullness. Interviewees regularly feel excluded, which some see as an inherent quality of these types of meetings rather than a consequence of current caption text design. Next, we developed three novel captioning models that depicted, beyond words, features from prosody, emotions, and a mix of both. In an empirical study, 16 DHH participants compared these models with conventional captions. The emotion-based model outperformed traditional captions in depicting emotions and emphasis, with only a moderate loss in legibility, suggesting its potential as a more inclusive caption design.

2023 · Caluã de Lacerda Pataca et al. · Rochester Institute of Technology · Voice Accessibility; Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration); Universal & Inclusive Design · CHI
Methods for Evaluating the Fluency of Automatically Simplified Texts with Deaf and Hard-of-Hearing Adults at Various Literacy Levels

Research has revealed benefits and interest among Deaf and Hard-of-Hearing (DHH) adults in reading-assistance tools powered by Automatic Text Simplification (ATS), a technology whose development benefits from evaluations by specific user groups. While prior work has provided guidance for evaluating text complexity among DHH adults, researchers lack guidance for evaluating the fluency of automatically simplified texts, which may contain errors from the simplification process. Thus, we conduct methodological research on the effectiveness of metrics (including reading speed; comprehension questions; and subjective judgements of understandability, readability, grammaticality, and system performance) for evaluating texts controlled to be at different levels of fluency, when measured among DHH participants at different literacy levels. Reading speed and grammaticality judgements effectively distinguished fluency levels among participants across literacy levels. Readability and understandability judgements, however, only worked among participants with higher literacy. Our findings provide methodological guidance for designing ATS evaluations with DHH participants.

2022 · Oliver Alonzo et al. · Rochester Institute of Technology · Voice Accessibility; Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration); Universal & Inclusive Design · CHI
Watch It, Don't Imagine It: Creating a Better Caption-Occlusion Metric by Collecting More Ecologically Valid Judgments from DHH Viewers

Television captions that block visual information cause dissatisfaction among Deaf and Hard of Hearing (DHH) viewers, yet existing caption evaluation metrics do not consider occlusion. To create such a metric, DHH participants in a recent study imagined how bad it would be if captions blocked various on-screen text or visual content. To gather more ecologically valid data for an improved metric, we asked 24 DHH participants to give subjective judgments of caption quality after actually watching videos, and a regression analysis revealed which on-screen contents' occlusion related to users' judgments. For several video genres, a metric based on our new dataset outperformed the prior state-of-the-art metric, which had been based on that earlier study, at predicting the severity of captions occluding content during videos. We contribute empirical findings for improving DHH viewers' experience, guiding the placement of captions to minimize occlusions, and automating the evaluation of captioning quality in television broadcasts.

2022 · Akhter Al Amin et al. · Rochester Institute of Technology · Voice Accessibility; Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration); Universal & Inclusive Design · CHI
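The core idea, fitting a regression from occlusion features to viewers' quality judgments, can be sketched as below. The feature set, the placeholder numbers, and the use of plain linear regression are all assumptions for illustration; the paper's actual features, data, and model are not reproduced here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical per-clip features: fraction of frames in which captions
# occlude each content type (illustrative placeholders, not study data).
X = np.array([
    # faces, on-screen text, salient objects
    [0.10, 0.00, 0.05],
    [0.30, 0.20, 0.10],
    [0.05, 0.40, 0.00],
    [0.25, 0.10, 0.30],
])
# Hypothetical DHH viewers' subjective quality judgments for those clips
y = np.array([4.5, 2.8, 3.1, 2.5])

model = LinearRegression().fit(X, y)
# Fitted coefficients indicate how strongly occluding each content type
# relates to judged quality; new clips can then be scored automatically.
print(model.coef_)
print(model.predict([[0.15, 0.05, 0.10]]))
```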
Analyzing Deaf and Hard-of-Hearing Users' Behavior, Usage, and Interaction with a Personal Assistant Device that Understands Sign-Language Input

As voice-based personal assistant technologies proliferate (e.g., smart speakers in homes), and more generally as voice control of technology becomes increasingly ubiquitous, new accessibility barriers are emerging for many Deaf and Hard of Hearing (DHH) users. Progress in sign-language recognition may enable devices to respond to sign-language commands and potentially mitigate these barriers, but research is needed to understand how DHH users would interact with these devices and what commands they would issue. In this work, we directly engage with the DHH community, using a Wizard-of-Oz prototype that appears to understand American Sign Language (ASL) commands. Our analysis of video recordings of DHH participants revealed how they woke up the device to initiate commands, structured commands in ASL, and responded to device errors, providing guidance to future designers and researchers. We share our dataset of over 1400 commands, which may be of interest to sign-language-recognition researchers.

2022 · Abraham Glasser et al. · Rochester Institute of Technology · Haptic Wearables; Hand Gesture Recognition; Voice Accessibility · CHI
Remotely Co-Designing Features for Communication Applications using Automatic Captioning with Deaf and Hearing Pairs

Deaf and Hard-of-Hearing (DHH) users face accessibility challenges during in-person and remote meetings. While the emerging use of applications incorporating automatic speech recognition (ASR) is promising, more user-interface and user-experience research is needed. While co-design methods could elucidate designs for such applications, COVID-19 has interrupted in-person research. This study describes a novel methodology for conducting online co-design workshops with 18 DHH and hearing participant pairs to investigate ASR-supported mobile and videoconferencing technologies along two design dimensions: correcting errors in ASR output and implementing notification systems for influencing speaker behaviors. Our methodological findings include an analysis of the communication modalities and strategies participants used, their use of an online collaborative whiteboarding tool, and how participants reconciled differences in ideas. Finally, we present guidelines for researchers interested in online DHH co-design methodologies, enabling greater geographic diversity among study participants even beyond the current pandemic.

2022 · Matthew Seita et al. · Rochester Institute of Technology · Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration); Prototyping & User Testing · CHI
Design and Evaluation of Hybrid Search for American Sign Language to English Dictionaries: Making the Most of Imperfect Sign Recognition

Searching for the meaning of an unfamiliar sign-language word in a dictionary is difficult for learners, but emerging sign-recognition technology will soon enable users to search by submitting a video of themselves performing the word they recall. However, sign-recognition technology is imperfect, and users may need to search through a long list of possible results when seeking a desired result. To speed this search, we present a hybrid-search approach, in which users begin with a video-based query and then filter the search results by linguistic properties, e.g., handshape. We interviewed 32 ASL learners about their preferences for the content and appearance of the search-results page and filtering criteria. A between-subjects experiment with 20 ASL learners revealed that our hybrid search system outperformed a video-based search system along multiple satisfaction and performance metrics. Our findings provide guidance for designers of video-based sign-language dictionary search systems, with implications for other search scenarios.

2022 · Saad Hassan et al. · Rochester Institute of Technology · Conversational Chatbots; Voice Accessibility · CHI
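The hybrid-search idea, ranked recognizer output narrowed by linguistic filters, can be sketched as follows. The `SignResult` structure, the handshape labels, and the filtering interface are hypothetical; the paper's actual system and recognizer are not reproduced here.

```python
from dataclasses import dataclass

@dataclass
class SignResult:
    gloss: str      # English gloss of a candidate sign (hypothetical field)
    score: float    # recognizer confidence for the user's video query
    handshape: str  # linguistic property the user can filter by

def hybrid_search(candidates, handshape=None, top_k=10):
    """Rank imperfect recognizer output, then narrow it by a filter.

    Filtering by a property the user remembers (e.g., handshape) shrinks
    the long candidate list that imperfect recognition produces.
    """
    if handshape is not None:
        candidates = [c for c in candidates if c.handshape == handshape]
    return sorted(candidates, key=lambda c: c.score, reverse=True)[:top_k]

# e.g., the learner recalls the sign used a flat-B handshape
results = hybrid_search(
    [SignResult("BOOK", 0.61, "flat-B"),
     SignResult("SCHOOL", 0.58, "flat-B"),
     SignResult("NAME", 0.55, "H")],
    handshape="flat-B",
)
```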
Comparison of Methods for Evaluating Complexity of Simplified Texts among Deaf and Hard-of-Hearing Adults at Different Literacy Levels

Research has explored using Automatic Text Simplification for reading assistance, with prior work identifying benefits and interest among Deaf and Hard-of-Hearing (DHH) adults. While the evaluation of these technologies remains a crucial aspect of research in the area, researchers lack guidance on how to evaluate text complexity with DHH readers. Thus, in this work we conduct methodological research to evaluate metrics identified from prior work (including reading speed, comprehension questions, and subjective judgements of understandability and readability) in terms of their effectiveness for evaluating texts modified to be at various complexity levels with DHH adults at different literacy levels. Subjective metrics and low-linguistic-complexity comprehension questions distinguished certain text complexity levels among participants with lower literacy. Among participants with higher literacy, only subjective judgements of text readability distinguished certain text complexity levels. For all metrics, participants with higher literacy scored higher or provided more positive subjective judgements overall.

2021 · Oliver Alonzo et al. · Rochester Institute of Technology · Voice Accessibility; Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration); Cognitive Impairment & Neurodiversity (Autism, ADHD, Dyslexia) · CHI
Automatic Text Simplification Tools for Deaf and Hard of Hearing Adults: Benefits of Lexical Simplification and Providing Users with Autonomy

Automatic Text Simplification (ATS), which replaces text with simpler equivalents, is rapidly improving. While some research has examined ATS reading-assistance tools, little has examined the preferences of adults who are deaf or hard-of-hearing (DHH), and none has empirically evaluated lexical simplification technology (the replacement of individual words) with these users. Prior research has revealed that U.S. DHH adults have lower reading literacy on average than their hearing peers, with unique characteristics to their literacy profile. We investigate whether DHH adults perceive a benefit from lexical simplification applied automatically or when users are provided with greater autonomy, with on-demand control and visibility as to which words are replaced. Formative interviews guided the design of an experimental study, in which DHH participants read English texts in their original form and with lexical simplification applied automatically or on-demand. Participants indicated that they perceived a benefit from lexical simplification, and they preferred a system with on-demand simplification.

2020 · Oliver Alonzo et al. · Rochester Institute of Technology · Voice Accessibility; Universal & Inclusive Design · CHI
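The automatic-versus-on-demand distinction the study tests can be sketched as below. The tiny `SIMPLER` lexicon and the tap-to-request interface are hypothetical; a deployed system would use an actual lexical-simplification model or resource rather than a hand-written dictionary.

```python
# Hypothetical simpler-synonym lexicon (illustrative, not the paper's).
SIMPLER = {"utilize": "use", "commence": "begin", "terminate": "end"}

def simplify(text: str, on_demand: bool, requested: frozenset = frozenset()):
    """Replace complex words with simpler equivalents.

    on_demand=False applies every known replacement automatically; with
    on_demand=True, only words the reader has explicitly requested (e.g.,
    by tapping them) are replaced, keeping substitutions visible and
    under user control.
    """
    words = []
    for w in text.split():
        core = w.strip(".,!?")           # separate attached punctuation
        key = core.lower()
        if key in SIMPLER and (not on_demand or key in requested):
            words.append(w.replace(core, SIMPLER[key]))
        else:
            words.append(w)
    return " ".join(words)

# e.g., the reader tapped "utilize" to request its simplification
print(simplify("We utilize this tool.", on_demand=True,
               requested=frozenset({"utilize"})))
```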
Methods for Evaluation of Imperfect Captioning Tools by Deaf or Hard-of-Hearing Users at Different Reading Literacy Levels

As Automatic Speech Recognition (ASR) improves in accuracy, it may become useful for transcribing spoken text in real time for Deaf and Hard-of-Hearing (DHH) individuals. To quantify users' comprehension and opinion of automatic captions, which inevitably contain some errors, we must identify appropriate methodologies for evaluation studies with DHH users, including quantitative measurement instruments suitable to the various literacy levels among the DHH population. A literature review guided our selection of several probes (e.g., multiple-choice comprehension-question accuracy or response time, scalar questions about user estimation of ASR errors or their impact, and users' numerical estimation of accuracy), which we evaluated in a lab study with DHH users, wherein their literacy levels and the actual accuracy of each caption stimulus were factors. For some probes, participants with lower literacy had more positive subjective responses overall, and, for participants with particular literacy score ranges, some probes were insufficiently sensitive to distinguish between caption accuracy levels.

2018 · Larwan Berke et al. · Rochester Institute of Technology · Voice Accessibility; Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration) · CHI