µCap: Instrumental Music Captions for Deaf and Hard-of-Hearing Individuals
Instrumental music conveys rich affective experiences through acoustic cues, yet instrumental passages often remain inaccessible to Deaf and Hard-of-Hearing (DHH) audiences. Although captioning practices for vocal songs have expanded, instrumental music remains largely uncaptioned, with no established criteria for representing musical content in text. We propose µCap (Music Captions), an automatic instrumental music captioning system that transforms instrumental audio into time-aligned, non-lexical textual renderings enhanced with simple visuals. Drawing on preliminary surveys with DHH individuals and expert group discussions, we developed a phonetic-like captioning schema grounded in music sound analysis and linguistics. We then implemented µCap using audio feature extraction and a Retrieval-Augmented Generation (RAG) pipeline to produce expressive, sound-mimetic captions. Two user evaluations with DHH participants (n=20 and n=15) showed that µCap enhanced music appreciation, immersion, and perceived presence of acoustic detail. This work contributes empirical evidence and insights for designing caption-based visual representations that make instrumental music more accessible.
SooYeon Ahn et al. Gwangju Institute of Science and Technology. CHI 2026. Topics: Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration); Audio Accessibility (Captions, Sign Language, Vibration).

From Daily Song to Daily Self: Supporting Emotional Growth of Deaf and Hard-of-Hearing Individuals through Generative AI Songwriting
The rapid advancement of generative AI (GenAI) is expanding access to songwriting, offering a new medium of self-expression for Deaf and Hard-of-Hearing (DHH) individuals. However, emerging technologies that support DHH individuals in expressing themselves through music have largely been evaluated in single-session settings and often fall short in helping users unfamiliar with songwriting convey personal narratives or sustain engagement over time. This paper explores songwriting as an extended, music-based journaling practice that supports sustained emotional reflection over multiple sessions. We introduce SoulNote, a GenAI system enabling DHH individuals to engage in iterative songwriting. Through a user-centered design process comprising a design workshop, a preliminary study, and a multi-session diary study, we found that ongoing songwriting with SoulNote facilitated emotional growth across three dimensions: self-insight, emotion regulation, and everyday attitudes toward emotions and self-care. Overall, this work demonstrates how GenAI can support marginalized communities by transforming creative expression into a daily practice of self-discovery and reflection.
Youjin Choi et al. Gwangju Institute of Science and Technology. CHI 2026. Topics: Generative AI (Text, Image, Music, Video); Affective Feedback & Emotion Regulation Interfaces; Affective Human-Computer Dialogue.

Understanding Gaze-Based Identification in VR Through Preattentive Processing and Binocular Rivalry
Stimulus-evoked gaze dynamics offer a secure and hands-free signal in virtual reality (VR), yet the underlying design space of effective visual stimuli remains poorly understood. This work examines how preattentive processing and binocular rivalry can inform stimulus design for gaze-based identification in VR. We conducted a two-part study: (1) a feasibility assessment of closed-set identification performance with 26 participants and 44,928 gaze samples collected with a commercial headset (Meta Quest Pro), and (2) a usability study with 16 participants comparing the same interaction in a login context to PIN and out-of-band methods as a potential authentication technique. Our findings confirm the feasibility of personal identification, highlight usability advantages, and reveal participants’ desire for greater transparency to understand individual variations in login results. Together, these results offer conceptual insights into the perceptual mechanisms shaping stimulus-evoked gaze behavior and outline design implications for future VR authentication workflows.
Junryeol Jeon et al. Gwangju Institute of Science and Technology. CHI 2026. Topics: Eye Tracking & Gaze Interaction; Voice User Interface (VUI) Design; Passwords & Authentication.

Designing a Generative AI-Assisted Music Psychotherapy Tool for Deaf and Hard-of-Hearing Individuals
Songwriting has long served as a powerful medium for expressing unconscious emotions and fostering self-awareness in psychotherapy. Due to the auditory-centric nature of traditional approaches, Deaf and Hard-of-Hearing (DHH) individuals have often been excluded from music’s therapeutic benefits. In response, this study presents a music psychotherapy tool co-designed with therapists, integrating conversational agents (CAs) and music generative AI as symbolic and therapeutic media. Through a usage study with 23 DHH individuals, we found that collaborative songwriting with the CA enabled them to experience emotional release, re-interpretation, and deeper self-understanding. In particular, the CA’s strategies (supportive empathy, example response options, and visual-based metaphors) were found to facilitate musical dialogue effectively for DHH individuals. These findings contribute to inclusive AI design by showing the potential of human–AI collaboration to bridge therapeutic and artistic practices.
Youjin Choi et al. Gwangju Institute of Science and Technology. CHI 2026. Topics: Generative AI (Text, Image, Music, Video); Intelligent Voice Assistants (Alexa, Siri, etc.); Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration).

HumanoidTurk: Expanding VR Haptics with Humanoids for Driving Simulations
We explore how humanoid robots can be repurposed as haptic media, extending beyond their conventional role as social, assistive, or collaborative agents. To illustrate this approach, we implemented HumanoidTurk, taking a first step toward a humanoid-based haptic system that translates in-game g-force signals into synchronized motion feedback in VR driving. A pilot study involving six participants compared two synthesis methods, leading us to adopt a filter-based approach for smoother and more realistic feedback. A subsequent study with sixteen participants evaluated four conditions: no-feedback, controller, humanoid+controller, and human+controller. Results showed that humanoid feedback enhanced immersion, realism, and enjoyment, while introducing moderate costs in terms of comfort and simulation sickness. Interviews further highlighted the robot’s consistency and predictability in contrast to the adaptability of human feedback. From these findings, we identify fidelity, adaptability, and versatility as emerging themes, positioning humanoids as a distinct haptic modality for immersive VR.
DaeHo Lee et al. Gwangju Institute of Science and Technology. CHI 2026. Topics: In-Vehicle Haptic, Audio & Multimodal Feedback; Immersion & Presence Research; Robots in Education & Healthcare.

Guaranteeing Equitable Musical Collaboration: Lessons Learned from the Music-Making Activities in Mixed-Hearing Groups
Integrating mixed-hearing groups in musical collaboration presents unique challenges and opportunities for their communication and equal contribution. This observational study explores their collaborative work, focusing on paths toward equitable music-making. We observed two music-making workshops to identify the potential and dynamics of their musical collaboration. While the first workshop proceeded in a traditional manner of music-making, the second workshop used a multimodal assistive tool. Our findings highlight the dynamics in musical collaboration that foster engagement and bridge interaction gaps. In turn, sensory inclusion with multimodal music-making promoted role transition in mixed-hearing groups and their equal contributions, leading the groups to embrace diverse cultural perspectives. Based on the insights derived from the observations, we propose a design guideline and future research directions for harnessing group dynamics and building equitable musical collaborations in an inclusive environment for mixed-hearing groups.
ChungHa Lee et al. CSCW 2025. Topics: Deaf and Hard-of-Hearing Research.

OnomaCap: Making Non-speech Sound Captions Accessible and Enjoyable through Onomatopoeic Sound Representation
Non-speech sounds play an important role in setting the mood of a video and aiding comprehension. However, current non-speech sound captioning practices focus primarily on sound categories, which fails to provide a rich sound experience for d/Deaf and hard-of-hearing (DHH) viewers. Onomatopoeia, which succinctly captures expressive sound information, offers a potential solution but remains underutilized in non-speech sound captioning. This paper investigates how onomatopoeia benefits DHH audiences in non-speech sound captioning. We collected 7,962 sound-onomatopoeia pairs from listeners and developed a sound-onomatopoeia model that automatically transcribes sounds into onomatopoeic descriptions indistinguishable from human-generated ones. A user evaluation with 25 DHH participants using the model-generated onomatopoeia demonstrated that onomatopoeia significantly improved their video viewing experience. Participants most favored captions that combined onomatopoeia with the sound category, and expressed a desire to see such captions across genres. We discuss the benefits and challenges of using onomatopoeia in non-speech sound captions, offering insights for future practices.
JooYeong Kim et al. Gwangju Institute of Science and Technology, School of Integrated Technology/Soft Computing & Interaction Laboratory. CHI 2025. Topics: Voice Accessibility; Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration); Universal & Inclusive Design.

MVPrompt: Building Music-Visual Prompts for AI Artists to Craft Music Video Mise-en-scène
Music videos have traditionally been the domain of experts, but with text-to-video generative AI models, AI artists can now create them more easily. However, accurately reflecting the desired music-visual mise-en-scène remains challenging without specialized knowledge, highlighting the need for supportive tools. To address this, we conducted a design workshop with seven music video experts, identified design goals, and developed MVPrompt, a tool for generating music-visual mise-en-scène prompts. In a user study with 24 AI artists, MVPrompt outperformed the Baseline, effectively supporting the collaborative creative process. Specifically, the Visual Theme stage facilitated the exploration of tone and manner, while the Visual Scene & Grammar stage refined prompts with detailed mise-en-scène elements. By enabling AI artists to specify mise-en-scène creatively, MVPrompt enhances the experience of making music video scenes with text-to-video generative AI.
ChungHa Lee et al. Gwangju Institute of Science and Technology, School of Integrated Technology/Soft Computing & Interaction Laboratory. CHI 2025. Topics: Generative AI (Text, Image, Music, Video); AI-Assisted Creative Writing; Video Production & Editing.

Understanding the Potentials and Limitations of Prompt-based Music Generative AI
Prompt-based music generative artificial intelligence (GenAI) offers an efficient way to engage in music creation through language. However, it faces limitations in conveying artistic intent with language alone, highlighting the need for more research on AI-creator interactions. This study evaluates three different interaction modes (prompt-based, preset-based, and motif-based) of commercialized music AI tools with 17 participants of varying musical expertise to examine how prompt-based GenAI can improve creative intention. Our findings revealed that user groups preferred prompt-based music GenAI for distinct purposes: experts used it to validate musical concepts, novices to generate reference samples, and nonprofessionals to transform abstract ideas into musical compositions. We identified its potential for enhancing compositional efficiency and creativity through intuitive interaction, while also noting limitations in handling temporal and musical nuances solely through prompts. Based on these insights, we present design guidelines to ensure users can effectively engage in the creative process, considering their musical expertise.
Youjin Choi et al. Gwangju Institute of Science and Technology, School of Integrated Technology/Soft Computing & Interaction Laboratory. CHI 2025. Topics: Generative AI (Text, Image, Music, Video); Music Composition & Sound Design Tools.

BIASsist: Empowering News Readers via Bias Identification, Explanation, and Neutralization
Biased news articles can distort readers' perceptions by presenting information in a way that favors or disfavors a particular point of view. Because the bias is subtly embedded in the text, such articles can shape readers' views daily without their realizing it. To address this issue, we propose BIASsist, an LLM-based approach designed to mitigate bias in news articles. Based on existing research, we defined six types of bias and introduced three assistive components (identification, explanation, and neutralization) to provide a broader range of bias information and enhance readers' bias awareness. We conducted a mixed-method study with 36 participants to evaluate the effectiveness of BIASsist. The results show that participants' bias awareness significantly improved and their interest in identifying bias increased. Participants also tended to engage more actively in critically evaluating articles. Based on these findings, we discuss its potential to improve media literacy and critical thinking in today's era of information overload.
Yeo-Gyeong Noh et al. Gwangju Institute of Science and Technology, School of Integrated Technology. CHI 2025. Topics: AI-Assisted Decision-Making & Automation; AI Ethics, Fairness & Accountability; Misinformation & Fact-Checking.

Exploring the Potential of Music Generative AI for Music-Making by Deaf and Hard of Hearing People
Recent advancements in text-to-music generative AI (GenAI) have significantly expanded access to music creation. However, deaf and hard of hearing (DHH) individuals remain largely excluded from these developments. This study explores how music GenAI could enhance the music-making experience of DHH individuals, who often rely on hearing people to translate sounds and music. We developed a multimodal music-making assistive tool informed by focus group interviews. This tool enables DHH users to create and edit music independently through language interaction with music GenAI, supported by integrated visual and tactile feedback. Our findings from the music-making study revealed that the system empowers them to engage in independent and proactive music-making activities, increasing their confidence, fostering musical expression, and positively shifting their attitudes toward music. Contributing to inclusive art by preserving the unique sensory characteristics of DHH individuals, this study demonstrates how music GenAI can benefit a marginalized community, fostering independent creative expression.
Youjin Choi et al. Gwangju Institute of Science and Technology, School of Integrated Technology/Soft Computing & Interaction Laboratory. CHI 2025. Topics: Generative AI (Text, Image, Music, Video); Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration); Music Composition & Sound Design Tools.

A Way for Deaf and Hard of Hearing People to Enjoy Music by Exploring and Customizing Cross-modal Music Concepts
Deaf and hard of hearing (DHH) people enjoy music and access it using a music-sensory substitution system that delivers sound together with the corresponding visual and tactile feedback. However, it is often challenging for them to comprehend the colorful visuals and strong vibrations that are designed to represent music. Through focus group interviews with 24 DHH people, we confirmed that cross-modal mappings need to be conceptualized before experiencing music-sensory substitution. To improve the music appreciation experience, we implemented a cross-modal music conceptualization system, a prototype that allows DHH people to explore the visuals and vibrations associated with music so that they can perceive and appreciate it. An evaluation with 28 DHH individuals demonstrated the capability of the system to improve the subjective music appreciation experience via music-sensory substitution. Ultimately, DHH people with negative attitudes toward music became positive through the exploration and customization process with our system.
Youjin Choi et al. Gwangju Institute of Science and Technology. CHI 2024. Topics: Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration); Music Composition & Sound Design Tools.

Visible Nuances: A Caption System to Visualize Paralinguistic Speech Cues for Deaf and Hard-of-Hearing Individuals
Captions visually communicate voice information, helping deaf and hard-of-hearing (DHH) individuals better understand video content. In speech, the literal content and paralinguistic cues (e.g., pitch and nuance) work together to create real intention. However, current captions are limited in their capacity to deliver fine nuances because they cannot fully convey these paralinguistic cues. This paper proposes an audio-visualized caption system that automatically visualizes paralinguistic cues as various caption elements (thickness, height, font type, and motion). A comparative study with 20 DHH participants demonstrates how our system gives DHH individuals better access to paralinguistic cues while watching videos. Particularly in the case of formal talks, they could accurately identify the speaker’s nuance more often than with current captions, without any practice or training. With some issues of legibility and familiarity addressed, the proposed caption system has the potential to enrich DHH individuals’ video-watching experience so that it comes closer to what hearing people enjoy.
JooYeong Kim et al. Gwangju Institute of Science and Technology. CHI 2023. Topics: Voice Accessibility; Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration); Universal & Inclusive Design.

We Play and Learn Rhythmically: Gesture-based Rhythm Game for Children with Intellectual Developmental Disabilities to Learn Manual Sign
Manual sign systems have been introduced to improve the communication of children with intellectual developmental disabilities (IDD). Due to the lack of learning support tools, teachers face many practical challenges in teaching manual sign to children, such as low attention span and the need for persistent intervention. To address these issues, we collaborated with teachers to develop the Sondam Rhythm Game, a gesture-based rhythm game that assists in teaching manual sign language, and ran a four-week empirical study with five teachers and eight children with IDD. Video annotation and post-hoc interviews suggest that our game-based learning approach has the potential to be effective at teaching manual sign to children with IDD. Our approach improved children's attention span and motivation while also increasing the number of voluntary gestures made without the need for prompting. Other practical issues and learning challenges were also uncovered to improve teaching paradigms for children with IDD.
Youjin Choi et al. Gwangju Institute of Science and Technology. CHI 2022. Topics: Hand Gesture Recognition; Serious & Functional Games; Special Education Technology.

Styling Words: A Simple and Natural Way to Increase Variability in Training Data Collection for Gesture Recognition
Due to advances in deep learning, gestures have become a more common tool for human-computer interaction. Given a large amount of training data, deep learning models show remarkable performance in gesture recognition. Since it is expensive and time-consuming to collect gesture data from people, we are often confronted with a practicality issue when managing the quantity and quality of training data. It is well known that increasing training data variability can help to improve the generalization performance of machine learning models. Thus, we directly intervene in the collection of gesture data to increase human gesture variability by adding some words (called styling words) to the data collection instructions, e.g., giving the instruction "perform gesture #1 faster" as opposed to "perform gesture #1." Through an in-depth analysis of gesture features and video-based gesture recognition, we have confirmed the advantageous use of styling words in gesture training data collection.
Woojin Kang et al. Gwangju Institute of Science and Technology. CHI 2021. Topics: Hand Gesture Recognition; Visualization Perception & Cognition.