Toward More Inclusive Music Experience: Understanding Deaf and Hard-of-hearing Individuals’ Everyday Music Activities
HyeonBeom Yi et al. · DIS 2025
Music can play an important role in the lives of some Deaf and Hard-of-Hearing (DHH) individuals, facilitating emotional expression, storytelling, and social interaction despite differences in hearing ability and identity. While prior human-computer interaction (HCI) research has introduced various functional advancements to enhance their music experiences, a deeper exploration of broader user experiences and inclusive design strategies remains necessary. To address the real-life challenges DHH individuals face in everyday music activities, we conducted focus group interviews in South Korea with 39 DHH individuals and 9 music experts. Our analysis identified six dimensions of everyday music activities organized by engagement type and social level, highlighting the distinct challenges and preferences DHH individuals encounter in musical contexts. Based on these insights, we propose design implications for fostering more inclusive music experiences, extending beyond individual engagement to include community and mixed-group interactions. This work provides a comprehensive framework to inform future HCI research and guide the development of inclusive technologies that better support DHH individuals’ diverse musical experiences.
Keywords: Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration); Universal & Inclusive Design; Music Composition & Sound Design Tools
ReachPad: Interacting with Multiple Virtual Screens using a Single Physical Pad through Haptic Retargeting
Han Shi et al. · Southern University of Science and Technology; Fudan University · CHI 2025
The advancement of Virtual Reality (VR) has expanded 2D user interfaces into 3D space. This change has introduced richer interaction modalities but also brought challenges, especially the lack of haptic feedback in mid-air interactions. Previous research has explored various methods to provide feedback for interface interactions, but most approaches require specialized haptic devices. We introduce haptic retargeting to enable users to control multiple virtual screens in VR using a simple flat pad, which serves as a single physical proxy supporting seamless interaction across all of the screens. We conducted user studies to explore the appropriate virtual screen size and positioning under our retargeting method and then compared various drag-and-drop methods for cross-screen interaction. Finally, we compared our method with controller-based interaction in application scenarios.
Keywords: In-Vehicle Haptic, Audio & Multimodal Feedback; Mixed Reality Workspaces; Immersion & Presence Research
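The abstract does not spell out the retargeting mapping, but body-warping haptic retargeting is commonly implemented by blending a world-space offset into the rendered hand as it approaches the physical proxy, so that the virtual hand lands on the virtual screen while the real hand lands on the pad. A minimal sketch under that assumption (function names, the linear blend, and the example coordinates are illustrative, not taken from the paper):

```python
# Hedged sketch of a common body-warping haptic-retargeting mapping,
# not necessarily ReachPad's exact technique.
import numpy as np

def retargeted_hand(hand_phys, start, target_phys, target_virt):
    """Return the rendered (virtual) hand position.

    hand_phys   : current physical hand position (3,)
    start       : position where the reach began (3,)
    target_phys : touch point on the physical pad (3,)
    target_virt : touch point on the virtual screen (3,)
    """
    total = np.linalg.norm(target_phys - start) + 1e-9
    # Reach progress: 0 at the start of the reach, 1 at contact with the pad.
    progress = np.clip(1.0 - np.linalg.norm(target_phys - hand_phys) / total, 0.0, 1.0)
    offset = target_virt - target_phys          # full offset applied at contact
    return hand_phys + progress * offset        # blend the offset in as the hand approaches

# Example: the rendered hand is steered toward a screen 10 cm to the left of the pad.
print(retargeted_hand(np.array([0.0, 0.0, 0.3]),
                      np.array([0.0, 0.0, 0.5]),
                      np.array([0.0, 0.0, 0.0]),
                      np.array([-0.1, 0.0, 0.0])))
```

With one such warp per virtual screen, the same flat pad can serve as the physical proxy for every screen, which is the core idea the paper's title describes.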
Speeding up Inference with User Simulators through Policy Modulation
Hee-Seung Moon et al. · Yonsei University · CHI 2022
The simulation of user behavior with deep reinforcement learning agents has shown some recent success. However, the inverse problem, that is, inferring the free parameters of the simulator from observed user behaviors, remains challenging to solve. This is because re-optimizing the simulated agent’s action policy, which is required whenever the model parameters change, is computationally impractical. In this study, we introduce a network modulation technique that obtains a generalized policy able to immediately adapt to the given model parameters. We further demonstrate that the proposed technique improves the efficiency of user-simulator-based inference by eliminating the need to obtain a new action policy for novel model parameters. We validated our approach using the latest user simulator for point-and-click behavior. As a result, we succeeded in inferring the user’s cognitive parameters and intrinsic reward settings with less than 1/1000 of the computation required by existing methods.
Keywords: Human-LLM Collaboration
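The abstract does not give the modulation architecture, but the general idea of a single generalized policy that adapts to given simulator parameters can be illustrated with feature-wise modulation (FiLM-style) conditioning. A minimal sketch under that assumption (layer sizes, the modulation scheme, and all names are illustrative, not the paper's model):

```python
# Hedged sketch: a policy network whose hidden features are scaled and shifted
# by the simulator's free parameters, so new parameter values need no retraining.
import torch
import torch.nn as nn

class ModulatedPolicy(nn.Module):
    def __init__(self, obs_dim, param_dim, act_dim, hidden=64):
        super().__init__()
        self.body = nn.Linear(obs_dim, hidden)
        # Maps simulator parameters (e.g., cognitive/reward settings) to
        # per-feature scale (gamma) and shift (beta) for the hidden layer.
        self.film = nn.Linear(param_dim, 2 * hidden)
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, obs, sim_params):
        h = torch.relu(self.body(obs))
        gamma, beta = self.film(sim_params).chunk(2, dim=-1)
        h = gamma * h + beta          # modulate features by the given parameters
        return self.head(h)           # action logits / means

# During inference, candidate parameter values can be scored against observed
# behavior directly, without optimizing a new policy for each candidate.
policy = ModulatedPolicy(obs_dim=8, param_dim=3, act_dim=2)
actions = policy(torch.randn(5, 8), torch.randn(5, 3))
print(actions.shape)  # torch.Size([5, 2])
```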
SGToolkit: An Interactive Gesture Authoring Toolkit for Embodied Conversational Agents
Youngwoo Yoon et al. · UIST 2021
Non-verbal behavior is essential for embodied agents like social robots, virtual avatars, and digital humans. Existing behavior authoring approaches, including keyframe animation and motion capture, are too expensive to use when numerous utterances require gestures. Automatic generation methods show promising results, but their output quality is not yet satisfactory, and their outputs are hard to modify to match a gesture designer’s intent. We introduce a new gesture generation toolkit, named SGToolkit, which gives higher-quality output than automatic methods and is more efficient than manual authoring. For the toolkit, we propose a neural generative model that synthesizes gestures from speech and accommodates fine-level pose controls and coarse-level style controls from users. A user study with 24 participants showed that the toolkit was preferred over manual authoring, and the generated gestures were human-like and appropriate to the input speech. SGToolkit is platform-agnostic, and the code is available at https://github.com/ai4r/SGToolkit.
Keywords: Agent Personality & Anthropomorphism; Human-Robot Collaboration (HRC)
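As a rough illustration of how a gesture generator can accept speech features together with the two kinds of controls mentioned above (fine-level pose constraints and coarse-level style), here is a hedged sketch; the actual SGToolkit model is in the linked repository, and the dimensions, masking scheme, and architecture below are assumptions:

```python
# Hedged sketch: per-frame speech features are concatenated with an optional
# masked pose control and a coarse style vector, then decoded into a pose sequence.
import torch
import torch.nn as nn

class ControllableGestureGenerator(nn.Module):
    def __init__(self, speech_dim=32, pose_dim=36, style_dim=3, hidden=128):
        super().__init__()
        # Per-frame input: speech features, a pose constraint with a mask flag,
        # and a coarse style vector (e.g., speed/amplitude).
        in_dim = speech_dim + pose_dim + 1 + style_dim
        self.rnn = nn.GRU(in_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, pose_dim)

    def forward(self, speech, pose_ctrl, pose_mask, style):
        # speech: (B, T, speech_dim); pose_ctrl: (B, T, pose_dim);
        # pose_mask: (B, T, 1), 1 where a pose control is given; style: (B, T, style_dim)
        x = torch.cat([speech, pose_ctrl * pose_mask, pose_mask, style], dim=-1)
        h, _ = self.rnn(x)
        return self.out(h)  # predicted pose sequence

gen = ControllableGestureGenerator()
B, T = 2, 40
poses = gen(torch.randn(B, T, 32), torch.zeros(B, T, 36),
            torch.zeros(B, T, 1), torch.zeros(B, T, 3))
print(poses.shape)  # torch.Size([2, 40, 36])
```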
A large, crowdsourced evaluation of gesture generation systems on common data: The GENEA Challenge 2020
Taras Kucherenko et al. · IUI 2021
Co-speech gestures, gestures that accompany speech, play an important role in human communication. Automatic co-speech gesture generation is thus a key enabling technology for embodied conversational agents (ECAs), since humans expect ECAs to be capable of multi-modal communication. Research into gesture generation is rapidly gravitating towards data-driven methods. Unfortunately, individual research efforts in the field are difficult to compare: there are no established benchmarks, and each study tends to use its own dataset, motion visualisation, and evaluation methodology. To address this situation, we launched a gesture-generation challenge in which participating teams built automatic gesture-generation systems on a common dataset, and the resulting systems were evaluated in parallel in a large, crowdsourced user study using the same motion-rendering pipeline. Since differences in evaluation outcomes are now attributable solely to differences between the motion-generation methods, the challenge enables benchmarking recent approaches against one another and gives a clearer picture of the state of the art in the field. This paper reports on the purpose, design, and results of our challenge.
Keywords: Full-Body Interaction & Embodied Input; Human Pose & Activity Recognition; Conversational Chatbots
DeepFisheye: Near-Surface Multi-Finger Tracking Technology Using Fisheye Camera
Keunwoo Park et al. · UIST 2020
Near-surface multi-finger tracking (NMFT) technology expands the input space of touchscreens by enabling novel interactions such as mid-air and finger-aware interactions. We present DeepFisheye, a practical NMFT solution for mobile devices that utilizes a fisheye camera attached at the bottom of a touchscreen. DeepFisheye acquires an image of the interacting hand above the touchscreen using the camera and employs deep learning to estimate the 3D position of each fingertip. We created two new hand pose datasets comprising fisheye images, on which our network was trained. We evaluated DeepFisheye’s performance for three device sizes; it showed average fingertip-tracking errors of approximately 20 mm across the different device sizes. Additionally, we created simple rule-based classifiers that estimate the contact finger and hand posture from DeepFisheye’s output. The contact finger and hand posture classifiers showed accuracies of approximately 83% and 90%, respectively, across the device sizes.
Keywords: Hand Gesture Recognition; Eye Tracking & Gaze Interaction; Knowledge Worker Tools & Workflows
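The abstract mentions simple rule-based classifiers over the estimated fingertip positions but not the rules themselves. A hedged sketch of what such rules could look like (the specific rules, thresholds, and example coordinates are illustrative assumptions, not the paper's implementation):

```python
# Hedged sketch: rule-based "contact finger" and "hand posture" classifiers
# operating on estimated 3D fingertip positions (x, y on the screen plane, z above it).
import numpy as np

FINGERS = ["thumb", "index", "middle", "ring", "pinky"]

def contact_finger(fingertips_mm, touch_xy_mm):
    """Pick the fingertip whose screen-plane position is closest to the reported touch point."""
    tips_xy = fingertips_mm[:, :2]
    dists = np.linalg.norm(tips_xy - np.asarray(touch_xy_mm, dtype=float), axis=1)
    return FINGERS[int(np.argmin(dists))]

def hand_posture(fingertips_mm, spread_threshold_mm=60.0):
    """Classify a coarse posture from how far the fingertips spread apart."""
    spread = np.linalg.norm(fingertips_mm.max(axis=0) - fingertips_mm.min(axis=0))
    return "open" if spread > spread_threshold_mm else "closed"

tips = np.array([[-40, 10, 25],   # thumb  (x, y, z) in mm
                 [  5, 60,  2],   # index, near the touch point and the surface
                 [ 25, 62, 18],
                 [ 45, 55, 22],
                 [ 60, 45, 28]], dtype=float)
print(contact_finger(tips, (4, 58)), hand_posture(tips))  # -> index open
```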
Whiskers: Exploring the Use of Ultrasonic Haptic Cues on the Face
Hyunjae Gil et al. · Ulsan National Institute of Science and Technology · CHI 2018
Haptic cues are a valuable feedback mechanism for smart glasses. Prior work has shown how they can support navigation, deliver notifications, and cue targets. However, a focus on actuation technologies such as mechanical tactors or fans has restricted the scope of research to a small number of cues presented at fixed locations. To move beyond this limitation, we explore perception of in-air ultrasonic haptic cues on the face. We present two studies examining the fundamental properties of localization, duration, and movement perception on three facial sites suitable for use with glasses: the cheek, the center of the forehead, and above the eyebrow. The center of the forehead led to optimal performance, with a localization error of 3.77 mm and accurate duration (80%) and movement perception (87%). We apply these findings in a study delivering eight different ultrasonic notifications and report mean recognition rates of up to 92.4% (peak: 98.6%). We close with design recommendations for ultrasonic haptic cues on the face.
Keywords: In-Vehicle Haptic, Audio & Multimodal Feedback; Mid-Air Haptics (Ultrasonic); Vibrotactile Feedback & Skin Stimulation
Mid-Air Haptics for Control Interfaces
Marcello Giordano et al. · Ultrahaptics · CHI 2018
Control interfaces and interactions based on touch-less gesture tracking devices have become a prevalent research topic in both industry and academia. Touch-less devices offer a unique interaction immediacy that makes them ideal for applications where direct contact with a physical controller is not desirable. On the other hand, these controllers inherently lack active or passive haptic feedback to inform users about the results of their interaction. Mid-air haptic interfaces, such as those using focused ultrasound waves, can close the feedback loop and provide new tools for the design of touch-less, un-instrumented control interactions. The goal of this workshop is to bring together the growing mid-air haptics research community to identify and discuss future challenges in control interfaces and their application in AR/VR, automotive, music, robotics, and teleoperation.
Keywords: Mid-Air Haptics (Ultrasonic)