MapStory: Prototyping Editable Map Animations with LLM Agents
We introduce MapStory, an LLM-powered animation prototyping tool that generates editable map animation sequences directly from natural language text by leveraging a dual-agent LLM architecture. Given a user-written script, MapStory automatically produces a scene breakdown, which decomposes the text into key map animation primitives such as camera movements, visual highlights, and animated elements. Our system includes a researcher agent that accurately queries geospatial information by combining an LLM with web search, enabling automatic extraction of relevant regions, paths, and coordinates, while allowing users to edit the results and query for changes or additional information to refine them. Additionally, users can fine-tune the parameters of these primitive blocks through an interactive timeline editor. We detail the system's design and architecture, informed by formative interviews with professional animators and by an analysis of 200 existing map animation videos. Our evaluation, which includes expert interviews (N=5) and a usability study (N=12), demonstrates that MapStory enables users to create map animations with ease, facilitates faster iteration, encourages creative exploration, and lowers barriers to creating map-centric stories.
2025 · Aditya Gunturu et al. · Geospatial & Map Visualization · Computational Methods in HCI · UIST

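The scene breakdown described above can be pictured as a timeline of typed primitive blocks. The sketch below is purely illustrative: the class and field names (`Primitive`, `Timeline`, `params`) are assumptions for exposition, not the paper's actual data model or API.

```python
from dataclasses import dataclass, field

@dataclass
class Primitive:
    """One map-animation building block on the timeline."""
    kind: str          # e.g. "camera_move", "highlight", "animated_path"
    start: float       # start time in seconds
    duration: float    # length in seconds
    params: dict = field(default_factory=dict)

@dataclass
class Timeline:
    blocks: list = field(default_factory=list)

    def add(self, block: Primitive) -> None:
        # Keep blocks ordered by start time for editing/playback.
        self.blocks.append(block)
        self.blocks.sort(key=lambda b: b.start)

    def total_duration(self) -> float:
        return max((b.start + b.duration for b in self.blocks), default=0.0)

# A two-block breakdown: fly the camera to Tokyo, then highlight it.
timeline = Timeline()
timeline.add(Primitive("camera_move", 0.0, 2.0,
                       {"to": (35.68, 139.69), "zoom": 8}))
timeline.add(Primitive("highlight", 2.0, 1.5, {"region": "Tokyo"}))
print(timeline.total_duration())  # 3.5
```

Editing a block in the timeline editor then amounts to adjusting `start`, `duration`, or `params` of one `Primitive` without regenerating the whole sequence.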
Vestibular Stimulation Enhances Hand Redirection
We demonstrate how the vestibular system (i.e., the sense of balance) influences the perception of hand position in VR. By exploiting this via galvanic vestibular stimulation (GVS), we can enhance the degree to which the user's hands can be redirected in VR without them noticing, i.e., raise the detection threshold of hand redirection. Our novel cross-modal illusion relies on the principle that a GVS-induced subtle body sway aligns with the user's expected body balance during hand redirection. This alignment reduces the sensory conflict between the expected and actual body balance, allowing for a larger hand redirection than would normally be noticed. In our user study, we validated that our approach raises the detection threshold of VR hand redirection by approximately 55% for outward and 45% for inward movements. With this increase, our approach broadens the applicability of hand redirection (e.g., compressing a VR space into an even smaller physical area).
2025 · Kensuke Katori et al. · Force Feedback & Pseudo-Haptic Weight · Shape-Changing Interfaces & Soft Robotic Materials · Full-Body Interaction & Embodied Input · UIST

Video2MR: Automatically Generating Mixed Reality 3D Instructions by Augmenting Extracted Motion from 2D Videos
This paper introduces Video2MR, a mixed reality system that automatically generates 3D sports and exercise instructions from 2D videos. Mixed reality instructions have great potential for physical training, but existing works require substantial time and cost to create these 3D experiences. Video2MR overcomes this limitation by transforming arbitrary instructional videos available online into MR 3D avatars with AI-enabled motion capture (DeepMotion). Then, it automatically enhances the avatar motion through the following augmentation techniques: 1) contrasting and highlighting differences between the user and avatar postures, 2) visualizing key trajectories and movements of specific body parts, 3) manipulating time and speed through body motion, and 4) spatially repositioning avatars for different perspectives. Developed on HoloLens 2 and Azure Kinect, we showcase various use cases, including yoga, dancing, soccer, tennis, and other physical exercises. The study results confirm that Video2MR provides more engaging and playful learning experiences compared to existing 2D video instructions.
2025 · Keiichi Ihara et al. · Full-Body Interaction & Embodied Input · Mixed Reality Workspaces · Biosensors & Physiological Monitoring · IUI

Understanding Usability of VR Pointing Methods with a Handheld-style HMD for Onsite Exhibitions
Handheld-style head-mounted displays (HMDs) are becoming increasingly popular as a convenient option for onsite exhibitions. However, they lack established practices for basic interactions, particularly pointing methods. Through our formative study involving practitioners, we discovered that controllers and hand gestures are the primary pointing methods being utilized. Building upon these findings, we conducted a usability study to explore seven different pointing methods, incorporating insights from the formative study and current virtual reality (VR) practices. The results showed that while controllers remain a viable option, hand gestures are not recommended. Notably, dwell-time-based methods, though slower and less commonly recognized by practitioners, demonstrated high usability and user confidence, particularly for inexperienced VR users. We recommend the use of dwell-based methods for onsite exhibition contexts. This research provides insights for the adoption of handheld-style HMDs, laying the groundwork for improving user interaction in exhibition environments and thereby potentially enhancing visitor experiences.
2025 · Yuki Abe et al. · Hokkaido University, Human-Computer Interaction Lab · Eye Tracking & Gaze Interaction · Social & Collaborative VR · Immersion & Presence Research · CHI

Understanding and Supporting Formal Email Exchange by Answering AI-Generated Questions
Replying to formal emails is time-consuming and cognitively demanding, as it requires crafting polite phrasing and providing an adequate response to the sender's demands. Although systems with Large Language Models (LLMs) have been designed to simplify the email replying process, users still need to provide detailed prompts to obtain the expected output. Therefore, we propose and evaluate an LLM-powered question-and-answer (QA)-based approach in which users reply to emails by answering a set of simple and short questions generated from the incoming email. We developed a prototype system, ResQ, and conducted controlled and field experiments with 12 and 8 participants, respectively. Our results demonstrate that the QA-based approach improves the efficiency of replying to emails and reduces workload while maintaining email quality, compared to a conventional prompt-based approach that requires users to craft appropriate prompts to obtain email drafts. We discuss how the QA-based approach influences the email reply process and interpersonal relationship dynamics, as well as the opportunities and challenges of using a QA-based approach in AI-mediated communication.
2025 · Yusuke Miura et al. · Waseda University · Human-LLM Collaboration · CHI

Exploring the Design of LLM-based Agent in Enhancing Self-disclosure Among the Older Adults
Social difficulties have become an increasingly serious issue among older adults. For older adults, regular self-disclosure is essential for maintaining mental health and building close relationships. Leveraging conversational agents to encourage self-disclosure in older adults has shown increasing potential. Understanding how LLM-based agents can influence and stimulate self-disclosure across different topics is crucial for designing future agents tailored to older users. This study introduces Disclosure-Agent, an LLM-based conversational agent, and examines its impact on self-disclosure in older adults through a user study involving 20 participants, 8 topics, and two interactive interfaces equipped with Disclosure-Agent. The findings provide valuable insights into how LLM-based agents can promote self-disclosure in older adults and offer design recommendations for future elderly-oriented conversational agents.
2025 · Yijie Guo et al. · Tsinghua University, Academy of Arts and Design; Tsinghua University, The Future Laboratory · Agent Personality & Anthropomorphism · Human-LLM Collaboration · CHI

"Closer than Real": How Social VR Platform Features Influence Friendship Dynamics
Social virtual reality (VR) platforms offer unique features that can foster interpersonal relationships that are "closer than real." This study investigates how these platform features influence friendship dynamics in social VR. Through semi-structured interviews with 23 Japanese VRChat users, we explored the characteristics of close relationships formed in social VR, the processes of relationship development, and the role of platform features in shaping these dynamics. Our findings reveal that social VR facilitates a form of selective self-presentation and co-presence through embodied avatars and rich environmental contexts, which can lead to rapid and intense friendship formation. Users reported developing close bonds without relying on real-life background information, instead focusing on perceived familiarity and compatibility within the virtual space, highlighted by the avatar's appearance. Further, platform features, such as the "join" function that allows users to teleport to friends' locations, were assigned special meanings by users, contributing to the development of friendships.
2025 · Misato Hide et al. · The University of Tokyo · Social & Collaborative VR · Immersion & Presence Research · Identity & Avatars in XR · CHI

FlexEar-Tips: Shape-Adjustable Ear Tips Using Pressure Control
We introduce FlexEar-Tips, a dynamic ear tip system designed for next-generation hearables. The ear tips are controlled by an air pump and solenoid valves, enabling size adjustments for comfort and functionality. FlexEar-Tips includes an air pressure sensor to monitor ear tip size, allowing it to adapt to environmental conditions and user needs. In our evaluation, we conducted a preliminary investigation of size control accuracy and of the minimum size variation haptically perceivable in the user's ear. We then evaluated users' ability to identify patterns in the haptic notification system, the impact on the music listening experience, the relationship between ear tip size and sound localization ability, and the reduction of in-ear humidity using a model. We propose new interaction modalities for adaptive hearables and discuss health monitoring, immersive auditory experiences, haptic notifications, biofeedback, and sensing.
2025 · Takashi Amesaka et al. · Keio University, Lifestyle Computing Lab · Haptic Wearables · Shape-Changing Interfaces & Soft Robotic Materials · CHI

PrISM-Observer: Intervention Agent to Help Users Perform Everyday Procedures Sensed using a Smartwatch
We routinely perform procedures (such as cooking) that consist of a set of atomic steps. Often, inadvertent omission or misordering of a single step can lead to serious consequences, especially for those experiencing cognitive challenges such as dementia. This paper introduces PrISM-Observer, a smartwatch-based, context-aware, real-time intervention system designed to support daily tasks by preventing errors. Unlike traditional systems that require users to seek out information, the agent observes user actions and intervenes proactively. This capability is enabled by the agent's ability to continuously update its belief about the user's behavior in real time through multimodal sensing and to forecast optimal intervention moments and methods. We first validated the step-tracking performance of our framework through evaluations across three datasets of differing complexity. Then, we implemented a real-time agent system using a smartwatch and conducted a user study in a cooking task scenario. The system generated helpful interventions, and we received positive feedback from the participants. The general applicability of PrISM-Observer to daily tasks promises broad applications, including, for instance, support for users requiring more involved interventions, such as people with dementia or post-surgical patients.
2024 · Riku Arakawa et al. · Fitness Tracking & Physical Activity Monitoring · Elderly Care & Dementia Support · Context-Aware Computing · UIST

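The "continuously update its belief about the user's behavior" idea can be illustrated with a standard hidden-Markov-style forward update over procedure steps. This is a minimal sketch under assumed numbers, not the paper's actual model: the transition matrix, likelihoods, and three-step procedure are invented for exposition.

```python
def normalize(p):
    """Rescale a list of non-negative weights to sum to 1."""
    s = sum(p)
    return [x / s for x in p]

def update_belief(belief, transition, likelihood):
    """One forward step: predict with the step-transition model,
    then reweight by how well each step explains the observation."""
    n = len(belief)
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    return normalize([predicted[j] * likelihood[j] for j in range(n)])

# 3-step procedure: from each step, mostly advance, sometimes linger.
T = [[0.3, 0.7, 0.0],
     [0.0, 0.3, 0.7],
     [0.0, 0.0, 1.0]]

belief = [1.0, 0.0, 0.0]                # start: certainly at step 0
# A sensor reading that looks most like step 1 (e.g. "stirring").
belief = update_belief(belief, T, likelihood=[0.1, 0.8, 0.1])
print(max(range(3), key=lambda j: belief[j]))  # most likely step: 1
```

With the belief tracked this way, an agent can time an intervention for when the probability mass indicates a step is about to be skipped.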
EarHover: Mid-Air Gesture Recognition for Hearables Using Sound Leakage Signals
We introduce EarHover, an innovative system that enables mid-air gesture input for hearables. Mid-air gesture input eliminates the need to touch the device, helping to keep both hands and device clean, and previous surveys indicate strong demand for it. However, existing mid-air gesture input methods for hearables have required adding cameras or infrared sensors. By focusing on the sound leakage phenomenon unique to hearables, we realized mid-air gesture recognition using a speaker and an external microphone, both highly compatible with hearables. The signal leaked outside the device can be measured by an external microphone, which detects differences in reflection characteristics caused by the hand's speed and shape during mid-air gestures. Among 27 types of gestures, we determined the seven most suitable gestures for EarHover in terms of signal discrimination and user acceptability. We then evaluated the gesture detection and classification performance of two prototype devices (in-ear type/open-ear type) for real-world application scenarios.
2024 · Shunta Suzuki et al. · In-Vehicle Haptic, Audio & Multimodal Feedback · Hand Gesture Recognition · UIST

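The core sensing idea, a known emitted signal reflecting off the hand and arriving at the external microphone with a delay, can be illustrated with cross-correlation on toy signals. This is only a conceptual sketch: the signals are synthetic, and EarHover's actual sweep design, feature extraction, and classifier are not shown here.

```python
def cross_correlation(emitted, recorded):
    """Correlate the known probe signal against every lag of the
    recording; the peak lag marks where the echo arrives."""
    lags = range(len(recorded) - len(emitted) + 1)
    return [sum(e * recorded[lag + i] for i, e in enumerate(emitted))
            for lag in lags]

emitted = [0.0, 1.0, -1.0, 0.5]  # known probe signal
# Synthetic recording: silence, then an attenuated echo at lag 6.
recorded = [0.0] * 6 + [0.5 * x for x in emitted] + [0.0] * 4

corr = cross_correlation(emitted, recorded)
echo_lag = max(range(len(corr)), key=lambda k: corr[k])
print(echo_lag)  # 6 -> the delay encodes hand distance
```

In a real pipeline, the echo's lag and amplitude would change as the hand moves, and those changes feed a gesture classifier.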
HIFU Embossment of Acrylic Sheets
Tactile interfaces such as embossment facilitate information transfer through touch in Human-Computer Interaction (HCI). Traditional embossing methods, while enabling the creation of intricate patterns, face limitations due to mold reliance and material thickness restrictions, hindering bespoke embossment creation. In this study, we propose High-Intensity Focused Ultrasound (HIFU) as an alternative technique to produce tailored embossed designs on acrylic without the need for traditional molds. We uncover specific HIFU parameters, such as amplitude, irradiation time, and distance, that directly impact essential qualities of the embossment, including height, transparency, and line generation. Additionally, the capability of embossing without molds expands the applications for quick prototyping and customization of embossed designs within HCI. Furthermore, we introduce a user interface, aimed at non-expert users, developed to streamline the design and application of customizable tactile graphics using HIFU. Preliminary user studies reveal positive feedback on the interface's intuitiveness and the quality of the HIFU embossment. Our study indicates that HIFU embossment presents a viable approach for creating embossed features in interactive systems, with the potential to offer methods for personal customization in the design of tactile materials.
2024 · Ayaka Tsutsui et al. · University of Tsukuba · Mid-Air Haptics (Ultrasonic) · Shape-Changing Interfaces & Soft Robotic Materials · CHI

Come Fly With Me - Investigating the Effects of Path Visualizations in Automated Urban Air Mobility
Automated Urban Air Mobility will enhance passenger transportation in metropolitan areas in the near future. Potential passengers, however, have little knowledge about this mobility form, which can raise safety concerns and lower trust. As trajectories are essential information for addressing these concerns, we evaluated seven path visualizations in an online video-based study (N=99). We found that a path line visualization was rated highest for trust and perceived safety. In a follow-up virtual reality study (N=24), we evaluated the effects of this visualization and of other air traffic flying by. We found that participants looked at the path line more often when other air traffic was present and that the path line increased trust and the predictability of the air taxi's future path.
2023 · Mark Colley et al. · AR Navigation & Context Awareness · Public Transit & Trip Planning · UbiComp · https://doi.org/10.1145/3596249

HoloBots: Augmenting Holographic Telepresence with Mobile Robots for Tangible Remote Collaboration in Mixed Reality
This paper introduces HoloBots, a mixed reality remote collaboration system that augments holographic telepresence with synchronized mobile robots. Beyond existing mixed reality telepresence, HoloBots lets remote users not only be visually and spatially present but also physically engage with local users and their environment. HoloBots allows users to touch, grasp, manipulate, and interact with the remote physical environment as if they were co-located in the same shared space. We achieve this by synchronizing holographic user motion (HoloLens 2 and Azure Kinect) with tabletop mobile robots (Sony Toio). Beyond existing physical telepresence, HoloBots contributes an exploration of a broader design space, such as object actuation, virtual hand physicalization, world-in-miniature exploration, shared tangible interfaces, embodied guidance, and haptic communication. We evaluate our system with twelve participants by comparing it with hologram-only and robot-only conditions. Both quantitative and qualitative results confirm that our system significantly enhances the level of co-presence and shared experience compared to the other conditions.
2023 · Keiichi Ihara et al. · Teleoperated Driving · Mixed Reality Workspaces · Teleoperation & Telepresence · UIST

User Authentication Method for Hearables Using Sound Leakage Signals
We propose a novel biometric authentication method that leverages sound leakage signals from hearables, captured by an external microphone. A sweep signal is played from the hearables, and the sound leakage is recorded using an external microphone. This sound leakage signal represents the acoustic characteristics of the ear canal, auricle, or hand. Our system then analyzes the echoes and authenticates the user. The proposed method is highly adaptable to hearables because it leverages widely available sensors, such as speakers and external microphones. In addition, the proposed method has the potential to be used in combination with existing methods. In this study, we investigate the characteristics of sound leakage signals using an experimental model and measure the authentication performance of our method using acoustic data from 16 people. The results show that the balanced accuracy (BAC) scores were in the range of 87.0%-96.7% across several scenarios.
2023 · Takashi Amesaka et al. · Passwords & Authentication · UbiComp

Affective Profile Pictures: Exploring the Effects of Changing Facial Expressions in Profile Pictures on Text-Based Communication
When receiving text messages from unacquainted colleagues in fully remote workplaces, insufficient mutual understanding and limited social cues can lead people to misinterpret the tone of the message and further influence their impression of remote colleagues. Emojis have been commonly used for supporting expressive communication; however, people seldom use emojis before they become acquainted with each other. Hence, we explored how changing facial expressions in profile pictures could be an alternative channel to communicate socio-emotional cues. By conducting an online controlled experiment with 186 participants, we established that changing facial expressions of profile pictures can influence the impression of the message receivers toward the sender and the message valence when receiving neutral messages. Furthermore, presenting incongruent profile pictures with positive messages negatively affected the interpretation of the message valence, but did not have much effect on negative messages. We discuss the implications of affective profile pictures in supporting text-based communication.
2023 · Chi-Lan Yang et al. · The University of Tokyo · Voice User Interface (VUI) Design · Online Identity & Self-Presentation · CHI

Photographic Lighting Design with Photographer-in-the-Loop Bayesian Optimization
It is important for photographers to have the best possible lighting configuration at the time of shooting; otherwise, they need to post-process the images, which may introduce artifacts and degrade quality. Thus, photographers often struggle to find the best possible lighting configuration by manipulating lighting devices, including light sources and modifiers, in a trial-and-error manner. In this paper, we propose a novel computational framework to support photographers. This framework assumes that every lighting device is programmable; that is, its adjustable parameters (e.g., orientation, intensity, and color temperature) can be set programmatically. Using our framework, photographers do not need to learn how the parameter values affect the resulting lighting, or even to determine the strategy of the trial-and-error process; instead, they need only concentrate on evaluating which lighting configuration is more desirable among the options suggested by the system. The framework is enabled by our novel photographer-in-the-loop Bayesian optimization, which is sample-efficient (i.e., the number of required evaluation steps is small) and which can also be guided by a rough painting of the desired lighting configuration, if available. We demonstrate how the framework works in both simulated virtual environments and a physical environment, suggesting that it can find pleasing lighting configurations quickly, in around 10 iterations. Our user study suggests that the framework enables photographers to concentrate on the look of captured images rather than on parameters, compared with the traditional manual lighting workflow.
2022 · Yuki Koyama et al. · Generative AI (Text, Image, Music, Video) · Photography & Image Processing · UIST

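The human-in-the-loop idea, where the system proposes candidate configurations and the photographer only judges which looks better, can be sketched with a deliberately simplified stand-in: a 1D pairwise-comparison (ternary) search over a single lighting parameter. This is not the paper's Bayesian optimization; the "photographer" is simulated by a hidden preference function, and the `ideal` value is invented for the demo.

```python
def prefers(a, b, ideal=0.62):
    """Simulated photographer: picks whichever candidate intensity
    looks closer to an ideal that is unknown to the optimizer."""
    return abs(a - ideal) < abs(b - ideal)

def pairwise_search(lo=0.0, hi=1.0, steps=10):
    """Each step shows the 'photographer' two candidate settings and
    discards the third of the range behind the rejected one."""
    for _ in range(steps):           # one A/B judgment per step
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if prefers(a, b):
            hi = b
        else:
            lo = a
    return (lo + hi) / 2

best = pairwise_search()
print(best)  # close to the hidden ideal after only 10 comparisons
```

The real system replaces this 1D comparison search with Bayesian optimization over many parameters, but the interaction loop is the same: the human supplies only relative judgments, never parameter values.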
VocabEncounter: NMT-powered Vocabulary Learning by Presenting Computer-Generated Usages of Foreign Words into Users' Daily Lives
We demonstrate that recent natural language processing (NLP) techniques introduce a new paradigm of vocabulary learning that benefits from both micro learning and usage-based learning by generating and presenting usages of foreign words based on the learner's context. Without allocating dedicated time for studying, the user can become familiarized with how the words are used by seeing example usages during daily activities, such as Web browsing. To achieve this, we introduce VocabEncounter, a vocabulary-learning system that suitably encapsulates the given words into materials the user is reading in near real time by leveraging recent NLP techniques. After confirming the system's human-comparable quality in generating translated phrases through a study with crowdworkers, we conducted a series of user studies, which demonstrated the system's effectiveness for vocabulary learning and the favorable experiences it affords. Our work shows how NLP-based generation techniques can transform our daily activities into a field for vocabulary learning.
2022 · Riku Arakawa et al. · Carnegie Mellon University · Generative AI (Text, Image, Music, Video) · Human-LLM Collaboration · Programming Education & Computational Thinking · CHI

Reaction or Speculation: Building Computational Support for Users in Catching-Up Series Based on an Emerging Media Consumption Phenomenon
A growing number of people are using catch-up TV services rather than watching simultaneously with other audience members at the time of broadcast. However, computational support for such catching-up users has not been well explored. In particular, we observe an emerging phenomenon in online media consumption in which speculation plays a vital role. Because speculation implicitly assumes simultaneity in media consumption, there is a gap for catching-up users, who cannot directly take part in these consumption experiences. This conversely suggests the potential for computational support to enhance the consumption experiences of catching-up users. Accordingly, we conducted a series of studies to pave the way for developing such support. First, we conducted semi-structured interviews to understand how people engage with speculation during media consumption. As a result, we discovered distinctive aspects of speculation-based consumption experiences, in contrast to previously discussed social viewing experiences built around sharing immediate reactions. We then designed two prototypes for supporting catching-up users based on our quantitative analysis of Twitter data regarding reaction- and speculation-based media consumption. Lastly, we evaluated them in a user study and, based on the results, discussed ways to empower catching-up users with computational support in response to recent transformations in media consumption.
2021 · Riku Arakawa et al. · User Experiences · CSCW

Task Assignment Strategies for Crowd Worker Ability Improvement
Workers are the most important resource in crowdsourcing. However, investing only in worker-centric needs, such as skill improvement, often conflicts with short-term platform-centric needs, such as task throughput. This paper studies learning strategies in task assignment in crowdsourcing and their impact on platform-centric needs. We formalize the learning potential of individual tasks and collaborative tasks, and devise an iterative task assignment and completion approach that implements strategies grounded in learning theories. We conduct experiments to compare several learning strategies in terms of skill improvement, task throughput, and contribution quality. We discuss how our findings open new research directions in learning and collaboration.
2021 · Masaki Matsubara et al. · Crowds and Collaboration · CSCW

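The tension between learning potential and throughput described above can be illustrated with a toy greedy assignment. Everything here is an assumption for exposition (the scoring functions, the `alpha` trade-off weight, and the skill/difficulty numbers), not the paper's formalization.

```python
def assign(workers, tasks, alpha=0.5):
    """workers: {name: skill in [0,1]}, tasks: {name: difficulty in [0,1]}.
    Toy model: learning potential peaks when difficulty slightly exceeds
    skill; expected quality is high when skill covers the difficulty.
    alpha weights worker growth against platform throughput/quality."""
    assignment = {}
    free_tasks = set(tasks)
    for worker, skill in workers.items():
        def score(t):
            d = tasks[t]
            learning = max(0.0, 1.0 - abs(d - (skill + 0.1)) * 5)
            quality = max(0.0, 1.0 - max(0.0, d - skill) * 2)
            return alpha * learning + (1 - alpha) * quality
        best = max(free_tasks, key=score)
        assignment[worker] = best
        free_tasks.remove(best)
    return assignment

print(assign({"ann": 0.4, "bob": 0.8}, {"easy": 0.3, "hard": 0.9}))
```

Sweeping `alpha` from 0 to 1 shifts assignments from pure quality maximization toward stretch tasks that build worker skill, which is the trade-off the paper's strategies navigate with grounded learning theories.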
Exploring Text Revision with Backspace and Caret in Virtual Reality
Current VR systems provide various text input methods that enable users to enter text efficiently with virtual keyboards. However, little attention has been paid to facilitating text revision during the VR text input process. We first summarized existing text revision solutions in current VR text input research and found that backspace is the only tool available for text revision with virtual keyboards, with few works addressing designs for caret control. To systematically explore VR text revision designs, we present a design space for VR text revision based on backspace and caret. With the proposed design space, we further analyzed the feasibility of the combined usage of backspace and caret by proposing and evaluating four VR text revision techniques. The outcomes of this research provide a fundamental understanding of VR text revision solutions (with backspace and caret) and a comparable basis for evaluating future VR text revision techniques.
2021 · Yang Li et al. · Kochi University of Technology · Social & Collaborative VR · CHI