IMUCoCo: Enabling Flexible On-Body IMU Placement for Human Pose Estimation and Activity Recognition
IMUs are regularly used to sense human motion, recognize activities, and estimate full-body pose. Users are typically required to place sensors in predefined locations, often dictated by common wearable form factors and the machine learning model's training process. Consequently, despite the increasing number of everyday devices equipped with IMUs, this limited adaptability has seriously constrained the user experience to a few well-explored device placements (e.g., wrist and ears). In this paper, we rethink IMU-based motion sensing by acknowledging that signals can be captured from any point on the human body. We introduce IMU over Continuous Coordinates (IMUCoCo), a novel framework that maps signals from a variable number of IMUs placed on the body surface into a unified feature space based on their spatial coordinates. These features can be plugged into downstream models for pose estimation and activity recognition. Our evaluations demonstrate that IMUCoCo supports accurate pose estimation across a wide range of typical and atypical sensor placements. Overall, IMUCoCo supports significantly more flexible use of IMUs for motion sensing than the state of the art, allowing users to place their sensor-laden devices according to their needs and preferences. The framework also allows users to change device locations depending on the context and can suggest placements for a given use case.
Haozhe Zhou et al. UIST 2025. Topics: Human Pose & Activity Recognition.

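To make the coordinate-conditioned mapping concrete, here is a minimal, hypothetical sketch (not the authors' code): each IMU's window of readings is encoded together with its body-surface coordinate by a shared encoder, and the per-sensor features are pooled into a fixed-size vector regardless of how many sensors are worn. The encoder, dimensions, and toy coordinates are illustrative assumptions.

```python
# Hypothetical sketch, not the IMUCoCo implementation: map a variable number of
# IMU streams, each tagged with a body-surface coordinate, into one pooled feature.
import numpy as np

rng = np.random.default_rng(0)

def encode_imu(signal_window, body_coord, w, b):
    """Encode one IMU's window (T x 6 accel+gyro) conditioned on its (u, v)
    body-surface coordinate; a single linear layer + ReLU stands in for the
    learned encoder, and the weights are shared by all IMUs."""
    feats = np.concatenate([signal_window.mean(axis=0),
                            signal_window.std(axis=0),
                            body_coord])                 # 6 + 6 + 2 = 14 dims
    return np.maximum(w @ feats + b, 0.0)

# Toy setup: three IMUs at arbitrary body coordinates, 50-sample windows.
w, b = rng.normal(size=(32, 14)), np.zeros(32)
imus = [(rng.normal(size=(50, 6)), np.array([0.1, 0.8])),   # e.g., wrist
        (rng.normal(size=(50, 6)), np.array([0.5, 0.2])),   # e.g., chest
        (rng.normal(size=(50, 6)), np.array([0.9, 0.6]))]   # e.g., ankle

# Pool over however many IMUs are present -> fixed-size feature for downstream
# pose-estimation or activity-recognition heads.
features = np.stack([encode_imu(sig, xy, w, b) for sig, xy in imus])
pooled = features.max(axis=0)
print(pooled.shape)   # (32,), independent of the number of IMUs worn
```

The pooling step is what would let downstream pose or activity heads stay agnostic to the number and placement of devices.
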
Scaling Context-Aware Task Assistants that Learn from Demonstration and Adapt through Mixed-Initiative Dialogue
Daily tasks such as cooking, machine operation, and medical self-care often require context-aware assistance, yet existing systems are hard to scale due to high training costs and unpredictable, imperfect performance. This work introduces the PrISM framework, which streamlines the process of creating an assistant for users' own tasks through demonstration and dialogue. First, our tracking algorithm learns sensor representations for the steps of a procedure from a single demonstration. Second, and critically, to tackle the challenges of sensing imperfections and unpredictable user behaviors, we implement a dialogue-based context adaptation mechanism. The dialogue refines the system's understanding in real time, thereby reducing errors such as inappropriate responses to user queries. Evaluated through multiple studies involving several everyday tasks, our approach demonstrates improved step-tracking accuracy, enhanced user interaction, and an improved sense of collaboration. These results promise a scalable, multimodal, context-aware assistant that effectively bridges the gap between human guidance and adaptive support in diverse real-world applications.
Riku Arakawa et al. UIST 2025. Topics: Voice User Interface (VUI) Design; Context-Aware Computing; Ubiquitous Computing.

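As a rough illustration of learning step representations from a single demonstration and correcting them through dialogue, the following hypothetical sketch (not the PrISM implementation) builds one template per step from the demonstration, tracks steps by nearest template at run time, and lets a user utterance override the tracker; the step names, feature sizes, and helper functions are invented for the example.

```python
# Hypothetical sketch, not the PrISM implementation: one-shot step templates,
# nearest-template tracking, and a dialogue-style override.
import numpy as np

def build_templates(demo_windows, demo_step_labels):
    """Average the sensor features recorded for each step during the single
    demonstration to obtain one template vector per step."""
    steps = sorted(set(demo_step_labels))
    return {s: np.mean([w for w, l in zip(demo_windows, demo_step_labels) if l == s],
                       axis=0)
            for s in steps}

def track(window, templates, dialogue_override=None):
    """Pick the nearest template; a user's spoken correction wins outright."""
    if dialogue_override is not None:     # e.g., the user said "I'm pouring now"
        return dialogue_override
    return min(templates, key=lambda s: np.linalg.norm(window - templates[s]))

# Toy demonstration with two steps and 4-dimensional sensor features.
rng = np.random.default_rng(1)
demo = [rng.normal(0, 1, 4), rng.normal(0, 1, 4), rng.normal(5, 1, 4), rng.normal(5, 1, 4)]
labels = ["grind beans", "grind beans", "pour water", "pour water"]
templates = build_templates(demo, labels)

print(track(rng.normal(5, 1, 4), templates))                  # sensed step
print(track(rng.normal(5, 1, 4), templates, "grind beans"))   # user correction wins
```
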
UbiLearn: Supporting English-as-a-Foreign-Language Learners in Reflecting on Conversations Using a Smartwatch
What new opportunities can current ubiquitous computing and AI technologies provide to support English-as-a-Foreign-Language (EFL) learners? To answer this question, we began with a formative study with EFL learners, uncovering multiple challenges they face during conversations with others and their desire to review such scenes later. We implemented a smartwatch prototype, UbiLearn, which features hand-gesture recognition for in-situ, multi-context annotation to save moments when learners face difficulty. The annotations are used to generate personalized educational material powered by speech and natural language processing. Through a series of studies, we demonstrated the feasibility and favorable usability of UbiLearn, leading to enhanced learning satisfaction. Moreover, the annotation data strengthened the role of instructors by enabling them to track learners' in-situ proficiency outside tutoring sessions. We conclude by highlighting emerging opportunities for learners enabled by mobile and AI technologies, along with key considerations.
Riku Arakawa et al. MobileHCI 2025. Topics: Intelligent Voice Assistants (Alexa, Siri, etc.); Collaborative Learning & Peer Teaching; Context-Aware Computing.

PrISM-Observer: Intervention Agent to Help Users Perform Everyday Procedures Sensed using a Smartwatch
We routinely perform procedures (such as cooking) that consist of a set of atomic steps. Often, inadvertently omitting or misordering a single step can lead to serious consequences, especially for people experiencing cognitive challenges such as dementia. This paper introduces PrISM-Observer, a smartwatch-based, context-aware, real-time intervention system designed to support daily tasks by preventing errors. Unlike traditional systems that require users to seek out information, the agent observes user actions and intervenes proactively. This capability is enabled by the agent's ability to continuously update its belief about the user's behavior in real time through multimodal sensing and to forecast optimal intervention moments and methods. We first validated the step-tracking performance of our framework through evaluations on three datasets of differing complexity. Then, we implemented a real-time agent system using a smartwatch and conducted a user study in a cooking-task scenario. The system generated helpful interventions, and we received positive feedback from participants. The general applicability of PrISM-Observer to daily tasks promises broad applications, including, for instance, support for users requiring more involved interventions, such as people with dementia or post-surgical patients.
Riku Arakawa et al. UIST 2024. Topics: Fitness Tracking & Physical Activity Monitoring; Elderly Care & Dementia Support; Context-Aware Computing.

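The belief-updating idea can be pictured with a small, hypothetical Bayes-filter sketch (not the PrISM-Observer implementation): a distribution over procedure steps is predicted forward with an assumed transition model, corrected with noisy HAR observation likelihoods, and a simple rule triggers an intervention when a prerequisite step appears to have been skipped. The step names, probabilities, and intervention rule are illustrative assumptions.

```python
# Hypothetical sketch, not the PrISM-Observer implementation: a Bayes filter
# over procedure steps plus a simple rule for when to intervene.
import numpy as np

steps = ["boil water", "add grounds", "pour water", "serve"]
T = np.array([[0.7, 0.3, 0.0, 0.0],   # assumed step-transition prior
              [0.0, 0.7, 0.3, 0.0],
              [0.0, 0.0, 0.7, 0.3],
              [0.0, 0.0, 0.0, 1.0]])

def update(belief, obs_likelihood):
    """Predict with the transition prior, correct with the HAR model's
    per-step observation likelihoods, then renormalize."""
    posterior = (T.T @ belief) * obs_likelihood
    return posterior / posterior.sum()

belief = np.array([1.0, 0.0, 0.0, 0.0])
completed = set()
# Simulated noisy HAR outputs suggesting the user jumped ahead to "pour water".
for obs in ([0.6, 0.3, 0.05, 0.05], [0.1, 0.2, 0.6, 0.1], [0.05, 0.1, 0.8, 0.05]):
    belief = update(belief, np.array(obs))
    current = int(belief.argmax())
    if belief[current] > 0.5:                   # confident about the current step
        skipped = [s for s in range(current) if s not in completed]
        if skipped:
            print(f"intervene: remind the user about '{steps[skipped[0]]}'")
        completed.add(current)
```
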
Coaching Copilot: Blended Form of an LLM-Powered Chatbot and a Human Coach to Effectively Support Self-Reflection for Leadership Growth
Chatbots' role in fostering self-reflection is now widely recognized, especially for inducing behavior change. While the benefits of 24/7 availability, scalability, and consistent responses have been demonstrated in contexts such as healthcare and tutoring to help people form new habits, their use in coaching, which requires deeper introspective dialogue to foster leadership growth, remains unexplored. This paper explores the potential of such a chatbot, powered by recent Large Language Models (LLMs), in collaboration with professional coaches in the field of executive coaching. Through a design workshop with coaches and a two-week user study involving ten coach-client pairs, we explored the feasibility and nuances of integrating chatbots to complement human coaches. Our findings highlight the benefits of chatbots' ubiquity and LLM-enabled reasoning capabilities while identifying their limitations and the design requirements for effective collaboration between human coaches and chatbots. In doing so, this work contributes to the foundation for augmenting one's self-reflective process with prevalent conversational agents through a human-in-the-loop approach.
Riku Arakawa et al. CUI 2024. Topics: Conversational Chatbots; Human-LLM Collaboration.

MI-Poser: Human Body Pose Tracking Using Magnetic and Inertial Sensor Fusion with Metal Interference Mitigation
Inside-out tracking of human body poses using wearable sensors holds significant potential for AR/VR applications, such as remote communication through 3D avatars with expressive body language. Current inside-out systems often rely on vision-based methods with handheld controllers or on densely distributed body-worn IMU sensors. The former limits hands-free and occlusion-robust interaction, while the latter suffers from inadequate accuracy and jitter. We introduce a novel body tracking system, MI-Poser, which employs AR glasses and two wrist-worn electromagnetic field (EMF) sensors to achieve high-fidelity upper-body pose estimation while mitigating metal interference. Our lightweight system achieves low error (6.6 cm mean joint position error) on real-world data collected from 10 participants. It remains robust to various upper-body movements and runs efficiently at 60 Hz. Furthermore, by incorporating an IMU sensor co-located with the EMF sensor, MI-Poser counteracts the effects of metal interference, which inherently disrupts the EMF signal during tracking. Our evaluation demonstrates successful detection and correction of interference using this EMF-IMU fusion approach across environments with diverse metal profiles. Ultimately, MI-Poser offers a practical pose tracking system, particularly suited for body-centric AR applications.
Riku Arakawa et al. UbiComp 2023. Topics: Human Pose & Activity Recognition. DOI: https://doi.org/10.1145/3610891

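One way to picture the EMF-IMU fusion is the following hypothetical sketch (not MI-Poser's algorithm): the IMU dead-reckons from the last trusted position, and an EMF reading that strays too far from that prediction is flagged as metal interference and replaced by the IMU prediction. The threshold, data layout, and toy trajectory are assumptions made for illustration.

```python
# Hypothetical sketch, not MI-Poser's algorithm: flag metal interference when the
# EMF reading strays from the position the co-located IMU predicts.
import numpy as np

def fuse(emf_positions, imu_displacements, threshold=0.05):
    """emf_positions: (T, 3) wrist positions from the EMF tracker (meters);
    imu_displacements: (T, 3) per-frame displacements implied by the IMU."""
    fused, flags = [emf_positions[0]], [False]
    for t in range(1, len(emf_positions)):
        predicted = fused[-1] + imu_displacements[t]        # IMU dead reckoning
        interfered = np.linalg.norm(emf_positions[t] - predicted) > threshold
        flags.append(interfered)
        fused.append(predicted if interfered else emf_positions[t])
    return np.array(flags), np.stack(fused)

# Toy data: steady wrist motion, with the EMF corrupted on frames 5-8.
imu_disp = np.tile([[0.01, 0.0, 0.0]], (12, 1))
emf = np.cumsum(imu_disp, axis=0)
emf[5:9] += 0.2                                             # simulated interference
flags, corrected = fuse(emf, imu_disp)
print(flags.astype(int))   # frames 5-8 are flagged; elsewhere the EMF is trusted
```
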
LemurDx: Using Unconstrained Passive Sensing for an Objective Measurement of Hyperactivity in Children with no Parent Input
Hyperactivity is the most dominant presentation of Attention-Deficit/Hyperactivity Disorder in young children. Currently, measuring hyperactivity relies on parents' or teachers' reports, which are vulnerable to subjectivity and can lead to misdiagnosis. LemurDx provides an objective measure of hyperactivity using passive mobile sensing. We collected data from 61 children (25 with hyperactivity) who wore a smartwatch for up to 7 days without changing their daily routine. The participants' parents maintained a log of each child's activities at half-hour granularity (e.g., sitting, exercising) as contextual information. Our ML models achieved 85.2% accuracy in detecting hyperactivity when using the parent-provided activity labels. We also built models that estimate children's context from the sensor data and do not rely on activity labels, reducing parent burden; these models achieved 82.0% accuracy. In addition, we interviewed five clinicians, who suggested a need for a tractable risk score that enables analysis of a child's behavior across contexts. Our results show the feasibility of supporting the diagnosis of hyperactivity by providing clinicians with an interpretable and objective hyperactivity score using off-the-shelf watches, while adding no constraints on children or their guardians.
Riku Arakawa et al. UbiComp 2023. Topics: Cognitive Impairment & Neurodiversity (Autism, ADHD, Dyslexia); Biosensors & Physiological Monitoring. DOI: https://dl.acm.org/doi/10.1145/3596244

PrISM-Tracker: A Framework for Multimodal Procedure Tracking Using Wearable Sensors and State Transition Information with User-Driven Handling of Errors and Uncertainty
A user often needs training and guidance while performing daily-life procedures, e.g., cooking, setting up a new appliance, or doing a COVID test. Watch-based human activity recognition (HAR) can track users' actions during these procedures. However, out of the box, state-of-the-art HAR struggles with noisy data and the less-expressive actions that are often part of daily-life tasks. This paper proposes PrISM-Tracker, a procedure-tracking framework that augments existing HAR models with (1) a graph-based procedure representation and (2) a user-interaction module to handle model uncertainty. Specifically, PrISM-Tracker extends the Viterbi algorithm to update state probabilities from time-series HAR outputs, leveraging a graph representation that embeds timing information as a prior. Moreover, the model identifies moments or classes of uncertainty and asks the user for guidance to improve tracking accuracy. We tested PrISM-Tracker on two procedures: latte-making in an engineering-lab study and wound care for skin cancer patients at a clinic. The results show the effectiveness of the proposed transition-graph algorithm in tracking steps and the efficacy of simulated human input in enhancing performance. This work is a first step toward human-in-the-loop intelligent systems for guiding users through new and complicated procedural tasks.
Riku Arakawa et al. UbiComp 2023. Topics: Human Pose & Activity Recognition; Biosensors & Physiological Monitoring. DOI: https://dl.acm.org/doi/10.1145/3569504

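Since the abstract describes extending the Viterbi algorithm with a graph-derived transition prior, a compact, hypothetical sketch (not the released PrISM-Tracker code) may help: frame-wise HAR probabilities are decoded against a step-transition matrix, so a single noisy frame does not pull the tracked step off the procedure graph. The toy steps, probabilities, and matrix values are invented for the example.

```python
# Hypothetical sketch, not the released PrISM-Tracker code: Viterbi decoding of
# procedure steps from frame-wise HAR probabilities with a graph-based prior.
import numpy as np

def viterbi(har_probs, transitions, initial):
    """har_probs: (T, S) per-frame step probabilities from the HAR model;
    transitions: (S, S) step-transition prior; initial: (S,) start distribution."""
    n_frames, n_steps = har_probs.shape
    log_delta = np.log(initial + 1e-12) + np.log(har_probs[0] + 1e-12)
    back = np.zeros((n_frames, n_steps), dtype=int)
    for t in range(1, n_frames):
        scores = log_delta[:, None] + np.log(transitions + 1e-12)
        back[t] = scores.argmax(axis=0)
        log_delta = scores.max(axis=0) + np.log(har_probs[t] + 1e-12)
    path = [int(log_delta.argmax())]
    for t in range(n_frames - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy example: three steps that mostly proceed in order, and one noisy HAR frame.
transitions = np.array([[0.8, 0.2, 0.0],
                        [0.0, 0.9, 0.1],
                        [0.0, 0.0, 1.0]])
har = np.array([[0.7, 0.2, 0.1],
                [0.4, 0.5, 0.1],
                [0.1, 0.3, 0.6],   # noisy frame that looks like the final step
                [0.2, 0.6, 0.2],
                [0.1, 0.2, 0.7]])
print(viterbi(har, transitions, np.array([1.0, 0.0, 0.0])))  # [0, 1, 1, 1, 1]
```

Here the transition prior keeps the decoded path on the second step despite the misleading middle frame, which is the kind of smoothing a graph representation of the procedure provides.
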
uKnit: A Position-aware Reconfigurable Machine-knitted Wearable for Gestural Interaction and Passive Sensing using Electrical Impedance Tomography
A scarf is inherently reconfigurable: wearers often use it as a neck wrap, a shawl, a headband, a wristband, and more. We developed uKnit, a scarf-like soft sensor that retains a scarf's reconfigurability, built with machine knitting and electrical impedance tomography sensing. Soft wearable devices are comfortable and thus attractive for many human-computer interaction scenarios. While prior work has demonstrated various soft wearable capabilities, each capability is device- and location-specific and cannot meet users' varied needs with a single device. In contrast, uKnit explores the possibility of one-soft-wearable-for-all. We describe the fabrication and sensing principles behind uKnit, demonstrate several example applications, and evaluate it with 10-participant user studies and a washability test. uKnit achieves 88.0%/78.2% accuracy for 5-class worn-location detection and 80.4%/75.4% accuracy for 7-class gesture recognition with a per-user/universal model. Moreover, it estimates respiratory rate with an error of 1.25 bpm and detects binary sitting postures with an average accuracy of 86.2%.
Tianhong Catherine Yu et al. (Carnegie Mellon University). CHI 2023. Topics: Electrical Muscle Stimulation (EMS); Haptic Wearables; Human Pose & Activity Recognition.

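For intuition on how worn-location detection from EIT frames might look, here is a deliberately simple, hypothetical sketch (not uKnit's actual pipeline, which uses richer models): per-user calibration frames are averaged into one centroid per worn location, and a new frame is assigned to the nearest centroid. The location names, frame dimensionality, and toy data are assumptions.

```python
# Hypothetical sketch, not uKnit's pipeline: nearest-centroid classification of
# flattened EIT measurement frames into worn locations, calibrated per user.
import numpy as np

rng = np.random.default_rng(2)
LOCATIONS = ["neck wrap", "shawl", "headband", "wristband", "waistband"]

def fit_centroids(frames, labels):
    """frames: (N, D) flattened EIT boundary-voltage measurements."""
    return {loc: frames[labels == loc].mean(axis=0) for loc in LOCATIONS}

def classify(frame, centroids):
    return min(centroids, key=lambda loc: np.linalg.norm(frame - centroids[loc]))

# Toy per-user calibration data: 20 frames per worn location, 104 values each.
frames = np.concatenate([rng.normal(i, 0.5, size=(20, 104)) for i in range(len(LOCATIONS))])
labels = np.repeat(LOCATIONS, 20)
centroids = fit_centroids(frames, labels)
print(classify(rng.normal(3, 0.5, 104), centroids))   # -> "wristband"
```
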
CatAlyst: Domain-Extensible Intervention for Preventing Task Procrastination Using Large Generative Models
CatAlyst uses generative models to help workers make progress by influencing their task engagement rather than directly contributing to their task outputs. It prompts distracted workers to resume their tasks by generating a continuation of their work and presenting it as an intervention that is more context-aware than conventional (predetermined) feedback. The prompt works by drawing workers' interest and lowering the hurdle to resumption even when the generated continuation is not good enough to substitute for their work, whereas recent human-AI collaboration research aimed at work substitution depends on stably high accuracy. This frees CatAlyst from domain-specific model tuning and makes it applicable to a variety of tasks. Our studies involving writing and slide-editing tasks demonstrated CatAlyst's effectiveness in helping workers swiftly resume tasks with a lowered cognitive load. The results suggest a new form of human-AI collaboration in which large generative models that are publicly available but imperfect for each individual domain can contribute to workers' digital well-being.
Riku Arakawa et al. (Carnegie Mellon University). CHI 2023. Topics: Human-LLM Collaboration; Notification & Interruption Management; Workplace Wellbeing & Work Stress.

IMUPoser: Full-Body Pose Estimation using IMUs in Phones, Watches, and Earbuds
Tracking body pose on the go could have powerful uses in fitness, mobile gaming, context-aware virtual assistants, and rehabilitation. However, users are unlikely to buy and wear special suits or sensor arrays to achieve this end. Instead, in this work, we explore the feasibility of estimating body pose using IMUs already present in devices that many users own, namely smartphones, smartwatches, and earbuds. This approach has several challenges, including noisy data from low-cost commodity IMUs and the fact that the number of instrumentation points on a user's body is both sparse and in flux. Our pipeline receives whatever subset of IMU data is available, potentially from just a single device, and produces a best-guess pose. To evaluate our model, we created the IMUPoser Dataset, collected from 10 participants wearing or holding off-the-shelf consumer devices across a variety of activity contexts. We provide a comprehensive evaluation of our system, benchmarking it on both our own and existing IMU datasets.
Vimal Mollyn et al. (Carnegie Mellon University). CHI 2023. Topics: Human Pose & Activity Recognition; Biosensors & Physiological Monitoring.

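The "whatever subset of IMU data is available" idea can be sketched as follows; this is a hypothetical illustration rather than the IMUPoser model, and the slot names, channel count, and zero-filling scheme are assumptions: readings from whichever devices are present are placed into fixed device slots, with missing slots zero-filled, so a single model input shape covers every device combination.

```python
# Hypothetical sketch, not the IMUPoser model: build a fixed-size input from
# whichever consumer devices happen to be reporting, zero-filling the rest.
import numpy as np

DEVICE_SLOTS = ["left_wrist", "right_wrist", "left_pocket", "right_pocket", "head"]
CHANNELS = 12   # e.g., 3 acceleration values + 9 flattened orientation values

def assemble_input(available):
    """available maps a device-slot name to its (CHANNELS,) IMU reading;
    missing slots are filled with zeros so the input shape stays constant."""
    frame = np.zeros((len(DEVICE_SLOTS), CHANNELS))
    for i, slot in enumerate(DEVICE_SLOTS):
        if slot in available:
            frame[i] = available[slot]
    return frame.reshape(-1)            # flat vector fed to the pose model

# Example: only a watch and earbuds are worn right now.
reading = lambda: np.random.randn(CHANNELS)
x = assemble_input({"left_wrist": reading(), "head": reading()})
print(x.shape)   # (60,), the same shape as when all five devices are present
```
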
BeParrot: Efficient Interface for Transcribing Unclear Speech via Respeaking
Transcribing speech from audio files to text is an important task, not only for exploring audio content in text form but also for using the transcribed data to train speech models, e.g., automatic speech recognition (ASR) models. A post-correction approach is frequently employed to reduce the time cost of transcription, where users edit errors in the recognition results of ASR models. However, this approach assumes clear speech and is not designed for unclear speech (e.g., speech with high levels of noise or reverberation), which severely degrades ASR accuracy and requires many manual corrections. To construct an alternative approach for transcribing unclear speech, we introduce the idea of respeaking, which has primarily been used to create captions for television programs in real time. In respeaking, a proficient human respeaker repeats the heard speech by shadowing it, and their utterances are recognized by an ASR model. While this approach can be effective for transcribing unclear speech, respeaking is a highly cognitively demanding task, and extensive training is often required to become a respeaker. We address this point with BeParrot, the first interface designed for respeaking that allows novice users to benefit from respeaking without extensive training through two key features: parameter adjustment and pronunciation feedback. Our user study involving 60 crowd workers demonstrated that they could transcribe different types of unclear speech 32.2% faster with BeParrot than with a conventional approach, without losing transcription accuracy. In addition, comments from the workers supported the design of the adjustment and feedback features and expressed a willingness to continue using BeParrot for transcription tasks. Our work demonstrates how recent advances in machine learning can be combined with a human-in-the-loop approach to address an area that remains challenging for computers alone.
Riku Arakawa et al. IUI 2022. Topics: Intelligent Voice Assistants (Alexa, Siri, etc.); Conversational Chatbots.

VocabEncounter: NMT-powered Vocabulary Learning by Presenting Computer-Generated Usages of Foreign Words into Users' Daily Lives
We demonstrate that recent natural language processing (NLP) techniques enable a new paradigm of vocabulary learning that benefits from both micro-learning and usage-based learning by generating and presenting usages of foreign words based on the learner's context. Without allocating dedicated time for studying, the user can become familiar with how the words are used by seeing example usages during daily activities such as Web browsing. To achieve this, we introduce VocabEncounter, a vocabulary-learning system that embeds the given words into materials the user is reading in near real time by leveraging recent NLP techniques. After confirming with crowdworkers that the system generates translated phrases of human-comparable quality, we conducted a series of user studies, which demonstrated its effectiveness for learning vocabulary and its favorable user experience. Our work shows how NLP-based generation techniques can turn everyday activities into opportunities for vocabulary learning.
Riku Arakawa et al. (Carnegie Mellon University). CHI 2022. Topics: Generative AI (Text, Image, Music, Video); Human-LLM Collaboration; Programming Education & Computational Thinking.