MiniMates: Miniature Avatars for AR Remote Meetings within Limited Physical Spaces
Remote meetings using 3D avatars in Augmented Reality (AR) allow effective communication and enable users to retain awareness of their surroundings. However, positioning 3D avatars effectively and consistently for all users in AR is challenging, since most spaces, such as offices or living rooms, are not large enough to accommodate multiple life-sized avatars without interference. To address this issue, we contribute MiniMates---a novel approach leveraging miniature avatars, which make it possible to place multiple remote users in a limited physical space. We see MiniMates as complementary to traditional 2D video conferencing and immersive telepresence. Our approach automatically adjusts the formation of avatars and redirects users' head and body orientation to facilitate communication. Results from our user study (n = 24) show that participants experienced a higher sense of co-presence compared to video conferencing, and that MiniMates enabled them to communicate the direction of their interactions non-verbally as well as manage multiple simultaneous conversations.
2025 | Akihiro Kiuchi et al. | The University of Tokyo | Social & Collaborative VR; Mixed Reality Workspaces; Context-Aware Computing | CHI

SpineLoft: Interactive Spine-based 2D-to-3D Modeling
3D artists (professionals and novices alike) often take inspiration from sketches or photos to guide their designs. Yet, existing modeling systems are not tailored to fully exploit such input; consequently, significant effort and expertise are needed when creating model prototypes or exploring design options. In this work, we introduce a system to support the exploratory modeling process by enabling the transformation of 2D image elements into geometric 3D objects. Our solution relies on a novel d2 distance function, supporting a region-based lofting process, and delivers easily editable 3D geometric "spine-rib" representations. The user draws a spine, and the system generates and modifies a generalized cylinder around it, taking image edges into account. The proposed approach, driven by simple user-defined scribbles, can robustly handle various image sources, ranging from photos to hand-drawn content.
2025 | Alexandre Thiault et al. | Institut Polytechnique de Paris, Telecom Paris | 3D Modeling & Animation; Customizable & Personalized Objects | CHI

FontCraft: Multimodal Font Design Using Interactive Bayesian Optimization
Creating new fonts requires substantial human effort and professional typographic knowledge. Despite the rapid advancement of automatic font generation models, existing methods require users to prepare pre-designed characters with target styles using font-editing software, which poses a problem for non-expert users. To address this limitation, we propose FontCraft, a system that enables font generation without relying on pre-designed characters. Our approach integrates the exploration of a font-style latent space with human-in-the-loop preferential Bayesian optimization and multimodal references, facilitating efficient exploration and enhancing user control. Moreover, FontCraft allows users to revisit previous designs, retracting their earlier choices in the preferential Bayesian optimization process. Once users finish editing the style of a selected character, they can propagate it to the remaining characters and further refine them as needed. The system then generates a complete outline font in OpenType format. We evaluated the effectiveness of FontCraft through a user study comparing it to a baseline interface. Results from both quantitative and qualitative evaluations demonstrate that FontCraft enables non-expert users to design fonts efficiently.
2025 | Yuki Tatsukawa et al. | The University of Tokyo, Igarashi Lab | Graphic Design & Typography Tools; Customizable & Personalized Objects | CHI

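The human-in-the-loop preferential loop that FontCraft builds on can be illustrated with a deliberately simplified sketch. This is our own toy reduction, not FontCraft's actual model: `preferential_search`, the 1-D latent value, and the fixed contraction schedule are illustrative assumptions; the real system runs preferential Bayesian optimization over a font-style latent space and additionally supports retracting earlier choices.

```python
# Hedged sketch of human-in-the-loop preferential optimization (our toy,
# not FontCraft's model): the system shows the user two candidate style
# values, the user picks one, and the search contracts toward the winner.

def preferential_search(prefer, start, step=1.0, rounds=10, decay=0.7):
    """prefer(a, b) -> a or b, standing in for the user's click.
    1-D latent for clarity; FontCraft operates in a font-style latent space."""
    center = start
    for _ in range(rounds):
        a, b = center - step, center + step
        center = prefer(a, b)   # user feedback replaces an acquisition function
        step *= decay           # exploit more as the search narrows
    return center
```

Here the `prefer` callback stands in for the user clicking the candidate they like better; each comparison narrows the search around the preferred sample.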
XR-penter: Material-Aware and In Situ Design of Scrap Wood Assemblies
Woodworkers have to navigate multiple considerations when planning a project, including available resources, skill level, and intended effort. Do-it-yourself (DIY) woodworkers face these challenges most acutely because of tight material constraints and a desire for custom designs tailored to specific spaces. To address these needs, we present XR-penter, an extended reality (XR) application that supports in situ, material-aware woodworking for casual makers. Our system enables users to design virtual scrap wood assemblies directly in their workspace, encouraging sustainable practices through the use of discarded materials. Users register physical material as virtual twins, manipulate these twins into an assembly in XR (while receiving feedback on material usage and alignment with their surroundings), and preview the cuts needed for fabrication. We conducted a case study and feedback sessions demonstrating that XR-penter supports improvisational workflows in practice, and found that woodworkers who prioritize material-driven and adaptive workflows would benefit most from our system.
2025 | Ramya Iyer et al. | Georgia Institute of Technology | Mixed Reality Workspaces; Shape-Changing Materials & 4D Printing | CHI

Draw2Cut: Direct On-Material Annotations for CNC Milling
Creating custom artifacts with computer numerical control (CNC) milling machines typically requires mastery of complex computer-aided design (CAD) software. To eliminate this barrier, we introduce Draw2Cut, a novel system that allows users to design and fabricate artifacts by sketching directly on physical materials. Draw2Cut employs a custom drawing language to convert user-drawn lines, symbols, and colors into toolpaths, enabling users to express their creative intent intuitively. Its key features include real-time alignment between the material and virtual toolpaths, a preview interface for validation, and an open-source platform for customization. Through technical evaluations and user studies, we demonstrate that Draw2Cut lowers the entry barrier for personal fabrication, enabling novices to create customized artifacts with precision and ease. Our findings highlight the potential of the system to enhance creativity, engagement, and accessibility in CNC-based woodworking.
2025 | Xinyue Gui et al. | The University of Tokyo | Desktop 3D Printing & Personal Fabrication; Customizable & Personalized Objects | CHI

CompAct: Designing Interconnected Compliant Mechanisms with Targeted Actuation Transmissions
Compliant mechanisms enable the creation of compact and easy-to-fabricate devices for tangible interaction. This work explores interconnected compliant mechanisms consisting of multiple joints and rigid bodies that transmit and process displacements as signals resulting from physical interactions. Because these devices are difficult to design due to their vast and complex design space, we developed a graph-based design algorithm and computational tool to help users program and customize such computational functions and procedurally model physical designs. When combined with active materials with actuation and sensing capabilities, these devices can also render and detect haptic interaction. Our design examples demonstrate the tool's capability to address relevant HCI concepts, including building modular physical interface toolkits, encrypting tangible interactions, and customizing user augmentation for accessibility. We believe the tool will facilitate the creation of new interfaces with enriched affordances.
2025 | Humphrey Yang et al. | Carnegie Mellon University, Human-Computer Interaction Institute | Shape-Changing Interfaces & Soft Robotic Materials; Customizable & Personalized Objects | CHI

Proactive Conversational Agents with Inner Thoughts
One of the long-standing aspirations in conversational AI is to enable agents to autonomously take the initiative in conversations, i.e., to be proactive. This is especially challenging for multi-party conversations. Prior NLP research focused mainly on predicting the next speaker from contexts such as preceding conversations. In this paper, we demonstrate the limitations of such methods and rethink what it means for AI to be proactive in multi-party, human-AI conversations. We propose that, just like humans, rather than merely reacting to turn-taking cues, a proactive AI formulates its own inner thoughts during a conversation and seeks the right moment to contribute. Through a formative study with 24 participants and inspiration from linguistics and cognitive psychology, we introduce the Inner Thoughts framework. Our framework equips AI with a continuous, covert train of thought in parallel to the overt communication process, enabling it to proactively engage by modeling its intrinsic motivation to express these thoughts. We instantiated this framework in two real-time systems: an AI playground web app and a chatbot. In a technical evaluation and user studies with human participants, our framework significantly surpasses existing baselines on aspects such as anthropomorphism, coherence, intelligence, and turn-taking appropriateness.
2025 | Xingyu "Bruce" Liu et al. | UCLA, HCI Research | Conversational Chatbots; Agent Personality & Anthropomorphism; Human-LLM Collaboration | CHI

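The core mechanism of Inner Thoughts, speaking only when a covert thought is motivated enough rather than at every turn-taking cue, can be reduced to a tiny decision rule. This is our own illustration, not the paper's system: `next_action`, the `(text, motivation)` pairs, and the fixed threshold are stand-ins for the framework's learned intrinsic-motivation model.

```python
# Hedged toy of the Inner Thoughts idea (our reduction, not the paper's
# system): the agent keeps covert thoughts scored by intrinsic motivation
# and contributes only when the best one clears a threshold.

def next_action(thoughts, threshold=0.6):
    """thoughts: list of (text, motivation in [0, 1]). Returns the utterance
    to contribute, or None to keep listening."""
    if not thoughts:
        return None
    text, score = max(thoughts, key=lambda t: t[1])
    return text if score >= threshold else None
```

In the real framework the motivation scores are produced continuously in parallel with the conversation; here they are simply given.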
Shrinkable Arm-based eHMI on Autonomous Delivery Vehicle for Effective Communication with Other Road Users
When employing autonomous driving technology in logistics, small autonomous delivery vehicles (aka delivery robots) encounter challenges different from those of passenger vehicles when interacting with other road users. We conducted an online video survey as a pre-study and found that autonomous delivery vehicles need external human-machine interfaces (eHMIs) to ask for help due to their small size and functional limitations. Inspired by everyday human communication, we chose arms as the eHMI to convey requests through limb motion and gesture. We held an in-house workshop to identify the requirements for designing a specific arm with shrink-ability (conspicuous when delivering messages, but not affecting traffic at other times). We prototyped a small delivery robot with a shrinkable arm and filmed the experiment videos. We conducted two studies (one video-based and one 360-degree-photo VR-based) with 18 participants. We demonstrated that arms on delivery robots can increase interaction efficiency by drawing more attention and communicating specific information.
2024 | Xinyue Gui et al. | External HMI (eHMI) — Communication with Pedestrians & Cyclists | AutoUI

MR Microsurgical Suture Training System with Level-Appropriate Support
The integration of advanced technologies in healthcare necessitates the development of systems that accommodate the daily routines of medical practice. Neurosurgeons, in particular, require extensive long-term practice in microsurgical suturing, even within the busy routine of a medical practice. This study collaboratively developed a Mixed Reality system with neurosurgeons to support self-training in microscopic suturing. Based on the neurosurgeons' opinions, we implemented a level-appropriate microsurgical suture training system. For novices, the system offers shadow-matching training to support the practice of precise movements in the high-sensitivity environment of the microscope. For intermediates, it provides a real-time feedback system that allows users to practice attention to detail. The evaluation involved testing the novice system on students with no medical background and the intermediate system on neurosurgery residents. The effectiveness of the system was demonstrated through the experimental results and subsequent discussion.
2024 | Yuka Tashiro et al. | Tokyo Institute of Technology | Mixed Reality Workspaces; VR Medical Training & Rehabilitation; Robots in Education & Healthcare | CHI

iPose: Interactive Human Pose Reconstruction from Video
Reconstructing 3D human poses from video has wide applications, such as character animation and sports analysis. Automatic 3D pose reconstruction methods have demonstrated promising results, but failure cases still occur due to the diversity of human actions, capturing conditions, and depth ambiguities. Manual intervention therefore remains indispensable, yet it can be time-consuming and require professional skills. We present iPose, an interactive tool that facilitates intuitive human pose reconstruction from a given video. Our tool incorporates both human perception, in specifying pose appearance to achieve controllability, and video frame processing algorithms, to achieve precision and automation. A user manipulates the projection of a 3D pose via 2D operations on top of video frames, and the 3D poses are updated correspondingly while satisfying both kinematic and video frame constraints. Pose updates are propagated temporally to reduce user workload. We evaluate the effectiveness of iPose with a user study on the 3DPW dataset and expert interviews.
2024 | Jingyuan Liu et al. | The University of Tokyo | Human Pose & Activity Recognition; 3D Modeling & Animation | CHI

SyncLabeling: A Synchronized Audio Segmentation Interface for Mobile Devices
Manual audio segmentation is a time-consuming process, especially when more than one sound playing simultaneously needs to be segmented and annotated (e.g., target and background sounds). In conventional audio annotation interfaces, users need to repeatedly pause and replay the audio to complete an overlap segmentation task, which is very inefficient. In this paper, we propose SyncLabeling, a synchronized audio segmentation interface for smartphones that allows users to segment and annotate two overlapping sounds in a single audio stream at a time using a game-like labeling interface. We conducted a user study comparing the proposed SyncLabeling interface with a conventional audio annotation interface on four types of audio segmentation tasks. The results showed that the proposed interface is much more efficient than the conventional interface (2.4× faster) with comparable annotation accuracy in most tasks. In addition, more than half of the participants enjoyed using the SyncLabeling interface and expressed willingness to use it.
2023 | Yi Tang et al. | Gamification Design | MobileHCI

ODEN: Live Programming for Neural Network Architecture Editing
In deep learning application development, programmers iterate over different architectures and hyper-parameters until they are satisfied with the model's performance. Although programmers may want to move smoothly back and forth between neural network (NN) architecture editing and experimentation, program crashes due to tensor shape mismatches and other issues prevent them, especially novice programmers, from doing so. We propose leveraging live programming techniques in NN architecture editing to show an always-on visualization. When the user edits the program, the visualization synchronously displays tensor states and provides warning messages by continuously executing the program, preventing crashes during experimentation. We implement this live visualization and integrate it into an IDE called ODEN that seamlessly supports the "edit→experiment→edit→···" cycle. With ODEN, the user can construct a neural network with the live visualization and transition into experimentation to instantly train and test the NN architecture. An exploratory user study was conducted to evaluate the usability, limitations, and potential of live visualization in ODEN.
2022 | Chunqi Zhao et al. | Prototyping & User Testing; Computational Methods in HCI | IUI

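The kind of check an always-on visualization relies on, detecting tensor shape mismatches without crashing, can be sketched in a few lines. This is our own minimal sketch, not ODEN's implementation: `check_shapes` and the `("linear", in, out)` layer tuples are illustrative assumptions.

```python
# Minimal sketch (not ODEN's actual implementation): propagate tensor
# shapes through a sequential NN description and collect warnings
# instead of raising, in the spirit of an always-on live visualization.

def check_shapes(input_shape, layers):
    """layers: list of ("linear", in_features, out_features) tuples.
    Returns (per-layer shapes, warnings) rather than crashing on mismatch."""
    shapes, warnings = [input_shape], []
    current = input_shape
    for i, (kind, n_in, n_out) in enumerate(layers):
        if kind == "linear":
            if current[-1] != n_in:
                warnings.append(
                    f"layer {i}: expected last dim {n_in}, got {current[-1]}")
            current = current[:-1] + (n_out,)  # continue as if the user will fix it
        shapes.append(current)
    return shapes, warnings
```

A live editor would re-run such a check on every keystroke and surface `warnings` inline next to the offending layer instead of aborting the session.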
Per Garment Capture and Synthesis for Real-time Virtual Try-on
Virtual try-on is a promising application of computer graphics and human-computer interaction that can have a profound real-world impact, especially during this pandemic. Existing image-based works try to synthesize a try-on image from a single image of a target garment, but this inherently limits the ability to react to possible interactions: it is difficult to reproduce the change of wrinkles caused by pose and body size changes, as well as the pulling and stretching of the garment by hand. In this paper, we propose an alternative per-garment capture and synthesis workflow that handles such rich interactions by training the model with many systematically captured images. Our workflow is composed of two parts: garment capture and clothed-person image synthesis. We designed an actuated mannequin and an efficient capturing process that collects the detailed deformations of the target garments under diverse body sizes and poses. Furthermore, we propose the use of a custom-designed measurement garment, and we captured paired images of the measurement garment and the target garments. We then learn a mapping between the measurement garment and the target garments using deep image-to-image translation, so that customers can try on the target garments interactively during online shopping. The proposed workflow requires some manual labor, but we believe the cost is acceptable given that retailers already pay significant costs for professional photographers, models, stylists, and editors to take promotional photographs; our method can remove the need to hire these costly professionals. We evaluated the effectiveness of the proposed system with ablation studies and quality comparisons with previous virtual try-on methods, and we performed a user study to show our promising virtual try-on performance. Moreover, we demonstrate that our method can be used to change virtual costumes in video conferences. Finally, we provide the collected cloth dataset, parameterized by viewing angle, body pose, and size.
2021 | Toby Chong et al. | Full-Body Interaction & Embodied Input; AR Navigation & Context Awareness; Mixed Reality Workspaces | UIST

Interactive Hyperparameter Optimization with Paintable Timelines
We propose a method to integrate more interactivity into automatic hyperparameter optimization systems, leveraging the user's prior knowledge of the parameter distribution. In our method, the user continuously observes the automatic optimization's progress and dynamically specifies where to search in the parameter space. We present a prototype implementation of an interactive dashboard for an optimizer to show our method's feasibility. The dashboard's main feature is the "paintable timeline," where the user can not only observe the past parameter values tested, as in a standard timeline, but also specify the range of future parameters to be tested with simple painting operations. We show three examples where user intervention might improve the performance of automatic optimization. We ran a user study with experts; the results show that, with prior knowledge about the parameter distribution of the target problem, interactive optimization can reach better results than fully automatic optimization.
2021 | Keita Higuchi et al. | Human-LLM Collaboration; AutoML Interfaces | DIS

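The painting interaction, restricting the range from which future trials are sampled while the optimizer keeps running, can be sketched with a plain random-search optimizer. This is our own simplification, not the paper's prototype: `optimize`, the `painted` dict, and uniform sampling are illustrative assumptions standing in for a real optimizer's sampler.

```python
import random

# Hedged sketch of the "paintable timeline" idea (names are ours, not the
# paper's): trials sample a hyperparameter uniformly, but the user can
# paint a sub-range at any trial index; later trials sample only there.

def optimize(objective, bounds, n_trials, painted=None, seed=0):
    """painted: dict {trial_index: (lo, hi)} -- from that trial onward,
    restrict sampling to (lo, hi) until another painted range takes over."""
    rng = random.Random(seed)
    lo, hi = bounds
    history, best = [], (None, float("inf"))
    for t in range(n_trials):
        if painted and t in painted:
            lo, hi = painted[t]       # user's painted range overrides bounds
        x = rng.uniform(lo, hi)
        y = objective(x)              # lower is better
        history.append((x, y))
        if y < best[1]:
            best = (x, y)
    return best, history
```

If the user notices early trials clustering away from the promising region, painting a tighter range mid-run steers all remaining trials there without restarting the optimization.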
Data-centric disambiguation for data transformation with programming-by-example
Programming-by-example (PBE) can be a powerful tool to reduce manual work in repetitive data transformation tasks. However, a small number of examples often leaves ambiguity and may cause the system to perform undesirable transformations. This ambiguity can be resolved by allowing the user to directly edit the synthesized programs, but this is difficult for non-programmers. Here, we present a novel approach: data-centric disambiguation for data transformation, where users resolve the ambiguity by examining and modifying the output rather than the program. The key idea is to focus on the given set of data the user wants to transform instead of pursuing the synthesized program's generality or completeness. Our system provides visualization and interaction methods that allow users to efficiently examine and fix the transformed outputs, which is much simpler than understanding and modifying the program itself. A user study suggests that our system can successfully help non-programmers process data more easily and efficiently.
2021 | Minori Narita et al. | Interactive Data Visualization | IUI

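Why a few examples leave ambiguity, and how examining outputs (rather than programs) resolves it, can be shown with a toy synthesizer. This is our own simplification, not the paper's system: `CANDIDATES`, `disambiguate`, and the two hard-coded transforms are illustrative assumptions.

```python
# Toy illustration of data-centric disambiguation (our simplification, not
# the paper's synthesizer): enumerate candidate transforms consistent with
# the user's one example, then flag rows where the candidates *disagree*
# so the user fixes outputs instead of reading programs.

CANDIDATES = {
    "before first dash": lambda s: s.split("-")[0],
    "first 4 characters": lambda s: s[:4],
}

def disambiguate(example_in, example_out, data):
    # Keep only transforms that reproduce the user's example.
    live = {name: f for name, f in CANDIDATES.items()
            if f(example_in) == example_out}
    results = []
    for row in data:
        outs = {f(row) for f in live.values()}
        # Unambiguous rows get a value; disagreeing rows get None for review.
        results.append((row, outs.pop() if len(outs) == 1 else None))
    return live, results
```

From the example `"2021-03-05" → "2021"` both transforms survive; they agree on well-formed dates but diverge on `"7-12-1999"`, so only that row is surfaced for the user to fix.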
Interactive Exploration-Exploitation Balancing for Generative Melody Composition
Recent content creation systems allow users to generate various high-quality content (e.g., images, 3D models, and melodies) by just specifying a parameter set (e.g., a latent vector of a deep generative model). The task here is to search for an appropriate parameter set that produces the desired content. To facilitate this task, researchers have investigated user-in-the-loop optimization, where the system samples candidate solutions, asks the user to provide preferential feedback on them, and iterates this procedure until the desired solution is found. In this work, we investigate a novel approach to enhance this interactive process: allowing users to control the sampling behavior. More specifically, we allow users to adjust the balance between exploration (i.e., favoring diverse samples) and exploitation (i.e., favoring focused samples) in each iteration. To evaluate how this approach affects the user experience and optimization behavior, we implemented it in a melody composition system that combines a deep generative model with Bayesian optimization. Our experiments suggest that this approach can improve the user's engagement and optimization performance.
2021 | Yijun Zhou et al. | Generative AI (Text, Image, Music, Video); Music Composition & Sound Design Tools; Creative Collaboration & Feedback Systems | IUI

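The user-adjustable exploration/exploitation knob can be illustrated on a discrete candidate set with an upper-confidence-bound (UCB) rule, where a single coefficient plays the role of the slider. This is our own illustration, not the paper's system: `ucb_pick` and the `stats` layout are assumptions, and the paper applies the idea to Bayesian optimization over a generative model's latent space rather than a bandit.

```python
import math

# Sketch of user-controlled exploration vs. exploitation (our illustration):
# a UCB score where the user-set kappa plays the role of the balance slider.

def ucb_pick(stats, kappa):
    """stats: {candidate: (n_tries, mean_score)}. Higher kappa = explore more."""
    total = sum(n for n, _ in stats.values()) or 1
    def score(c):
        n, mean = stats[c]
        if n == 0:
            return float("inf")  # always try untested candidates first
        return mean + kappa * math.sqrt(math.log(total) / n)
    return max(stats, key=score)
```

With `kappa = 0` the system keeps refining the best-rated candidate (exploitation); raising `kappa` makes rarely tried candidates win the score, surfacing more diverse samples (exploration).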
Exploring a Makeup Support System for Transgender Passing based on Automatic Gender Recognition
How to handle gender with machine learning is a controversial topic. A growing body of critical research has brought attention to the numerous issues transgender communities face with the adoption of current automatic gender recognition (AGR) systems. In contrast, we explore how such technologies could potentially be appropriated to support transgender practices and needs, especially in non-Western contexts like Japan. We designed a virtual makeup probe to assist transgender individuals with passing, that is, being perceived as the gender they identify as. To understand whether such an application might support transgender individuals in expressing their gender identity, we interviewed 15 of them in Tokyo and found that, in the right context and under strict conditions, AGR-based systems could assist transgender passing.
2021 | Toby Chong et al. | The University of Tokyo | Gender & Race Issues in HCI; Empowerment of Marginalized Groups | CHI

Spatial Labeling: Leveraging Spatial Layout for Improving Label Quality in Non-Expert Image Annotation
Non-expert annotators (who lack sufficient domain knowledge) are often recruited for manual image labeling tasks owing to the shortage of expert annotators. In such cases, label quality may be relatively low. We propose leveraging spatial layout to improve label quality in non-expert image annotation. In the proposed system, an annotator first spatially lays out the incoming images and labels them on an open space, placing related items together. This serves as a working space (spatial organization) for tentative labeling. During the process, the annotator observes and organizes the similarities and differences between the items. Finally, the annotator assigns definitive labels to the images based on the results of the spatial layout. We ran a user study comparing the proposed method with a traditional non-spatial layout in an image labeling task. The results demonstrated that annotators can complete labeling tasks more accurately with the spatial layout interface than with the non-spatial one.
2021 | Zekun Chang et al. | The University of Tokyo | Interactive Data Visualization; Crowdsourcing Task Design & Quality Control | CHI

Tsugite: Interactive Design and Fabrication of Wood Joints
We present Tsugite—an interactive system for designing and fabricating wood joints for frame structures. Designing and manually crafting such joints is difficult and time-consuming. Our system facilitates the creation of custom joints through a modeling interface combined with computer numerical control (CNC) fabrication. The design space is a 3D grid of voxels that enables efficient geometric analysis and combinatorial search. The interface has two modes: manual editing and gallery. In the manual editing mode, the user edits a joint while receiving real-time graphical feedback and suggestions based on performance metrics including slidability, fabricability, and durability with regard to fiber direction. In the gallery mode, the user views and selects feasible joints that have been pre-calculated. When a joint design is finalized, it can be manufactured with a 3-axis CNC milling machine using a specialized path planning algorithm that ensures joint assemblability through corner rounding. The system was evaluated via a user study and by designing and fabricating joint samples and functional furniture.
2020 | Maria Larsson et al. | Desktop 3D Printing & Personal Fabrication; Laser Cutting & Digital Fabrication | UIST

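The voxel grid makes metrics like slidability cheap to evaluate. The following is our own simplified version of such a check, not Tsugite's actual analysis: `slidable`, the voxel sets, and the single-direction test are illustrative assumptions (the real system considers multiple sliding axes, fabricability, and fiber direction).

```python
# Toy slidability check in the spirit of Tsugite's voxel analysis (our own
# simplified version): part A can slide along direction d if repeatedly
# translating its voxels by d never overlaps part B before leaving the grid.

def slidable(part_a, part_b, d, grid=3):
    """part_a, part_b: sets of (x, y, z) voxels; d: unit step like (0, 0, 1)."""
    cur = set(part_a)
    for _ in range(grid + 1):  # any voxel exits a grid^3 volume within grid steps
        cur = {(x + d[0], y + d[1], z + d[2]) for x, y, z in cur}
        cur = {v for v in cur if all(0 <= c < grid for c in v)}  # drop exited voxels
        if cur & part_b:
            return False       # collision: this direction is blocked
        if not cur:
            return True        # the whole part left the grid without colliding
    return True
```

Running such a test for each axis direction tells the interface which sliding directions a candidate joint geometry permits, which is the kind of feedback shown during manual editing.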
A Survey on Interactive Reinforcement Learning: Design Principles and Open Challenges
Interactive reinforcement learning (RL) has been successfully used in various applications across different fields, which has also motivated HCI researchers to contribute to this area. In this paper, we survey interactive RL to equip human-computer interaction (HCI) researchers with the technical background in RL needed to design new interaction techniques and propose new applications. We elucidate the roles played by HCI researchers in interactive RL, identifying ideas and promising research directions. Furthermore, we propose generic design principles to guide researchers in effectively implementing interactive RL applications.
2020 | Christian Arzate Cruz et al. | Human-LLM Collaboration; AI-Assisted Decision-Making & Automation | DIS

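One design pattern commonly covered in interactive-RL surveys is injecting human feedback as an extra reward signal. The sketch below is our own minimal illustration of that pattern, not a method from the survey: `train`, the 1-D corridor environment, and the `human_feedback` callback are all illustrative assumptions.

```python
import random

# Minimal interactive-RL sketch (our illustration of one design pattern:
# human feedback added to the environment reward in tabular Q-learning).
# The environment is a 5-state corridor with the goal at state 4;
# `human_feedback(s, a)` stands in for a live user approving/disapproving.

def train(n_episodes=200, alpha=0.5, gamma=0.9, human_feedback=None, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(5) for a in (-1, 1)}
    for _ in range(n_episodes):
        s = 0
        for _ in range(20):
            if rng.random() < 0.2:                      # epsilon-greedy exploration
                a = rng.choice((-1, 1))
            else:
                a = max((-1, 1), key=lambda act: q[(s, act)])
            s2 = min(max(s + a, 0), 4)
            r = 1.0 if s2 == 4 else 0.0                 # environment reward at goal
            if human_feedback:
                r += human_feedback(s, a)               # interactive shaping term
            q[(s, a)] += alpha * (r + gamma * max(q[(s2, -1)], q[(s2, 1)]) - q[(s, a)])
            s = s2
            if s == 4:
                break
    return q

def greedy(q, s):
    return max((-1, 1), key=lambda a: q[(s, a)])
```

With dense human feedback favoring rightward moves, the agent learns the goal-directed policy far faster than from the sparse environment reward alone, which is exactly the speed-up interactive RL aims for.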