Tweeq: Parameter-Tuning GUI Widgets by/for Creative Professionals
Professionals in the creative industry rely on digital content authoring tools that provide graphical user interface (GUI) widgets for tuning primitive values, such as numeric sliders, rotary knobs, and color pickers. Despite their prevalence, GUI studies have sometimes been considered “done” in the context of HCI, and users’ frustrations with these widgets might have gone unheard. We sampled such widgets from popular production software and analyzed their interaction design, identifying three core design principles: support diverse input modalities to match users’ nuanced control strategies, prioritize high-speed and accurate interaction for skilled users, and minimize visual footprint to preserve the creative workspace. We then provide reference implementations of GUI widgets that follow these guidelines, named Tweeq. Developed in parallel with the author’s animation projects, these widgets allow parameter tuning with fewer clicks, provide continuous visual feedback through overlays, and facilitate gestural and keyboard inputs that do not require constant visual attention. To evaluate Tweeq, we implemented example applications and conducted an informal expert user study, which revealed generally positive reactions and indicated that the proposed design principles may offer a useful direction for GUI development in creative software.
2025 · Baku Hashimoto et al. · UIST · Topics: 360° Video & Panoramic Content; Graphic Design & Typography Tools

Cooperative Design Optimization through Natural Language Interaction
Designing successful interactions requires identifying optimal design parameters. To do so, designers often conduct iterative user testing and exploratory trial-and-error. This involves balancing multiple objectives in a high-dimensional space, making the process time-consuming and cognitively demanding. System-led optimization methods, such as those based on Bayesian optimization, can determine for designers which parameters to test next. However, they offer limited opportunities for designers to intervene in the optimization process, negatively impacting the designer’s experience. We propose a design optimization framework that enables natural language interactions between designers and the optimization system, facilitating cooperative design optimization. This is achieved by integrating system-led optimization methods with Large Language Models (LLMs), allowing designers to intervene in the optimization process and better understand the system's reasoning. Experimental results show that our method provides higher user agency than a system-led method and shows promising optimization performance compared to manual design. It also matches the performance of an existing cooperative method with lower cognitive load.
2025 · Ryogo Niwa et al. · UIST · Topics: Human-LLM Collaboration; AI-Assisted Decision-Making & Automation

FontCraft: Multimodal Font Design Using Interactive Bayesian Optimization
Creating new fonts requires a lot of human effort and professional typographic knowledge. Despite the rapid advancements of automatic font generation models, existing methods require users to prepare pre-designed characters with target styles using font-editing software, which poses a problem for non-expert users. To address this limitation, we propose FontCraft, a system that enables font generation without relying on pre-designed characters. Our approach integrates the exploration of a font-style latent space with human-in-the-loop preferential Bayesian optimization and multimodal references, facilitating efficient exploration and enhancing user control. Moreover, FontCraft allows users to revisit previous designs, retracting their earlier choices in the preferential Bayesian optimization process. Once users finish editing the style of a selected character, they can propagate it to the remaining characters and further refine them as needed. The system then generates a complete outline font in OpenType format. We evaluated the effectiveness of FontCraft through a user study comparing it to a baseline interface. Results from both quantitative and qualitative evaluations demonstrate that FontCraft enables non-expert users to design fonts efficiently.
2025 · Yuki Tatsukawa et al. · The University of Tokyo, Igarashi Lab · CHI · Topics: Graphic Design & Typography Tools; Customizable & Personalized Objects

Inter(sectional) Alia(s): Ambiguity in Voice Agent Identity via Intersectional Japanese Self-Referents
Conversational agents that mimic people have raised questions about the ethics of anthropomorphizing machines with human social identity cues. Critics have also questioned assumptions of identity neutrality in humanlike agents. Recent work has revealed that intersectional Japanese pronouns can elicit complex and sometimes evasive impressions of agent identity. Yet, the role of other “neutral” non-pronominal self-referents (NPSR) and voice as a socially expressive medium remains unexplored. In a crowdsourcing study, Japanese participants (N=204) evaluated three ChatGPT voices (Juniper, Breeze, and Ember) using seven self-referents. We found strong evidence of voice gendering alongside the potential of intersectional self-referents to evade gendering, i.e., ambiguity through neutrality and elusiveness. Notably, perceptions of age and formality intersected with gendering as per sociolinguistic theories, especially ぼく (boku) and わたくし (watakushi). This work provides a nuanced take on agent identity perceptions and champions intersectional and culturally-sensitive work on voice agents.
2025 · Takao Fujii et al. · Institute of Science Tokyo, Department of Industrial Engineering and Economics · CHI · Topics: Intelligent Voice Assistants (Alexa, Siri, etc.); Multilingual & Cross-Cultural Voice Interaction; Agent Personality & Anthropomorphism

Super Kawaii Vocalics: Amplifying the “Cute” Factor in Computer Voice
"Kawaii" is the Japanese concept of cute, which carries sociocultural connotations related to social identities and emotional responses. Yet, virtually all work to date has focused on the visual side of kawaii, including in studies of computer agents and social robots. In pursuit of formalizing the new science of kawaii vocalics, we explored what elements of voice relate to kawaii and how they might be manipulated, manually and automatically. We conducted a four-phase study (grand N = 512) with two varieties of computer voices: text-to-speech (TTS) and game character voices. We found kawaii "sweet spots" through manipulation of fundamental and formant frequencies, but only for certain voices and to a certain extent. Findings also suggest a ceiling effect for the kawaii vocalics of certain voices. We offer empirical validation of the preliminary kawaii vocalics model and an elementary method for manipulating kawaii perceptions of computer voice.
2025 · Yuto Mandai et al. · Tokyo Institute of Technology, Department of Industrial Engineering and Economics · CHI · Topics: Intelligent Voice Assistants (Alexa, Siri, etc.); Agent Personality & Anthropomorphism

Griffith: A Storyboarding Tool Designed with Japanese Animation Professionals
The “E-conte,” storyboard in English, is commonly referred to as the “blueprint” in Japanese animation (anime) production, consisting of scene illustrations, timing information, and textual descriptions. This paper introduces “Griffith,” a digital system for creating these storyboards. Due to its highly cultural and domain-specific nature, the tool design entailed an in-depth study of the E-conte process and a longitudinal collaboration with an experienced anime director and producers. The resulting system contributes not only domain knowledge, but also generalizable insights into a creativity support environment for visual storytelling, including the importance of vertical timelines and discrete yet integrated tools. To reflect on the interaction design, we presented Griffith to professionals with diverse roles in anime production. Our findings highlight the benefits of the Griffith user interface and the need for a socio-technical focus in designing creativity support tools.
2024 · Jun Kato et al. · National Institute of Advanced Industrial Science and Technology (AIST); Arch, Inc. · CHI · Topics: Graphic Design & Typography Tools; Creative Collaboration & Feedback Systems; Interactive Narrative & Immersive Storytelling

Lyric App Framework: A Web-based Framework for Developing Interactive Lyric-driven Musical Applications
Lyric videos have become a popular medium to convey lyrical content to listeners, but they present the same content whenever they are played and cannot adapt to listeners' preferences. Lyric apps, as we name them, are a new form of lyric-driven visual art that can render different lyrical content depending on user interaction and address the limitations of static media. To open up this novel design space for programmers and musicians, we present Lyric App Framework, a web-based framework for building interactive graphical applications that play musical pieces and show lyrics synchronized with playback. We designed the framework to provide a streamlined development experience for building production-ready lyric apps with creative coding libraries of choice. We held programming contests twice and collected 52 examples of lyric apps, enabling us to reveal eight representative categories, confirm the framework's effectiveness, and report lessons learned.
2023 · Jun Kato et al. · National Institute of Advanced Industrial Science and Technology (AIST) · CHI · Topics: AI-Assisted Creative Writing; Music Composition & Sound Design Tools

CatAlyst: Domain-Extensible Intervention for Preventing Task Procrastination Using Large Generative Models
CatAlyst uses generative models to support workers’ progress by influencing their task engagement instead of directly contributing to their task outputs. It prompts distracted workers to resume their tasks by generating a continuation of their work and presenting it as an intervention that is more context-aware than conventional (predetermined) feedback. Unlike recent human-AI collaboration research aiming at work substitution, which depends on stably high accuracy, the prompt can function by drawing workers’ interest and lowering the hurdle for resumption even when the generated continuation is insufficient to substitute for their work. This frees CatAlyst from domain-specific model-tuning and makes it applicable to various tasks. Our studies involving writing and slide-editing tasks demonstrated CatAlyst’s effectiveness in helping workers swiftly resume tasks with a lowered cognitive load. The results suggest a new form of human-AI collaboration where large generative models, publicly available but imperfect for each individual domain, can contribute to workers’ digital well-being.
2023 · Riku Arakawa et al. · Carnegie Mellon University · CHI · Topics: Human-LLM Collaboration; Notification & Interruption Management; Workplace Wellbeing & Work Stress

BO as Assistant: Using Bayesian Optimization for Asynchronously Generating Design Suggestions
Many design tasks involve parameter adjustment, and designers often struggle to find desirable parameter value combinations by manipulating sliders back and forth. For such a multi-dimensional search problem, Bayesian optimization (BO) is a promising technique because of its intelligent sampling strategy; in each iteration, BO samples the most effective points considering both exploration (i.e., prioritizing unexplored regions) and exploitation (i.e., prioritizing promising regions), enabling efficient searches. However, existing BO-based design frameworks take the initiative in the design process and thus are not flexible enough for designers to freely explore the design space using their domain knowledge. In this paper, we propose a novel design framework, BO as Assistant, which enables designers to take the initiative in the design process while also benefiting from BO's sampling strategy. The designer can manipulate sliders as usual; the system monitors the slider manipulation to automatically estimate the design goal on the fly and then asynchronously provides unexplored-yet-promising suggestions using BO's sampling strategy. The designer can choose to use the suggestions at any time. This framework uses a novel technique to automatically extract the necessary information to run BO by observing slider manipulation without requesting additional inputs. Our framework is domain-agnostic, demonstrated by applying it to photo color enhancement, 3D shape design for personal fabrication, and procedural material design in computer graphics.
2022 · Yuki Koyama et al. · UIST · Topics: Generative AI (Text, Image, Music, Video); Human-LLM Collaboration; AI-Assisted Decision-Making & Automation

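The exploration-exploitation balance described in this abstract is typically encoded in BO's acquisition function. A minimal sketch using an upper-confidence-bound (UCB) acquisition illustrates the idea; the surrogate means and uncertainties below are hypothetical values for illustration, not taken from the paper:

```python
# Sketch of the exploration-exploitation trade-off in Bayesian optimization
# via an upper-confidence-bound (UCB) acquisition function. The surrogate
# values (mu = predicted quality, sigma = uncertainty) are hypothetical.

def ucb_pick(candidates, kappa):
    """Return the candidate maximizing mu(x) + kappa * sigma(x)."""
    return max(candidates, key=lambda c: c["mu"] + kappa * c["sigma"])

candidates = [
    {"x": 0.2, "mu": 0.9, "sigma": 0.05},  # well-explored, promising region
    {"x": 0.8, "mu": 0.5, "sigma": 0.60},  # barely explored region
]

exploit = ucb_pick(candidates, kappa=0.0)  # kappa = 0: pure exploitation
explore = ucb_pick(candidates, kappa=2.0)  # large kappa: favors uncertainty
```

With kappa = 0 the acquisition reduces to the predicted mean and picks the known-good region (x = 0.2); with kappa = 2 the uncertainty term dominates and the unexplored region (x = 0.8) wins, which is how a single scalar can steer sampling between the two behaviors.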
Photographic Lighting Design with Photographer-in-the-Loop Bayesian Optimization
It is important for photographers to have the best possible lighting configuration at the time of shooting; otherwise, they must post-process the images, which may cause artifacts and quality deterioration. Thus, photographers often struggle to find the best possible lighting configuration by manipulating lighting devices, including light sources and modifiers, in a trial-and-error manner. In this paper, we propose a novel computational framework to support photographers. This framework assumes that every lighting device is programmable; that is, its adjustable parameters (e.g., orientation, intensity, and color temperature) can be set using a program. Using our framework, photographers do not need to learn how the parameter values affect the resulting lighting, and do not even need to determine the strategy of the trial-and-error process; instead, photographers need only concentrate on evaluating which lighting configuration is more desirable among options suggested by the system. The framework is enabled by our novel photographer-in-the-loop Bayesian optimization, which is sample-efficient (i.e., the number of required evaluation steps is small) and which can also be guided by providing a rough painting of the desired lighting configuration if one is available. We demonstrate how the framework works in both simulated virtual environments and a physical environment, suggesting that it could find pleasing lighting configurations quickly, in around 10 iterations. Our user study suggests that the framework enables the photographer to concentrate on the look of captured images rather than the parameters, compared with the traditional manual lighting workflow.
2022 · Yuki Koyama et al. · UIST · Topics: Generative AI (Text, Image, Music, Video); Photography & Image Processing

ODEN: Live Programming for Neural Network Architecture Editing
In deep learning application development, programmers repeatedly try different architectures and hyper-parameters until they are satisfied with the model performance. Although programmers may want to move smoothly back and forth between neural network (NN) architecture editing and experimentation, program crashes due to tensor shape mismatches and other issues prohibit them, especially novice programmers, from doing so. We propose to leverage live programming techniques in NN architecture editing to show an always-on visualization. When the user edits the program, the visualization synchronously displays tensor states and provides warning messages by continuously executing the program, preventing program crashes during experimentation. We implement the live visualization and integrate it into an IDE called ODEN that seamlessly supports the “edit→experiment→edit→···” repetition. With ODEN, the user can construct a neural network with the live visualization and transition into experimentation to instantly train and test the NN architecture. An exploratory user study was conducted to evaluate the usability, limitations, and potential of live visualization in ODEN.
2022 · Chunqi Zhao et al. · IUI · Topics: Prototyping & User Testing; Computational Methods in HCI

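The kind of always-on check described here can be sketched as a layer-shape validator that reports a warning instead of letting a mismatch crash the session. This is a generic illustration of the idea, not ODEN's actual implementation; the layer names and tuple encoding are hypothetical:

```python
# Sketch: validate a chain of fully-connected layers, each encoded as
# (name, in_features, out_features), and return human-readable warnings
# for any shape mismatch between consecutive layers.

def check_shapes(layers):
    """Return a list of warning strings for mismatched consecutive layers."""
    warnings = []
    for (name_a, _, out_a), (name_b, in_b, _) in zip(layers, layers[1:]):
        if out_a != in_b:
            warnings.append(
                f"{name_a} outputs {out_a} features but {name_b} expects {in_b}"
            )
    return warnings

# fc1 emits 128 features, but fc2 expects 256: a live view could surface
# this warning on every edit instead of crashing at training time.
model = [("fc1", 784, 128), ("fc2", 256, 10)]
print(check_shapes(model))
```

Running such a check on every keystroke is cheap because it only inspects declared shapes, which is what makes the continuous, crash-free feedback loop feasible.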
BeParrot: Efficient Interface for Transcribing Unclear Speech via Respeaking
Transcribing speech from audio files to text is an important task, not only for exploring audio content in text form but also for utilizing the transcribed data as a source to train speech models, e.g., automated speech recognition (ASR) models. A post-correction approach has frequently been employed to reduce the time cost of transcription, where users edit errors in the recognition results of ASR models. However, this approach assumes clear speech and is not designed for unclear speech (e.g., speech with high levels of noise or reverberation), which severely degrades the accuracy of ASR and requires many manual corrections. To construct an alternative approach to transcribe unclear speech, we introduce the idea of respeaking, which has primarily been used to create captions for television programs in real time. In respeaking, a proficient human respeaker repeats the heard speech as shadowing, and their utterances are recognized by an ASR model. While this approach can be effective for transcribing unclear speech, one problem is that respeaking is a highly cognitively demanding task, and extensive training is often required to become a respeaker. We address this point with BeParrot, the first interface designed for respeaking that allows novice users to benefit from respeaking without extensive training through two key features, i.e., parameter adjustment and pronunciation feedback. Our user study involving 60 crowd workers demonstrated that they could transcribe different types of unclear speech 32.2% faster with BeParrot than with a conventional approach without losing transcription accuracy. In addition, comments from the workers supported the design of the adjustment and feedback features, exhibiting a willingness to continue using BeParrot for transcription tasks. Our work demonstrates how we can leverage recent advances in machine learning to address, with a human-in-the-loop approach, an area that remains challenging for computers alone.
2022 · Riku Arakawa et al. · IUI · Topics: Intelligent Voice Assistants (Alexa, Siri, etc.); Conversational Chatbots

Interactive Exploration-Exploitation Balancing for Generative Melody Composition
Recent content creation systems allow users to generate various high-quality content (e.g., images, 3D models, and melodies) by just specifying a parameter set (e.g., a latent vector of a deep generative model). The task here is to search for an appropriate parameter set that produces the desired content. To facilitate this task execution, researchers have investigated user-in-the-loop optimization, where the system samples candidate solutions, asks the user to provide preferential feedback on them, and iterates this procedure until finding the desired solution. In this work, we investigate a novel approach to enhance this interactive process: allowing users to control the sampling behavior. More specifically, we allow users to adjust the balance between exploration (i.e., favoring diverse samples) and exploitation (i.e., favoring focused samples) in each iteration. To evaluate how this approach affects the user experience and optimization behavior, we implement it into a melody composition system that combines a deep generative model with Bayesian optimization. Our experiments suggest that this approach could improve the user's engagement and optimization performance.
2021 · Yijun Zhou et al. · IUI · Topics: Generative AI (Text, Image, Music, Video); Music Composition & Sound Design Tools; Creative Collaboration & Feedback Systems

A Computational Approach to Magnetic Force Feedback Design
We present a computational approach to haptic design embedded in everyday tangible interaction with digital fabrication. To generate haptic feedback, the use of permanent magnets as the mechanism potentially contributes to simplicity and robustness; however, it is difficult to manually design how magnets should be embedded in objects. Our approach enables the inverse design of magnetic force feedback; that is, we computationally solve an inverse problem to obtain an optimal arrangement of permanent magnets that renders the user-specified haptic sensation. To solve the inverse problem in a practical manner, we also present techniques for magnetic simulation and optimization. We demonstrate applications to explore the design possibility of augmenting digital fabrication for everyday use.
2021 · Masa Ogata et al. · National Institute of Advanced Industrial Science and Technology (AIST) · CHI · Topics: Shape-Changing Interfaces & Soft Robotic Materials; Circuit Making & Hardware Prototyping

Grand Challenges in Immersive Analytics
Immersive Analytics is a quickly evolving field that unites several areas such as visualisation, immersive environments, and human-computer interaction to support human data analysis with emerging technologies. This research has thrived over the past years with multiple workshops, seminars, and a growing body of publications, spanning several conferences. Given the rapid advancement of interaction technologies and novel application domains, this paper aims toward a broader research agenda to enable widespread adoption. We present 17 key research challenges developed over multiple sessions by a diverse group of 24 international experts, initiated from a virtual scientific workshop at ACM CHI 2020. These challenges aim to coordinate future work by providing a systematic roadmap of current directions and impending hurdles to facilitate productive and effective applications for Immersive Analytics.
2021 · Barrett Ens et al. · Monash University · CHI · Topics: Immersion & Presence Research; Interactive Data Visualization

FocusMusicRecommender: A System for Recommending Music to Listen to While Working
This paper proposes FocusMusicRecommender, an automated system that recommends background music to listen to while working. Recommendation systems matching user preferences have been widely researched, even though research has shown that music listeners strongly like is not suitable as background music because it interferes with their concentration. FocusMusicRecommender therefore plays songs that users may "neither like nor dislike" instead of "like very much." By default, it automatically summarizes each song so that users can give "like very much" feedback by pressing a "keep listening" button or "dislike very much" feedback by pressing a "skip" button. It uses this feedback, along with users' concentration levels estimated from their behavior history, to distinguish between the preference levels "like" and "like very much." It then estimates the preference levels of unplayed songs and selects the most suitable song by considering the user's current concentration level. The effectiveness of the proposed feedback method and suitability of the recommendation results were verified experimentally and in user studies.
2018 · Hiromu Yakura et al. · IUI · Topics: Recommender System UX

Magneto-Haptics: Embedding Magnetic Force Feedback for Physical Interactions
We present magneto-haptics, a design approach for haptic sensations powered by the forces present among permanent magnets during active touch. Magnetic force has not been thoroughly explored in haptic design because it is unintuitive and there is a lack of methods to associate or visualize magnetic force with haptic sensations, especially for complex magnetic patterns. To represent the haptic sensations of magnetic force intuitively, magneto-haptics formulates haptic potential from the distribution of magnetic force along the path of motion. It provides a rapid way to compute the relationship between the magnetic phenomena and the haptic mechanism. Thus, we can convert a magnetic force distribution into a haptic sensation model, making the design of magnet-embedded haptic sensations more efficient. We demonstrate three applications of magneto-haptics through interactive interfaces and devices. We further verify our theory by evaluating some magneto-haptic designs through experiments.
2018 · Masa Ogata · UIST · Topics: Vibrotactile Feedback & Skin Stimulation; Shape-Changing Interfaces & Soft Robotic Materials

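Deriving a potential from a force distribution along a motion path, as this abstract describes, can be sketched generically as the cumulative integral U(x) = -∫ F ds. The decaying force profile below is a hypothetical stand-in for a repulsive magnet pair, and the trapezoidal integration is a generic numeric method, not the paper's actual formulation:

```python
# Generic sketch: turn a 1-D force distribution F(x) along a motion path
# into a potential profile U(x) = -∫ F ds via cumulative trapezoidal
# integration. The force profile is hypothetical (decaying repulsion).

def potential_profile(xs, forces):
    """Cumulative trapezoidal integration of -F over path points xs."""
    u, out = 0.0, [0.0]
    for i in range(1, len(xs)):
        u -= 0.5 * (forces[i - 1] + forces[i]) * (xs[i] - xs[i - 1])
        out.append(u)
    return out

n = 100
xs = [i / (n - 1) * 0.02 for i in range(n)]           # 0..20 mm path (m)
forces = [1.0 / (1.0 + (x * 1000) ** 2) for x in xs]  # decaying repulsion (N)
U = potential_profile(xs, forces)
```

For a purely repulsive force the potential decreases monotonically along the path; a detent-like "click" sensation would instead appear as a local minimum in U, which is the kind of relationship such a formulation makes easy to inspect at design time.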
Reactile: Programming Swarm User Interfaces through Direct Physical Manipulation
We explore a new approach to programming swarm user interfaces (Swarm UI) by leveraging direct physical manipulation. Existing Swarm UI applications are written using a robot programming framework: users work on a computer screen and think in terms of low-level controls. In contrast, our approach allows programmers to work in physical space by directly manipulating objects and think in terms of high-level interface design. Inspired by current UI programming practices, we introduce a four-step workflow—create elements, abstract attributes, specify behaviors, and propagate changes—for Swarm UI programming. We propose a set of direct physical manipulation techniques to support each step in this workflow. To demonstrate these concepts, we developed Reactile, a Swarm UI programming environment that actuates a swarm of small magnets and displays spatial information of program states using a DLP projector. Two user studies—an in-class survey with 148 students and a lab interview with eight participants—confirm that our approach is intuitive and understandable for programming Swarm UIs.
2018 · Ryo Suzuki et al. · University of Colorado Boulder · CHI · Topics: Head-Up Display (HUD) & Advanced Driver Assistance Systems (ADAS); Shape-Changing Interfaces & Soft Robotic Materials

Asian CHI Symposium: Emerging HCI Research Collection
This symposium showcases the latest work from Asia on interactive systems and user interfaces that address under-explored problems and demonstrate unique approaches. In addition to circulating ideas and sharing a vision of future research in human-computer interaction, this symposium aims to foster social networks among academics (researchers and students) and practitioners and create a fresh research community from the Asian region.
2018 · Saki Sakaguchi et al. · The University of Tokyo · CHI · Topics: Developing Countries & HCI for Development (HCI4D); User Research Methods (Interviews, Surveys, Observation)

OptiMo: Optimization-Guided Motion Editing for Keyframe Character Animation
The mission of animators is to create nuanced, high-quality character motions. To achieve this, the careful editing of animation curves---curves that determine how a series of keyframed poses are interpolated over time---is an important task. Manual editing affords full and precise control, but requires tedious and unintuitive trial and error. Numerical optimization can automate such exploration; however, automatic solutions cannot always be perfect, and it is difficult for animators to control optimization owing to its black-box behavior. In this paper, we present a new framework called optimization-guided motion editing, which is aimed at maintaining a sense of full control while utilizing the power of optimization. We have designed interactions and developed a set of mathematical formulations to enable them. We discuss the framework's potential by demonstrating several usage scenarios with our proof-of-concept system, named OptiMo.
2018 · Yuki Koyama et al. · National Institute of Advanced Industrial Science and Technology (AIST) · CHI · Topics: Force Feedback & Pseudo-Haptic Weight; 3D Modeling & Animation
