WhatIF: Branched Narrative Fiction Visualization for Authoring Emergent Narratives using Large Language Models
A Branched Narrative Fiction (BNF) is a non-linear, text-based narrative game in which the player is an active participant shaping the story. Unlike linear narratives, a BNF allows players to influence the direction, outcomes, and progression of the plot. A narrative fiction developer designs these branching storylines, creating a dynamic interaction between player and narrative that takes significant time and skill to craft. In this work, we build and investigate a visual analytics tool that helps narrative fiction developers generate and plan these parallel worlds within a BNF. We present WhatIF, a visual analytics tool that helps BNF developers create BNF graphs, edit those graphs, obtain recommendations, visualize differences between storylines, and verify their BNF against custom metrics. Through a formative study (3 participants) and a user study (11 participants), we observe that WhatIF helps users plan and prototype their BNF, supports iterative refinement of the narrative, and helps overcome writer's block. Furthermore, we explore how contemporary generative AI (GenAI) tools can empower game developers to build richer and more immersive narratives.
2025 · Aditi Mishra et al. · Generative AI (Text, Image, Music, Video) · AI-Assisted Creative Writing · C&C
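The abstract does not describe WhatIF's internal data model, so the following is only a hedged illustration of the kind of BNF graph the tool operates on: story beats as nodes, player choices as labeled directed edges, and storylines enumerated as root-to-ending paths that could then be diffed or checked against custom metrics. The `Beat` class and `storylines` helper are hypothetical names of our own.

```python
from dataclasses import dataclass, field

@dataclass
class Beat:
    """One node of a BNF graph: a story beat plus outgoing player choices."""
    text: str
    choices: dict[str, str] = field(default_factory=dict)  # choice label -> next beat id

def storylines(graph: dict[str, Beat], start: str) -> list[list[str]]:
    """Enumerate all root-to-ending paths (assumes an acyclic graph)."""
    beat = graph[start]
    if not beat.choices:                      # no outgoing choices: this beat is an ending
        return [[start]]
    return [[start] + path
            for nxt in beat.choices.values()
            for path in storylines(graph, nxt)]

story = {
    "gate": Beat("A gate blocks the road.", {"climb": "garden", "knock": "hall"}),
    "garden": Beat("You land in a moonlit garden."),
    "hall": Beat("A steward leads you inside."),
}
print(storylines(story, "gate"))  # [['gate', 'garden'], ['gate', 'hall']]
```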
Paratrouper: Exploratory Creation of Character Cast Visuals Using Generative AI
Great characters are critical to the success of many forms of media, such as comics, games, and films. Designing visually compelling casts of characters requires significant skill and consideration, and there is a lack of specialized tools to support this endeavor. We investigate how AI-driven image-generation techniques can empower creatives to explore a variety of visual design possibilities for individual characters and groups of characters. Informed by interviews with character designers, Paratrouper is a multi-modal system that enables creating and experimenting with multiple permutations of character casts and visualizing them in various contexts as part of a holistic approach to design. We demonstrate how Paratrouper supports different aspects of the character design process, and share insights from its use by eight creators. Our work highlights the interplay between creative agency and serendipity, as well as the visual interrelationships among character aesthetics.
2025 · Joanne Leong et al. · MIT, MIT Media Lab · Generative AI (Text, Image, Music, Video) · 3D Modeling & Animation · CHI
To Use or Not to Use: Impatience and Overreliance When Using Generative AI Productivity Support Tools
Generative AI has the potential to assist people with completing various tasks, but increased productivity is not guaranteed due to challenges such as uncertainty in output quality and unclear processing time. Through an online crowdsourced experiment (N=508), leveraging a “paint by numbers” task to simulate properties of GenAI assistance, we explore how, and how well, users decide whether or not to use automation to maximize their productivity given varying waiting times and output quality. We observed gaps between users’ actual choices and their optimal choices, and characterized these gaps as the “gulf of impatience” and the “gulf of overreliance”. We also distilled the strategies that participants adopted when making their decisions. We discuss design considerations for supporting users in making more informed decisions when interacting with GenAI tools, and for making these tools more useful for improving users’ task performance, productivity, and satisfaction.
2025 · Han Qiao et al. · Autodesk Research · Generative AI (Text, Image, Music, Video) · AI-Assisted Decision-Making & Automation · CHI
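As a rough illustration of the use-or-not decision the experiment simulates (a toy model of our own, not the paper's formal analysis), automation pays off only when waiting plus expected repair time beats working manually. All quantities below are hypothetical.

```python
def expected_time_with_ai(wait: float, p_good: float, fix_time: float) -> float:
    """Expected task time if the user waits for the AI output.

    wait     -- time spent waiting for the AI to respond
    p_good   -- probability the output is usable as-is
    fix_time -- extra time to repair or redo a poor output
    """
    return wait + (1.0 - p_good) * fix_time

def should_use_ai(wait: float, p_good: float, fix_time: float, manual_time: float) -> bool:
    """Rational baseline: use the AI only if it is faster in expectation."""
    return expected_time_with_ai(wait, p_good, fix_time) < manual_time

# A user in the "gulf of impatience" declines the AI even though waiting pays off:
print(should_use_ai(wait=30, p_good=0.9, fix_time=60, manual_time=90))   # True: 36 < 90
# A user in the "gulf of overreliance" waits even though working manually is faster:
print(should_use_ai(wait=80, p_good=0.5, fix_time=60, manual_time=100))  # False: 110 > 100
```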
AQuA: Automated Question-Answering in Software Tutorial Videos with Visual Anchors
Tutorial videos are a popular help source for learning feature-rich software. However, getting quick answers to questions about tutorial videos is difficult. We present an automated approach for responding to tutorial questions. By analyzing 633 questions found in 5,944 video comments, we identified different question types and observed that users frequently describe parts of the video in their questions. We then asked participants (N=24) to watch tutorial videos and ask questions while annotating the video with relevant visual anchors. Most visual anchors referred to UI elements and the application workspace. Based on these insights, we built AQuA, a pipeline that generates useful answers to questions with visual anchors. We demonstrate this for Fusion 360, showing that we can recognize UI elements in visual anchors and generate answers using GPT-4 augmented with that visual information and software documentation. An evaluation study (N=16) demonstrates that our approach provides better answers than baseline methods.
2024 · Saelyne Yang et al. · Autodesk Research, School of Computing, KAIST · Human-LLM Collaboration · Online Learning & MOOC Platforms · CHI
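A minimal sketch of what such a pipeline could look like, under our own assumptions rather than the authors' code: `recognize_ui_element` and `call_llm` are hypothetical stand-ins for the vision step and the GPT-4 call, and the naive documentation lookup is ours.

```python
def recognize_ui_element(anchor_crop_path: str) -> str:
    """Stand-in for a detector that names the UI element shown in the anchor crop."""
    return "Extrude command (Solid tab, Create panel)"  # illustrative output

def retrieve_docs(ui_element: str, docs: dict[str, str]) -> str:
    """Naive documentation lookup keyed on the recognized element's name."""
    return next((text for name, text in docs.items() if name in ui_element), "")

def call_llm(prompt: str) -> str:
    """Stand-in for an LLM API call (e.g. GPT-4)."""
    return f"[answer generated from a prompt of {len(prompt)} chars]"

def answer_question(question: str, anchor_crop_path: str, docs: dict[str, str]) -> str:
    ui_element = recognize_ui_element(anchor_crop_path)
    context = retrieve_docs(ui_element, docs)
    prompt = (
        f"Question about a tutorial video: {question}\n"
        f"The viewer's anchor shows: {ui_element}\n"
        f"Documentation: {context}\n"
        "Answer concretely, referring to the UI element."
    )
    return call_llm(prompt)

docs = {"Extrude": "Extrude raises a closed sketch profile into a solid body."}
print(answer_question("Why is this option greyed out?", "anchor.png", docs))
```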
SwitchSpace: Understanding Context-Aware Peeking Between VR and Desktop Interfaces
Cross-reality tasks, like creating or consuming virtual reality (VR) content, often involve inconvenient or distracting switches between desktop and VR. An initial formative study explores cross-reality switching habits, finding that most switches are momentary "peeks" between interfaces, with specific habits determined by the current context. The results inform a design space of context-aware "peeking" techniques that allow users to view or interact with the desktop from VR, and vice versa, without fully switching. We implemented a set of peeking techniques and evaluated them in two levels of a cross-reality task: one requiring only viewing, and another requiring input and viewing. Peeking techniques made task completion faster, with increased input accuracy and reduced perceived workload.
2024 · Johann Wentzel et al. · University of Waterloo · Mixed Reality Workspaces · Context-Aware Computing · CHI
TimeTunnel: Integrating Spatial and Temporal Motion Editing for Character Animation in Virtual Reality
Editing character motion in Virtual Reality is challenging, as it requires working with both spatial and temporal data using controls with multiple degrees of freedom. Spatial and temporal controls are typically separated, making it difficult to adjust poses over time and predict the effects across adjacent frames. To address this challenge, we propose TimeTunnel, an immersive motion editing interface that integrates spatial and temporal control for 3D character animation in VR. TimeTunnel provides an approachable editing experience via KeyPoses and Trajectories. KeyPoses are a set of representative poses automatically computed to concisely depict motion. Trajectories are 3D animation curves that pass through the joints of KeyPoses to represent in-betweens. TimeTunnel integrates spatial and temporal control by superimposing Trajectories and KeyPoses onto a 3D character. We conducted two studies to evaluate TimeTunnel. In our quantitative study, TimeTunnel reduced the time required to edit motion and the effort needed to locate target poses. Our qualitative study with domain experts demonstrated that TimeTunnel is an approachable interface that simplifies motion editing while preserving a direct representation of motion.
2024 · Qian Zhou et al. · Autodesk Research · Immersion & Presence Research · 3D Modeling & Animation · CHI
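The abstract does not say how KeyPoses are computed. One standard way to pick representative poses, shown here purely as an assumed illustration and not necessarily the authors' algorithm, is to greedily add the frame that linear interpolation between the current key poses reconstructs worst.

```python
import numpy as np

def select_keyposes(motion: np.ndarray, n_keys: int) -> list[int]:
    """motion: (frames, joints*3) array of poses; returns key frame indices."""
    keys = [0, len(motion) - 1]                     # always keep the endpoints
    while len(keys) < n_keys:
        keys.sort()
        worst_frame, worst_err = None, -1.0
        for a, b in zip(keys, keys[1:]):            # scan each key-to-key segment
            for f in range(a + 1, b):
                t = (f - a) / (b - a)
                interp = (1 - t) * motion[a] + t * motion[b]
                err = np.linalg.norm(motion[f] - interp)
                if err > worst_err:
                    worst_frame, worst_err = f, err
        if worst_frame is None:                     # no in-between frames left
            break
        keys.append(worst_frame)                    # keep the worst-reconstructed frame
    return sorted(keys)

walk = np.random.rand(120, 51)   # 120 frames of a 17-joint pose (x, y, z per joint)
print(select_keyposes(walk, n_keys=5))
```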
WorldSmith: A Multi-Modal Image Synthesis Tool for Fictional World Building
Crafting a rich and unique environment is crucial for fictional world-building, but can be difficult to achieve since illustrating a world from scratch requires time and significant skill. We investigate the use of recent multi-modal image generation systems to enable users to iteratively visualize and modify elements of their fictional world using a combination of text input, sketching, and region-based filling. WorldSmith enables novice world builders to quickly visualize a fictional world with layered edits and hierarchical compositions. Through a formative study (4 participants) and a first-use study (13 participants), we demonstrate that WorldSmith offers more expressive interactions with prompt-based models. With this work, we explore how creatives can be empowered to leverage prompt-based generative AI as a tool in their creative process, beyond current "click-once" prompting UI paradigms.
2023 · Hai Duong Dang et al. · Generative AI (Text, Image, Music, Video) · AI-Assisted Creative Writing · Graphic Design & Typography Tools · UIST
3DALL-E: Integrating Text-to-Image AI in 3D Design Workflows
Text-to-image AI systems are capable of generating novel images for inspiration, but their applications in 3D design workflows, and how designers can build 3D models from AI-provided inspiration, have not yet been explored. To investigate this, we integrated DALL-E, GPT-3, and CLIP within CAD software to create 3DALL-E, a plugin that generates 2D image inspiration for 3D design. 3DALL-E allows users to construct text and image prompts based on what they are modeling. In a study with 13 designers, we found that designers saw great potential for 3DALL-E within their workflows and could use text-to-image AI to produce reference images, prevent design fixation, and inspire design considerations. We elaborate on prompting patterns observed across 3D modeling tasks and provide measures of prompt complexity observed across participants. From our findings, we discuss how 3DALL-E can merge with existing generative design workflows and propose prompt bibliographies as a form of human-AI design history.
2023 · Vivian Liu et al. · Generative AI (Text, Image, Music, Video) · Customizable & Personalized Objects · DIS
Immersive Sampling: Exploring Sampling for Future Creative Practices in Media-Rich, Immersive Spaces
Creative practitioners rely on sampling to understand, explore, and construct problems, or to gather resources for later use. Although practitioners can now experience immersive environments, sampling from them remains limited to primarily visual captures (e.g., screenshots, videos), which overlook the richness and variety of available media. To address these challenges, we describe "Immersive Sampling" as a new way to frame information gathering in the context of immersive environments. In Immersive Sampling, practitioners experience immersive environments while capturing, organizing, revisiting, and remixing found content. We situate this subset of tasks in the literature and argue for its importance to emerging content creation domains. To further explore how Immersive Sampling might take place, we created VRicolage, a proof-of-concept prototype showcasing a set of Virtual Reality interactions for sampling, revisiting, and remixing captures. Given the democratization of immersive environments, Immersive Sampling provides practitioners with a means to collect, revisit, and remix digital materials.
2023 · Evgeny Stemasov et al. · Immersion & Presence Research · Interactive Narrative & Immersive Storytelling · DIS
Tesseract: Querying Spatial Design Recordings by Manipulating Worlds in Miniature
New immersive 3D design tools enable the creation of spatial design recordings, which capture collaborative design activities. By reviewing captured spatial design sessions, including user activities, workflows, and tool use, users can reflect on their own design processes, learn new workflows, and understand others' design rationale. However, finding interesting moments in design activities can be challenging: the recordings contain multimodal data (such as user motion and logged events) occurring over time, which can be difficult to specify when searching, and are typically distributed over many sessions. We present Tesseract, a Worlds-in-Miniature-based system for expressively querying VR spatial design recordings. Tesseract consists of the Search Cube interface, a centralized stage-to-search container, and four querying tools for specifying multimodal data, enabling users to find interesting moments in past design activities. We studied ten participants who used Tesseract and found support for our miniature-based stage-to-search approach.
2023 · Karthik Mahadevan et al. · University of Toronto · Mixed Reality Workspaces · Computational Methods in HCI · CHI
AvatAR: An Immersive Analysis Environment for Human Motion Data Combining Interactive 3D Avatars and Trajectories
Analysis of human motion data can reveal valuable insights about the utilization of space and the interaction of humans with their environment. To support this, we present AvatAR, an immersive analysis environment for the in-situ visualization of human motion data that combines 3D trajectories, virtual avatars of people's movement, and a detailed representation of their posture. Additionally, we describe how to embed visualizations directly into the environment, showing what a person looked at or what surfaces they touched, and how the avatar's body parts can be used to access and manipulate those visualizations. AvatAR combines an AR HMD with a tablet to provide both mid-air and touch interaction for system control, as well as an additional overview to help users navigate the environment. We implemented a prototype and present several scenarios to show that AvatAR can enhance the analysis of human motion data by making the data not only explorable, but experienceable.
2022 · Patrick Reipschläger et al. · Autodesk Research, Technische Universität Dresden · Human Pose & Activity Recognition · Social & Collaborative VR · AR Navigation & Context Awareness · CHI
Supercharging Trial-and-Error for Learning Complex Software Applications
Despite an abundance of carefully crafted tutorials, trial-and-error remains many people's preferred way to learn complex software. Yet approaches to facilitate trial-and-error (such as tooltips) have evolved very little since the 1980s. While existing mechanisms work well for simple software, they scale poorly to large, feature-rich applications. In this paper, we explore new techniques to support trial-and-error in complex applications. We identify key benefits and challenges of trial-and-error, and introduce a framework with a conceptual model and design space. Using this framework, we developed three techniques: ToolTrack, to keep track of trial-and-error progress; ToolTrip, to go beyond trial-and-error of single commands by highlighting related commands that are frequently used together; and ToolTaste, to quickly and safely try commands. We demonstrate how these techniques facilitate trial-and-error through a proof-of-concept implementation in the CAD software Fusion 360. We conclude by discussing possible scenarios and outlining directions for future research on trial-and-error.
2022 · Damien Masson et al. · Autodesk Research, University of Waterloo · Head-Up Display (HUD) & Advanced Driver Assistance Systems (ADAS) · Privacy by Design & User Control · Knowledge Worker Tools & Workflows · CHI
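ToolTrip's notion of commands that are "frequently used together" can be illustrated with a simple session co-occurrence count. This sketch is our own assumption about the underlying idea, not the paper's implementation.

```python
from collections import Counter
from itertools import combinations

def related_commands(sessions: list[list[str]], command: str, top_k: int = 3) -> list[str]:
    """sessions: command sequences from past usage logs; returns commands
    most often co-occurring with `command` in the same session."""
    pair_counts: Counter = Counter()
    for session in sessions:
        for a, b in combinations(sorted(set(session)), 2):
            pair_counts[(a, b)] += 1
    scores: Counter = Counter()
    for (a, b), n in pair_counts.items():
        if a == command:
            scores[b] = n
        elif b == command:
            scores[a] = n
    return [c for c, _ in scores.most_common(top_k)]

logs = [["Sketch", "Extrude", "Fillet"], ["Sketch", "Extrude"], ["Extrude", "Shell"]]
print(related_commands(logs, "Extrude"))  # -> ['Sketch', 'Fillet', 'Shell']
```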
In-Depth Mouse: Integrating Desktop Mouse into Virtual Reality
Virtual Reality (VR) has potential for productive knowledge work; however, mid-air pointing with controllers or hand gestures does not offer the precision and comfort of a traditional 2D mouse. Directly integrating mice into VR is difficult because selecting targets in 3D space is negatively impacted by binocular rivalry, perspective mismatch, and improperly calibrated control-display (CD) gain. To address these issues, we developed Depth-Adaptive Cursor, a 2D-mouse-driven pointing technique for 3D selection that continuously interpolates the cursor depth by inferring what users intend to select from the cursor position, the viewpoint, and the selectable objects. Depth-Adaptive Cursor uses a novel CD gain tool to compute a usable range of CD gains for general mouse-based pointing in VR. A user study demonstrated that Depth-Adaptive Cursor significantly improved performance compared with an existing mouse-based pointing technique without depth adaptation, in terms of time (21.2%), error (48.3%), perceived workload, and user satisfaction.
2022 · Qian Zhou et al. · Autodesk Research · Eye Tracking & Gaze Interaction · Mixed Reality Workspaces · CHI
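A simplified sketch of the depth-adaptation idea, under our own assumptions rather than the paper's implementation: the cursor depth along the view ray is a weighted average of the selectable objects' depths, with weights falling off with angular distance from the ray, so nearby on-axis targets dominate.

```python
import numpy as np

def adaptive_cursor_depth(eye: np.ndarray, ray_dir: np.ndarray,
                          objects: list, sigma: float = 0.05) -> float:
    """eye: viewpoint; ray_dir: unit direction through the 2D cursor;
    objects: selectable object centers; returns an interpolated cursor depth."""
    depths, weights = [], []
    for obj in objects:
        to_obj = obj - eye
        depth = float(np.dot(to_obj, ray_dir))       # depth of the object along the ray
        if depth <= 0:
            continue                                 # ignore objects behind the viewpoint
        off_axis = np.linalg.norm(to_obj - depth * ray_dir) / depth  # angular offset
        depths.append(depth)
        weights.append(np.exp(-(off_axis / sigma) ** 2))             # nearer the ray, heavier
    if not weights:
        return 1.0                                   # fallback depth when nothing is near
    return float(np.average(depths, weights=weights))

eye, ray = np.zeros(3), np.array([0.0, 0.0, 1.0])
objs = [np.array([0.01, 0.0, 2.0]), np.array([0.5, 0.0, 5.0])]
print(adaptive_cursor_depth(eye, ray, objs))  # ~2.05: the near, on-axis object dominates
```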
"I don't want to feel like I'm working in a 1960s factory": The Practitioner Perspective on Creativity Support Tool AdoptionWith the rapid development of creativity support tools, creative practitioners (e.g., designers, artists, architects) have to constantly explore and adopt new tools into their practice. While HCI research has focused on developing novel creativity support tools, little is known about creative practitioner's values when exploring and adopting these tools. We collect and analyze 23 videos, 13 interviews, and 105 survey responses of creative practitioners reflecting on their values to derive a value framework. We find that practitioners value the tools' functionality, integration into their current workflow, performance, user interface and experience, learning support, costs and emotional connection, in that order. They largely discover tools through personal recommendations. To help unify and encourage reflection from the wider community of CST stakeholders (e.g., systems creators, researchers, marketers, educators), we situate the framework within existing research on systems, creativity support tools and technology adoption.2022SPSrishti Palani et al.Autodesk Research, University of CaliforniaGenerative AI (Text, Image, Music, Video)Creative Collaboration & Feedback SystemsCHI
Designing Co-Creative AI for Virtual Environments
Co-creative AI tools provide a method of creative collaboration between a user and a machine. One form of co-creative AI, generative design, requires the user to input design parameters and then wait substantial periods of time while the system computes design solutions. We explore this interaction dynamic by providing an embodied experience in VR. Calliope is a virtual reality (VR) system that enables users to explore and manipulate generative design solutions in real time. Calliope accounts for the typical idle times in the generative design process by using a virtual environment to encourage parallelized and embodied data exploration and synthesis, while maintaining a tight human-in-the-loop collaboration with the underlying algorithms. In this paper, we discuss design considerations informed by formative studies with generative designers and artists, and provide design guidelines to aid others in developing co-creative AI systems in virtual environments.
2021 · Josh Urban Davis et al. · Generative AI (Text, Image, Music, Video) · Creative Collaboration & Feedback Systems · C&C
Think-Aloud Computing: Supporting Rich and Low-Effort Knowledge Capture
When users complete tasks on the computer, the knowledge they leverage and their intent are often lost because they are tedious or challenging to capture. This makes it harder to understand why a colleague designed a component a certain way, or to remember the requirements for software you wrote a year ago. We introduce think-aloud computing, a novel application of the think-aloud protocol in which computer users are encouraged to speak while working, capturing rich knowledge with relatively low effort. Through a formative study, we find that people share information about design intent, work processes, problems encountered, to-do items, and other useful information. We developed a prototype that supports think-aloud computing by prompting users to speak and contextualizing their speech with labels and application context. Our evaluation shows that more subtle design decisions and process explanations were captured with think-aloud than via traditional documentation. Participants reported that think-aloud required effort similar to traditional documentation.
2021 · Rebecca Krosnick et al. · University of Michigan · Knowledge Worker Tools & Workflows · Prototyping & User Testing · CHI
MakeAware: Designing to Support Situation Awareness in Makerspaces
People new to making and makerspaces often struggle with identifying what tools are available and where they are, understanding how to operate the tools, and predicting how their decisions will affect the final product. From the literature on novices and our interviews with expert makers, we identified situation awareness support as one possible way to address some of the challenges faced by novices. We present a set of design goals intended to scaffold situation awareness in a makerspace, and MakeAware, a prototype system we implemented based on those design goals. MakeAware provides a combination of environmental cues, information about the project process, and background knowledge. In a preliminary evaluation, we found that MakeAware can help novices make conscious choices during a project and put more emphasis on planning, thereby exhibiting traits associated with having situation awareness while making.
2020 · Jessi Stark et al. · Context-Aware Computing · Customizable & Personalized Objects · DIS
MicroMentor: Peer-to-Peer Software Help Sessions in Three Minutes or Less
While synchronous one-on-one help for software learning is rich and valuable, it can be difficult to find and connect with someone who can provide assistance. Through a formative user study, we explore the idea of fixed-duration, one-on-one help sessions and find that 3 minutes is often enough time for novice users to explain their problem and receive meaningful help from an expert. To facilitate this type of interaction, we developed MicroMentor, an on-demand help system that connects users via video chat for 3-minute help sessions. MicroMentor automatically attaches relevant supplementary materials and uses contextual information, such as command history and expertise, to encourage the most qualified users to accept incoming requests. These help sessions are recorded and archived, building a bank of knowledge that can further help a broader audience. Through a user study, we find MicroMentor to be useful and successful in connecting users for short teaching moments.
2020 · Nikhita Joshi et al. · Autodesk Research & University of Waterloo · Collaborative Learning & Peer Teaching · Knowledge Worker Tools & Workflows · CHI
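One plausible, purely hypothetical reading of "encourage the most qualified users to accept incoming requests" is to rank candidate helpers by the overlap between their command history and the commands involved in the request, scaled by expertise. The matching rule and names below are our own illustration.

```python
def rank_helpers(request_commands: set, helpers: dict) -> list:
    """helpers: name -> (command history as a set, expertise in [0, 1]).
    Returns helper names ordered from most to least qualified."""
    def score(entry):
        history, expertise = entry
        overlap = len(request_commands & history) / max(len(request_commands), 1)
        return overlap * expertise
    return sorted(helpers, key=lambda name: score(helpers[name]), reverse=True)

helpers = {"ana": ({"Loft", "Extrude"}, 0.9), "bo": ({"Render"}, 0.8)}
print(rank_helpers({"Extrude", "Fillet"}, helpers))  # -> ['ana', 'bo']
```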
Instrumenting and Analyzing Fabrication Activities, Users, and Expertise
The recent proliferation of fabrication and making activities has introduced a large number of users to a variety of tools and equipment. Monitored, reactive, and adaptive fabrication spaces are needed to provide personalized information, feedback, and assistance to users. This paper explores the sensorization of making and fabrication activities, where the environment, tools, and users are treated as separate entities that can be instrumented for data collection. From this exploration, we present the design of a modular system that can capture data from the varied sensors and infer contextual information. Using this system, we collected data from fourteen participants with varying levels of expertise as they performed seven representative making tasks. From the collected data, we predict which activities are being performed, which users are performing them, and what expertise the users have. We present several use cases of this contextual information for future interactive fabrication spaces.
2019 · Jun Gong et al. · Autodesk Research & Dartmouth College · Desktop 3D Printing & Personal Fabrication · Circuit Making & Hardware Prototyping · Computational Methods in HCI · CHI
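The prediction step can be sketched as standard supervised classification over windowed sensor features. The features and model below are illustrative assumptions of ours, not the paper's; the same pattern would apply to predicting the user and their expertise level.

```python
from sklearn.ensemble import RandomForestClassifier

# Hypothetical per-window features from instrumented tools and the environment,
# e.g. tool vibration RMS, ambient sound level, and grip-event count.
X_train = [[0.8, 120.0, 3], [0.1, 20.0, 1], [0.9, 110.0, 4]]
y_train = ["sawing", "sanding", "sawing"]   # activity labels for each window

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Classify a new sensor window to infer the ongoing making activity.
print(clf.predict([[0.85, 115.0, 3]]))      # -> ['sawing']
```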
Geppetto: Enabling Semantic Design of Expressive Robot Behaviors
Expressive robots are useful in many contexts, from industrial to entertainment applications. However, designing expressive robot behaviors requires editing a large number of unintuitive control parameters. We present an interactive, data-driven system that allows these complex parameters to be edited in a semantic space. Our system combines a physics-based simulation that captures the robot's motion capabilities with a crowd-powered framework that extracts relationships between the robot's motion parameters and the desired semantic behavior. These relationships enable mixed-initiative exploration of possible robot motions. We specifically demonstrate our system in the context of designing emotionally expressive behaviors. A user study finds the system useful for developing desirable robot behaviors more quickly than manual parameter editing.
2019 · Ruta Desai et al. · Carnegie Mellon University · Social Robot Interaction · Human-Robot Collaboration (HRC) · CHI
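As an assumed simplification of the crowd-powered step (the paper's actual framework may differ), one can fit a linear map from motion parameters to crowd ratings of a semantic label and read its gradient as the direction in parameter space that strengthens that reading. All data below are hypothetical.

```python
import numpy as np

# Hypothetical motion parameters (e.g. speed, amplitude, jerkiness) per design,
# and crowd-sourced ratings of how "happy" each resulting motion looked.
params = np.array([[0.2, 0.1, 0.9],
                   [0.8, 0.7, 0.2],
                   [0.5, 0.6, 0.4],
                   [0.3, 0.9, 0.5]])
happy = np.array([0.1, 0.9, 0.6, 0.7])

A = np.hstack([params, np.ones((len(params), 1))])   # add an intercept column
w, *_ = np.linalg.lstsq(A, happy, rcond=None)        # least-squares fit
gradient = w[:-1]                                    # direction of "happier" motion
print(gradient)  # positive entries: increase that parameter for a happier reading
```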