Adaptique: Multi-objective and Context-aware Online Adaptation of Selection Techniques in Virtual Reality
Selection is a fundamental task that is challenging in virtual reality due to issues such as distant and small targets, occlusion, and target-dense environments. Previous research has tackled these challenges with various selection techniques, but these techniques complicate selection and can be tedious outside their intended use cases. We present Adaptique, an adaptive model that infers and switches to the optimal selection technique based on user and environmental information. Adaptique considers contextual information such as target size, distance, occlusion, and user posture, combined with four objectives (speed, accuracy, comfort, and familiarity) grounded in fundamental predictive models of human movement. This enables Adaptique to select simple techniques when they are sufficiently efficient and more advanced techniques when necessary. In a user study, participants preferred Adaptique over single techniques and performed better with it, and we demonstrate Adaptique's versatility in an application.
2025 · Chao-Jung Lai et al. · UIST · Keywords: Full-Body Interaction & Embodied Input; Social & Collaborative VR; Mixed Reality Workspaces

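The multi-objective technique selection the abstract describes can be sketched as a per-target cost minimization over candidate techniques. This is a minimal sketch, assuming Fitts' law as the movement-time model; the technique list, parameter values, weights, and cost form below are illustrative assumptions, not Adaptique's actual model:

```python
import math

# Hypothetical technique parameters: a, b are Fitts' law intercept/slope
# (seconds); comfort and familiarity are normalized scores in [0, 1].
TECHNIQUES = {
    "ray_cast":   {"a": 0.20, "b": 0.25, "comfort": 0.9, "familiarity": 1.0},
    "cone_cast":  {"a": 0.35, "b": 0.15, "comfort": 0.8, "familiarity": 0.6},
    "gaze_pinch": {"a": 0.30, "b": 0.10, "comfort": 1.0, "familiarity": 0.4},
}

def fitts_time(a, b, distance, width):
    """Predicted movement time via Fitts' law (Shannon formulation)."""
    return a + b * math.log2(distance / width + 1)

def pick_technique(distance, width, weights=(0.5, 0.25, 0.25)):
    """Score each technique on speed, comfort, and familiarity; lowest cost wins."""
    w_speed, w_comfort, w_fam = weights
    best, best_cost = None, float("inf")
    for name, p in TECHNIQUES.items():
        t = fitts_time(p["a"], p["b"], distance, width)
        cost = (w_speed * t
                + w_comfort * (1 - p["comfort"])
                + w_fam * (1 - p["familiarity"]))
        if cost < best_cost:
            best, best_cost = name, cost
    return best
```

Under these illustrative numbers, a large nearby target favors the familiar ray cast, while a small distant target tips the cost toward a gaze-assisted technique.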
Viago: Exploring Visual-Audio Modality Transitions for Social Media Consumption on the Go
As mobile phone use while walking becomes increasingly prevalent, users often divide their visual attention between their surroundings and phone displays, raising concerns around safety and interaction efficiency. Alternative input and output modalities, such as eyes-free touch gestures and audio feedback, offer a promising avenue for reducing visual demands in these contexts. However, the design of seamless transitions between visual and audio modalities for mobile interaction on the go remains underexplored. To fill this gap, we conducted a design probe study with ten participants, simulating screen reader-like experiences across diverse applications to identify five key design insights and three design guidelines. Informed by these insights, we developed Viago, a background service that facilitates fluid transitions between visual and audio modalities for mobile task management while walking. A subsequent evaluation with thirteen participants demonstrated that Viago effectively supports on-the-go interactions by enabling users to interleave modalities as needed. We conclude by discussing the broader implications of visual-audio modality transitions and their potential to enhance mobile interactions in everyday, dynamic environments.
2025 · Yu-Cheng Chang et al. · UIST · Keywords: Eye Tracking & Gaze Interaction; Voice User Interface (VUI) Design; Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration)

StoryEnsemble: Enabling Dynamic Exploration & Iteration in the Design Process with AI and Forward-Backward Propagation
Design processes involve exploration, iteration, and movement across interconnected stages such as persona creation, problem framing, solution ideation, and prototyping. However, time and resource constraints often hinder designers from exploring broadly, collecting feedback, and revisiting earlier assumptions, making it difficult to uphold core design principles in practice. To better understand these challenges, we conducted a formative study with 15 participants, comprising UX practitioners, students, and instructors. Based on the findings, we developed StoryEnsemble, a tool that integrates AI into a node-link interface and leverages forward and backward propagation to support dynamic exploration and iteration across the design process. A user study with 10 participants showed that StoryEnsemble enables rapid, multi-directional iteration and flexible navigation across design stages. This work advances our understanding of how AI can foster more iterative design practices by introducing novel interactions that make exploration and iteration more fluid, accessible, and engaging.
2025 · Sangho Suh et al. · UIST · Keywords: Human-LLM Collaboration; Prototyping & User Testing

Squiggle: Multimodal Lasso Selection in the Real World
Smart glasses are emerging with egocentric cameras and gaze tracking, raising the possibility of new interaction techniques that enable users to reference real-world objects they wish to digitally interact with. However, many of these devices lack a display, making precise object referencing difficult due to the lack of continuous visual feedback. We introduce Squiggle, an interaction technique that enables users to reference real-world objects without continuous feedback by drawing an invisible loop or "lasso" with an imagined ray-cast pointer. Through a virtual reality data-collection study, we observed that this gesture can elicit useful gaze behavior in addition to providing drawing input itself. Based on these results, we implemented and evaluated a real-world prototype of Squiggle, demonstrating that it can improve the accuracy of object referencing over Gaze + Pinch alone, particularly for selecting compound objects and groups.
2025 · Jacqui Fashimpaur et al. · UIST · Keywords: Hand Gesture Recognition; Eye Tracking & Gaze Interaction; Context-Aware Computing

Authoring LLM-Based Assistance for Real-World Contexts and Tasks
Advances in AI hold the possibility of assisting users with highly varied and individual needs, but the breadth of assistance that these systems could provide creates a challenge for how users specify their goals to the system. To support the authoring of AI assistance for real-world tasks, we propose the concept of Contextually-Driven Prompts (CDPs) that define how an AI assistant should respond to real-world context. We implemented a prototype system for authoring and executing CDPs, which provides suggestions to assist users with finding the right level of assistance for their goal. We also conducted a user study (N=10) to investigate how participants express and refine their goals for real-world tasks. Results revealed a number of strategies for initiating and refining CDPs with suggestions, and implications for the design of future authoring interfaces.
2025 · Hai Dang et al. · IUI · Keywords: Human-LLM Collaboration; Context-Aware Computing

Exploring the Design Space of Cognitive Engagement Techniques with AI-Generated Code for Enhanced Learning
Novice programmers are increasingly relying on Large Language Models (LLMs) to generate code for learning programming concepts. However, this interaction can lead to superficial engagement, giving learners an illusion of learning and hindering skill development. To address this issue, we conducted a systematic design exploration to develop seven cognitive engagement techniques aimed at promoting deeper engagement with AI-generated code. In this paper, we describe our design process, the initial seven techniques, and results from a between-subjects study (N=82). We then iteratively refined the top techniques and further evaluated them through a within-subjects study (N=42). We evaluate the friction each technique introduces, their effectiveness in helping learners apply concepts to isomorphic tasks without AI assistance, and their success in aligning learners' perceived and actual coding abilities. Ultimately, our results highlight the most effective technique: guiding learners through the step-by-step problem-solving process, where they engage in an interactive dialog with the AI, prompting what needs to be done at each stage before the corresponding code is revealed.
2025 · Majeed Kazemitabaar et al. · IUI · Keywords: Human-LLM Collaboration; Programming Education & Computational Thinking

A Dynamic Bayesian Network Based Framework for Multimodal Context-Aware Interactions
Multimodal context-aware interactions integrate multiple sensory inputs, such as gaze, gestures, speech, and environmental signals, to provide adaptive support across diverse user contexts. Building such systems is challenging due to the complexity of sensor fusion, real-time decision-making, and managing uncertainties from noisy inputs. To address these challenges, we propose a hybrid approach combining a dynamic Bayesian network (DBN) with a large language model (LLM). The DBN offers a probabilistic framework for modeling variables, relationships, and temporal dependencies, enabling robust, real-time inference of user intent, while the LLM incorporates world knowledge for contextual reasoning beyond explicitly modeled relationships. We demonstrate our approach with a tri-level DBN implementation for tangible interactions, integrating gaze and hand actions to infer user intent in real time. A user evaluation with 10 participants in an everyday office scenario showed that our system can accurately and efficiently infer user intentions, achieving 0.83 per-frame accuracy, even in complex environments. These results validate the effectiveness of the DBN+LLM framework for multimodal context-aware interactions.
2025 · Joel Chan et al. · IUI · Keywords: Context-Aware Computing; Computational Methods in HCI

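The per-frame intent inference the abstract describes can be illustrated with a forward-filtering update over a simple temporal model: predict the next intent state from the transition model, then correct with the current frame's evidence. This is a minimal sketch; the two intent states, the transition/emission probabilities, and the discretized gaze/hand observations below are invented for illustration and are not the paper's tri-level DBN:

```python
import numpy as np

# Hypothetical two-state intent variable: "grab" vs. "inspect".
STATES = ["grab", "inspect"]
TRANSITION = np.array([[0.9, 0.1],   # P(state_t | state_{t-1})
                       [0.2, 0.8]])
# P(observation | state); observation columns (discretized cues):
# 0 = gaze on object + hand moving, 1 = gaze on object, 2 = gaze away
EMISSION = np.array([[0.7, 0.2, 0.1],
                     [0.2, 0.5, 0.3]])

def filter_step(belief, obs_idx):
    """One forward-filtering update: predict via transitions, weight by
    the evidence likelihood, then renormalize to a distribution."""
    predicted = TRANSITION.T @ belief
    updated = predicted * EMISSION[:, obs_idx]
    return updated / updated.sum()

belief = np.array([0.5, 0.5])          # uniform prior over intents
for obs in [0, 0, 1]:                  # gaze/hand cues over three frames
    belief = filter_step(belief, obs)  # belief stays a valid distribution
```

After two frames of "gaze on object + hand moving" evidence, the belief concentrates on the "grab" intent, which is the kind of frame-by-frame accumulation that makes DBN inference robust to individual noisy observations.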
Assistance or Disruption? Exploring and Evaluating the Design and Trade-offs of Proactive AI Programming Support
AI programming tools enable powerful code generation, and recent prototypes attempt to reduce user effort with proactive AI agents, but their impact on programming workflows remains unexplored. We introduce and evaluate Codellaborator, a design probe LLM agent that initiates programming assistance based on editor activities and task context. We explored three interface variants to assess trade-offs between increasingly salient AI support: prompt-only, proactive agent, and proactive agent with presence and context (Codellaborator). In a within-subjects study (N=18), we find that proactive agents increase efficiency compared to the prompt-only paradigm, but also incur workflow disruptions. However, presence indicators and interaction-context support alleviated disruptions and improved users' awareness of AI processes. We underscore Codellaborator's trade-offs in user control, ownership, and code understanding, emphasizing the need to adapt proactivity to programming processes. Our research contributes to the design exploration and evaluation of proactive AI systems, presenting design implications for AI-integrated programming workflows.
2025 · Kevin Pu et al. (University of Toronto, Department of Computer Science) · CHI · Keywords: Human-LLM Collaboration; AI-Assisted Decision-Making & Automation; Prototyping & User Testing

IdeaSynth: Iterative Research Idea Development Through Evolving and Composing Idea Facets with Literature-Grounded Feedback
Research ideation involves broadly exploring and deeply refining ideas, and both require deep engagement with literature. Existing tools focus primarily on broad idea generation, yet offer little support for the iterative specification, refinement, and evaluation needed to further develop initial ideas. To bridge this gap, we introduce IdeaSynth, a research idea development system that uses LLMs to provide literature-grounded feedback for articulating research problems, solutions, evaluations, and contributions. IdeaSynth represents these idea facets as nodes on a canvas, and allows researchers to iteratively refine them by creating and exploring variations and combinations. Our lab study (N=20) showed that participants, while using IdeaSynth, explored more alternative ideas and expanded initial ideas with more details compared to a strong LLM-based baseline. Our deployment study (N=7) demonstrated that participants effectively used IdeaSynth for real-world research projects at various ideation stages, from developing initial ideas to revising framings of mature manuscripts, highlighting the possibilities of adopting IdeaSynth in researchers' workflows.
2025 · Kevin Pu et al. (University of Toronto, Department of Computer Science) · CHI · Keywords: Human-LLM Collaboration; Crowdsourcing Task Design & Quality Control; User Research Methods (Interviews, Surveys, Observation)

MaRginalia: Enabling In-person Lecture Capturing and Note-taking Through Mixed Reality
Students often take digital notes during live lectures, but current methods can be slow when capturing information from lecture slides or the instructor's speech, and require them to focus on their devices, leading to distractions and missing important details. This paper explores supporting live lecture note-taking with mixed reality (MR) to quickly capture lecture information and take notes while staying engaged with the lecture. A survey and interviews with university students revealed common note-taking behaviors and challenges to inform the design. We present MaRginalia to provide digital note-taking with a stylus tablet and MR headset. Students can take notes with an MR representation of the tablet, lecture slides, and audio transcript without looking down at their device. When preferred, students can also perform detailed interactions by looking at the physical tablet. We demonstrate the feasibility and usefulness of MaRginalia and MR-based note-taking in a user study with 12 students.
2025 · Leping Qiu et al. (University of Toronto, Department of Computer Science) · CHI · Keywords: Mixed Reality Workspaces; Collaborative Learning & Peer Teaching

Paratrouper: Exploratory Creation of Character Cast Visuals Using Generative AI
Great characters are critical to the success of many forms of media, such as comics, games, and films. Designing visually compelling casts of characters requires significant skill and consideration, and there is a lack of specialized tools to support this endeavor. We investigate how AI-driven image-generation techniques can empower creatives to explore a variety of visual design possibilities for individual characters and groups of characters. Informed by interviews with character designers, Paratrouper is a multi-modal system that enables creating and experimenting with multiple permutations of character casts and visualizing them in various contexts as part of a holistic approach to design. We demonstrate how Paratrouper supports different aspects of the character design process, and share insights from its use by eight creators. Our work highlights the interplay between creative agency and serendipity, as well as the visual interrelationships among character aesthetics.
2025 · Joanne Leong et al. (MIT Media Lab) · CHI · Keywords: Generative AI (Text, Image, Music, Video); 3D Modeling & Animation

A Multimodal Approach for Targeting Error Detection in Virtual Reality Using Implicit User Behavior
Although the point-and-select interaction method has been shown to lead to user- and system-initiated errors, it is still prevalent in VR scenarios. Current solutions to facilitate selection interactions exist; however, they do not address the challenges caused by targeting inaccuracy. To reduce the effort required to target objects, we developed a model that quickly detects targeting errors after they occur. The model uses implicit multimodal user behavioral data to identify possible targeting outcomes. Using a dataset of 23 participants engaged in VR targeting tasks, we trained a deep learning model to differentiate between correct and incorrect targeting events within 0.5 seconds of a selection, achieving an AUC-ROC of 0.9. A user study with 25 participants then evaluated the model's utility, showing that participants recovered from more errors, and recovered faster, when assisted by the model. These results advance our understanding of targeting errors in VR and facilitate the design of future intelligent error-aware systems.
2025 · Naveen Sendhilnathan et al. (Meta) · CHI · Keywords: Social & Collaborative VR; Immersion & Presence Research; Human-LLM Collaboration

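The post-selection classification the abstract describes can be sketched as feature extraction over the half-second of implicit behavior after a selection, followed by a probabilistic error score. The paper trains a deep model; the logistic scorer, the feature names (gaze_speed, hand_jerk), and the weights below are simplified, invented stand-ins:

```python
import math

WINDOW_S = 0.5  # classify within 0.5 s of each selection event

def extract_features(samples):
    """Summarize implicit post-selection behavior over the window.
    Each sample is a dict of hypothetical signals sampled per frame."""
    n = len(samples)
    mean_gaze = sum(s["gaze_speed"] for s in samples) / n
    mean_jerk = sum(s["hand_jerk"] for s in samples) / n
    return mean_gaze, mean_jerk

def error_probability(samples, w=(2.0, 1.5), bias=-3.0):
    """Logistic score: rapid corrective gaze and hand movement right after
    a selection push the probability toward 'targeting error'.
    Weights here are illustrative, not a trained model's."""
    g, j = extract_features(samples)
    z = bias + w[0] * g + w[1] * j
    return 1.0 / (1.0 + math.exp(-z))
```

With such a score in hand, an error-aware system could trigger assistance (e.g., offering an undo or re-target prompt) whenever the probability crosses a threshold within the 0.5 s window.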
Improving Steering and Verification in AI-Assisted Data Analysis with Interactive Task Decomposition
LLM-powered tools like ChatGPT Data Analysis have the potential to help users tackle the challenging task of data analysis programming, which requires expertise in data processing, programming, and statistics. However, our formative study (n=15) uncovered serious challenges in verifying AI-generated results and steering the AI (i.e., guiding the AI system to produce the desired output). We developed two contrasting approaches to address these challenges. The first (Stepwise) decomposes the problem into step-by-step subgoals with pairs of editable assumptions and code until task completion, while the second (Phasewise) decomposes the entire problem into three editable, logical phases: structured input/output assumptions, execution plan, and code. A controlled, within-subjects experiment (n=18) compared these systems against a conversational baseline. Users reported significantly greater control with the Stepwise and Phasewise systems, and found intervention, correction, and verification easier, compared to the baseline. The results suggest design guidelines and trade-offs for AI-assisted data analysis tools.
2024 · Majeed Kazemitabaar et al. · UIST · Keywords: Human-LLM Collaboration; Explainable AI (XAI); Interactive Data Visualization

Desk2Desk: Optimization-based Mixed Reality Workspace Integration for Remote Side-by-side Collaboration
Mixed Reality enables hybrid workspaces where physical and virtual monitors are adaptively created and moved to suit the current environment and needs. However, in shared settings, individual users' workspaces are rarely aligned and can vary significantly in the number of monitors, available physical space, and workspace layout, creating inconsistencies between workspaces, which may cause confusion and reduce collaboration. We present Desk2Desk, an optimization-based approach for remote collaboration in which the hybrid workspaces of two collaborators are fully integrated to enable immersive side-by-side collaboration. The optimization adjusts each user's workspace layout and number of shared monitors and creates a mapping between workspaces to handle inconsistencies due to physical constraints (e.g., physical monitors). We show in a user study how our system adaptively merges dissimilar physical workspaces to enable immersive side-by-side collaboration, and demonstrate how an optimization-based approach can effectively address dissimilar physical layouts.
2024 · Ludwig Sidenmark et al. · UIST · Keywords: Mixed Reality Workspaces; Distributed Team Collaboration

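The mapping between mismatched workspaces can be framed as an assignment problem: pair each of one user's monitors with one of the other's so that total displacement between paired monitors is minimized. A brute-force sketch under that framing; the layouts and the squared-distance cost are illustrative assumptions, not Desk2Desk's actual objective:

```python
from itertools import permutations

# Hypothetical monitor layouts: (x, y) centers in each user's workspace, in metres.
USER_A = [(-0.6, 0.0), (0.0, 0.0), (0.6, 0.0)]
USER_B = [(0.0, 0.0), (0.5, 0.3), (-0.5, 0.3)]

def match_monitors(a, b):
    """Brute-force assignment minimizing summed squared displacement.
    Returns (mapping, cost), where mapping[i] is the index in b paired
    with monitor i in a. Fine for a handful of monitors; a real system
    would use the Hungarian algorithm for larger counts."""
    best_cost, best_map = float("inf"), None
    for perm in permutations(range(len(b))):
        cost = sum((ax - b[j][0]) ** 2 + (ay - b[j][1]) ** 2
                   for (ax, ay), j in zip(a, perm))
        if cost < best_cost:
            best_cost, best_map = cost, perm
    return best_map, best_cost
```

A full workspace optimizer would add terms for physical constraints (e.g., a monitor that cannot move), but the core idea of scoring candidate mappings and keeping the cheapest one is the same.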
GraspUI: Seamlessly Integrating Object-Centric Gestures within the Seven Phases of Grasping
Objects are indispensable tools in our daily lives. Recent research has demonstrated their potential to act as conduits for digital interactions with microgestures; however, the primary focus has been on situations where the hand firmly grasps an object. We introduce GraspUI, an exploratory design space of object-centric gestures within the seven distinct phases of the grasping process, spanning pre-, during-, and post-grasp movements. We conducted ideation sessions with mixed-reality designers from industry and academia to explore gesture integration throughout the entire grasping process. The outcome was 38 storyboards envisioning practical applications. To evaluate the design space's utility, we performed a video-based assessment with end-users. We then implemented an interactive prototype and quantified the overhead cost of performing the proposed gestures through a secondary study. Participants reacted positively to the gestures and could integrate them into their existing usage of objects. To conclude, we highlight technical and usability guidelines for implementing and extending GraspUI systems.
2024 · Adwait Sharma et al. · DIS · Keywords: Haptic Wearables; Shape-Changing Interfaces & Soft Robotic Materials; Hand Gesture Recognition

Body Language for VUIs: Exploring Gestures to Enhance Interactions with Voice User Interfaces
With the progress in Large Language Models (LLMs) and the rapid development of wearable smart devices like smart glasses, there is a growing opportunity for users to interact with on-device virtual assistants through voice and gestures with ease. Although voice user interfaces (VUIs) have been widely studied, the potential uses of full-body gestures in VUIs that can fully understand users' surroundings and gestures are relatively unexplored. In this two-phase research using a Wizard-of-Oz approach, we aim to investigate the role of gestures in VUI interactions and explore their design space. In an initial exploratory user study with six participants, we identify influential factors for VUI gestures and establish an initial design space. In the second phase, we conducted a user study with 12 participants to validate and refine our initial findings. Our results showed that users are open and ready to adopt and utilize gestures to interact with multi-modal VUIs, especially in scenarios with poor voice capture quality. The study also highlighted three key categories of gesture functions for enhancing multi-modal VUI interactions: context reference, alternative input, and flow control. Finally, we present a design space for multi-modal VUI gestures along with demonstrations to enlighten future design for coupling multi-modal VUIs with gestures.
2024 · Liwei Wu et al. · DIS · Keywords: Hand Gesture Recognition; Full-Body Interaction & Embodied Input; Voice User Interface (VUI) Design

Fidgets: Building Blocks for a Predictive UI Toolkit
The rapid growth of AR platforms, combined with the rising predictive power of intelligent systems, will fundamentally change interactive computing. Interaction will increasingly happen on the go, causing I/O to become constrained, ultimately leading to reliance on user intent prediction for aid. In this pictorial, we argue that to support the development of such systems, new predictive UI toolkits are required. We place the reader in the shoes of an app designer and outline the challenges that will be faced. We then describe a new predictive toolkit, leveraging Fuzzy Widgets, or "Fidgets," as the main UI building block. Fidgets extend Responsive Design into the realm of intelligent systems, adapting not only to spatial constraints, but to system predictions as well. We then describe a working implementation of a predictive music application, built using our described framework, showcasing its benefits and range of adaptive abilities.
2024 · Joannes Chan et al. · DIS · Keywords: AR Navigation & Context Awareness; Recommender System UX; Context-Aware Computing

ABScribe: Rapid Exploration & Organization of Multiple Writing Variations in Human-AI Co-Writing Tasks using Large Language Models
Exploring alternative ideas by rewriting text is integral to the writing process. State-of-the-art Large Language Models (LLMs) can simplify writing variation generation. However, current interfaces pose challenges for simultaneous consideration of multiple variations: creating new variations without overwriting text can be difficult, and pasting them sequentially can clutter documents, increasing workload and disrupting writers' flow. To tackle this, we present ABScribe, an interface that supports rapid, yet visually structured, exploration and organization of writing variations in human-AI co-writing tasks. With ABScribe, users can swiftly modify variations using LLM prompts, which are auto-converted into reusable buttons. Variations are stored adjacently within text fields for rapid in-place comparisons using mouse-over interactions on a popup toolbar. Our user study with 12 writers shows that ABScribe significantly reduces task workload (d = 1.20, p < 0.001), enhances user perceptions of the revision process (d = 2.41, p < 0.001) compared to a popular baseline workflow, and provides insights into how writers explore variations using LLMs.
2024 · Mohi Reza et al. (University of Toronto) · CHI · Keywords: Human-LLM Collaboration; Prototyping & User Testing

PhoneInVR: An Evaluation of Spatial Anchoring and Interaction Techniques for Smartphone Usage in Virtual Reality
When users wear a virtual reality (VR) headset, they lose access to their smartphone and accompanying apps. Past work has proposed smartphones as enhanced VR controllers, but little work has explored using existing smartphone apps and performing traditional smartphone interactions while in VR. In this paper, we consider three potential spatial anchorings for rendering smartphones in VR: on top of a tracked physical smartphone which the user holds (Phone-locked), on top of the user's empty hand, as if holding a virtual smartphone (Hand-locked), or in a static position in front of the user (World-locked). We conducted a comparative study of target acquisition, swiping, and scrolling tasks across these anchorings using direct Touch or above-the-surface Pinch. Our findings indicate that physically holding a smartphone with Touch improves accuracy and speed for all tasks, and Pinch performed better with virtual smartphones. These findings provide a valuable foundation to enable smartphones in VR.
2024 · Fengyuan Zhu et al. (University of Toronto) · CHI · Keywords: Eye Tracking & Gaze Interaction; Mixed Reality Workspaces

SwitchSpace: Understanding Context-Aware Peeking Between VR and Desktop Interfaces
Cross-reality tasks, like creating or consuming virtual reality (VR) content, often involve inconvenient or distracting switches between desktop and VR. An initial formative study explores cross-reality switching habits, finding most switches are momentary "peeks" between interfaces, with specific habits determined by current context. The results inform a design space for context-aware "peeking" techniques that allow users to view or interact with the desktop from VR, and vice versa, without fully switching. We implemented a set of peeking techniques and evaluated them in two levels of a cross-reality task: one requiring only viewing, and another requiring input and viewing. Peeking techniques made task completion faster, with increased input accuracy and reduced perceived workload.
2024 · Johann Wentzel et al. (University of Waterloo) · CHI · Keywords: Mixed Reality Workspaces; Context-Aware Computing