ImaginationVellum: Generative-AI Ideation Canvas with Spatial Prompts, Generative Strokes, and Ideation History
We introduce ImaginationVellum, a multi-modal spatial canvas for early-stage visual ideation and concept sketching with generative AI. The system supports a unique style of human-AI co-creation where the canvas is the prompt: ImaginationVellum employs the entire 2D canvas as an active prompt space, where the spatial arrangement, proximity, and composition of diverse content elements (inking, text, images, and intermediate results) steer generative visual outcomes. As a technical probe, ImaginationVellum contributes a set of spatially grounded direct-manipulation tools for iterative visual ideation. In particular, we introduce Generative Strokes: freeform strokes that spatially modulate generation and prompt parameters, articulated along multiple latent semantic or stylistic dimensions. These techniques afford rapid traversal of design spaces via convergence, divergence, re-composition, blending, and remixing of concepts. We detail the system architecture, design rationale, proximity-dependent intent tags for localized control, and methods for spatial prompting and varying output along spatial gradients. Temporal replay and visualization of provenance make ideation trajectories actionable, turning the design process itself into an artifact that supports reflection-in-action and revisitation of design decisions. We report insights from a preliminary study of how users construct, steer, and revisit ideas using spatial prompts, and discuss tradeoffs in modulating spatially-dependent content generation.
2025 · Nicolai Marquardt et al. · Generative AI (Text, Image, Music, Video) · Creative Collaboration & Feedback Systems · UIST
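The paper's implementation is not reproduced here; as a loose illustration of the general idea of proximity-dependent spatial prompting, the Python sketch below weights canvas elements more heavily the closer they sit to a generation region. The class and function names, the linear falloff, and the "(term:weight)" prompt syntax are assumptions for the example, not ImaginationVellum's actual method.

```python
from dataclasses import dataclass
from math import hypot

@dataclass
class CanvasElement:
    text: str   # textual description of the element (caption, label, recognized ink, ...)
    x: float    # canvas position
    y: float

def proximity_weighted_prompt(elements, gx, gy, radius=400.0):
    """Assemble a weighted prompt for a generation region centered at (gx, gy).

    Elements closer to the region get higher weights; elements beyond
    `radius` are ignored. The weighting scheme is purely illustrative.
    """
    weighted = []
    for el in elements:
        d = hypot(el.x - gx, el.y - gy)
        if d < radius:
            weight = round(1.0 - d / radius, 2)   # linear falloff with distance
            weighted.append((el.text, weight))
    # Emit a "(term:weight)" prompt string of the kind some image-generation
    # front ends accept; the exact syntax is an assumption.
    return ", ".join(f"({text}:{w})" for text, w in sorted(weighted, key=lambda t: -t[1]))

# Example: elements near the generation point dominate the assembled prompt.
canvas = [CanvasElement("watercolor lighthouse", 120, 80),
          CanvasElement("stormy sea", 200, 150),
          CanvasElement("art deco poster style", 900, 700)]
print(proximity_weighted_prompt(canvas, gx=150, gy=100))
```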
Intent Tagging: Exploring Micro-Prompting Interactions for Supporting Granular Human-GenAI Co-Creation Workflows
Despite Generative AI (GenAI) systems' potential for enhancing content creation, users often struggle to effectively integrate GenAI into their creative workflows. Core challenges include misalignment of AI-generated content with user intentions (intent elicitation and alignment), user uncertainty around how to best communicate their intents to the AI system (prompt formulation), and insufficient flexibility of AI systems to support diverse creative workflows (workflow flexibility). Motivated by these challenges, we created IntentTagger: a system for slide creation based on the notion of Intent Tags (small, atomic conceptual units that encapsulate user intent) for exploring granular and non-linear micro-prompting interactions for Human-GenAI co-creation workflows. Our user study with 12 participants provides insights into the value of flexibly expressing intent across varying levels of ambiguity, meta-intent elicitation, and the benefits and challenges of intent tag-driven workflows. We conclude by discussing the broader implications of our findings and design considerations for GenAI-supported content creation workflows.
2025 · Frederic Gmeiner et al. · Carnegie Mellon University, Human-Computer Interaction Institute; Microsoft Research · Generative AI (Text, Image, Music, Video) · AI-Assisted Creative Writing · Creative Collaboration & Feedback Systems · CHI
AI-Instruments: Embodying Prompts as Instruments to Abstract & Reflect Graphical Interface Commands as General-Purpose Tools
Chat-based prompts respond with verbose, linear-sequential text, making it difficult to explore and refine ambiguous intents, back up and reinterpret, or shift directions in creative AI-assisted design work. AI-Instruments instead embody "prompts" as interface objects via three key principles: (1) Reification of user intent as reusable direct-manipulation instruments; (2) Reflection of multiple interpretations of ambiguous user intents (Reflection-in-intent) as well as the range of AI-model responses (Reflection-in-response) to inform design "moves" towards a desired result; and (3) Grounding to instantiate an instrument from an example, a result, or an extrapolation directly from another instrument. Further, AI-Instruments leverage LLMs to suggest, vary, and refine new instruments, enabling a system that goes beyond hard-coded functionality by generating its own instrumental controls from content. We demonstrate four technology probes, applied to image generation, and qualitative insights from twelve participants, showing how AI-Instruments address challenges of intent formulation, steering via direct manipulation, and non-linear iterative workflows to reflect and resolve ambiguous intents.
2025 · Nathalie Henry Riche et al. · Microsoft Research · Generative AI (Text, Image, Music, Video) · Human-LLM Collaboration · CHI
Big or Small, It’s All in Your Head: Visuo-Haptic Illusion of Size-Change Using Finger-Repositioning
Haptic perception of physical size increases realism and immersion in Virtual Reality (VR). Prior work rendered sizes by exerting pressure on the user’s fingertips or employing tangible, shape-changing devices. These interfaces are constrained by the physical shapes they can assume, making it challenging to simulate objects growing larger or smaller than the perceived size of the interface. Motivated by literature on pseudo-haptics describing the strong influence of visuals over haptic perception, this work investigates modulating the perception of size beyond this range. We developed a fixed-size VR controller that leverages finger-repositioning to create a visuo-haptic illusion of dynamic size-change of handheld virtual objects. Through two user studies, we found that with an accompanying size-changing visual context, users can perceive virtual object sizes from 44.2% smaller to 160.4% larger than the perceived size of the device. Without the accompanying visuals, a constant size (141.4% of device size) was perceived.
2024 · Myung Jin Kim et al. · KAIST · Force Feedback & Pseudo-Haptic Weight · Shape-Changing Interfaces & Soft Robotic Materials · Full-Body Interaction & Embodied Input · CHI
Beyond Audio: Towards a Design Space of Headphones as a Site for Interaction and Sensing
Via Research through Design (RtD), we explore the potential of headphones as a general-purpose input device for both foreground motion-gestures as well as background sensing of user activity. As a familiar wearable device, headphones offer a compelling site for head-situated interaction and sensing. Using emerging sensing modalities such as inertial motion, capacitive touch sensing, and depth cameras, our implemented prototypes explore sensing and interaction techniques that offer a range of compelling capabilities. User scenarios include context-aware privacy, gestural audio-visual control, and co-opting natural body language as context to drive animated avatars for "camera-off" scenarios in remote work, or co-opting (oft-subconscious) head movements such as dodging attacks in video games to enhance the gameplay experience. Drawing from literature and other frameworks, we situate our prototypes and related techniques in a design space across the dual dimensions of (1) type of input (touch, mid-air, or head orientation) and (2) the context of user action (application, body, or environment). In particular, interactions that combine multiple inputs and contexts at the same time offer a rich design space of headphone-situated wearable interactions and sensing techniques.
2023 · Payod Panda et al. · Haptic Wearables · Full-Body Interaction & Embodied Input · Context-Aware Computing · DIS
Escapement: A Tool for Interactive Prototyping with Video via Sensor-Mediated Abstraction of Time
We present Escapement, a video prototyping tool that introduces a powerful new concept for prototyping screen-based interfaces by flexibly mapping sensor values to dynamic playback control of videos. This recasts the time dimension of video mock-ups as sensor-mediated interaction. This abstraction of time as interaction, which we dub video-escapement prototyping, empowers designers to rapidly explore and viscerally experience direct-touch or sensor-mediated interactions across one or more device displays. Our system affords cross-device and bidirectional remote (tele-present) experiences via cloud-based state sharing across multiple devices. This makes Escapement especially potent for exploring multi-device, dual-screen, or remote-work interactions for screen-based applications. We introduce the core concept of sensor-mediated abstraction of time for quickly generating video-based interactive prototypes of screen-based applications, share observations from long-term usage of video-escapement techniques with experienced interaction designers, and articulate design choices for supporting a reflective, iterative, and open-ended creative design process.
2023 · Molly Jane Nicholas et al. · UC Berkeley · Teleoperation & Telepresence · Prototyping & User Testing · CHI
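The core mechanism named in the abstract, mapping a live sensor value onto a video's timeline so that scrubbing the sensor "plays" the interaction, is simple enough to sketch. The following is a minimal illustration under assumed sensor ranges and video length, not code from Escapement itself.

```python
def sensor_to_playback_time(sensor_value, sensor_min, sensor_max, video_duration_s):
    """Map a raw sensor reading linearly onto a video timeline (in seconds)."""
    # Normalize to 0..1, clamping readings outside the calibrated range.
    t = (sensor_value - sensor_min) / (sensor_max - sensor_min)
    t = max(0.0, min(1.0, t))
    return t * video_duration_s

# Example: a tilt sensor reporting 0..90 degrees drives a 6-second mock-up video.
for tilt in (0, 30, 60, 90):
    print(f"tilt={tilt:>2} deg  ->  seek to {sensor_to_playback_time(tilt, 0, 90, 6.0):.1f}s")
```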
AdHocProx: Sensing Mobile, Ad-Hoc Collaborative Device Formations using Dual Ultra-Wideband Radios
We present AdHocProx, a system that uses device-relative, inside-out sensing to augment co-located collaboration across multiple devices, without recourse to externally anchored beacons or even reliance on WiFi connectivity. AdHocProx achieves this via sensors including dual ultra-wideband (UWB) radios for sensing distance and angle to other devices in dynamic, ad-hoc arrangements, plus capacitive grip sensing to determine where the user's hands hold the device and to partially correct for the resulting UWB signal attenuation. All spatial sensing and communication takes place via the side-channel capability of the UWB radios, suitable for small-group collaboration across up to four devices (eight UWB radios). Together, these sensors detect proximity and natural, socially meaningful device movements to enable contextual interaction techniques. We find that AdHocProx obtains 95% accuracy recognizing various ad-hoc device arrangements in an offline evaluation, with participants particularly appreciative of interaction techniques that automatically leverage proximity-awareness and relative orientation amongst multiple devices.
2023 · Richard Li et al. · University of Washington · Context-Aware Computing · Ubiquitous Computing · CHI
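As a rough, hypothetical illustration of what a single UWB distance plus angle-of-arrival reading yields (this is not the AdHocProx codebase, and the angle convention and units are assumptions), the sketch below converts a range and angle into a device-relative 2D position of a peer device:

```python
from math import cos, sin, radians

def peer_position(distance_m, angle_deg):
    """Convert a UWB range and angle-of-arrival into (x, y) in meters,
    relative to the sensing device (0 degrees = straight ahead)."""
    theta = radians(angle_deg)
    return (distance_m * sin(theta),   # lateral offset (right is positive)
            distance_m * cos(theta))   # forward offset

# Example: a peer tablet 1.2 m away, 30 degrees to the right.
x, y = peer_position(1.2, 30)
print(f"peer at x={x:.2f} m, y={y:.2f} m")
```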
FlowAR: How Different Augmented Reality Visualizations of Online Fitness Videos Support Flow for At-Home Yoga Exercises
Online fitness video tutorials are an increasingly popular way to stay fit at home without a personal trainer. However, to keep the screen playing the video in view, users typically disrupt their balance and break the motion flow, two main pillars for the correct execution of yoga poses. While past research partially addressed this problem, these approaches supported only a limited view of the instructor and simple movements. To enable the fluid execution of complex full-body yoga exercises, we propose FlowAR, an augmented reality system for home workouts that shows training video tutorials as always-present virtual static and dynamic overlays around the user. We tested different overlay layouts in a study with 16 participants, using motion-capture equipment for baseline performance. We then iterated on the prototype and tested it in a furnished lab simulating a home setting with 12 users. Our results highlight the advantages of different visualizations and the system's general applicability.
2023 · Hye-Young Jo et al. · KAIST · AR Navigation & Context Awareness · Fitness Tracking & Physical Activity Monitoring · CHI
SpinOcchio: Understanding Haptic-Visual Congruency of Skin-Slip in VR with a Dynamic Grip Controller
This paper's goal is to understand the haptic-visual congruency perception of skin-slip on the fingertips given visual cues in Virtual Reality (VR). We developed SpinOcchio ('Spin' for the spinning mechanism used, 'Occhio' for the Italian word “eye”), a handheld haptic controller capable of rendering the thickness and slipping of a virtual object pinched between two fingers. This is achieved using a mechanism with spinning and pivoting disks that apply a tangential skin-slip movement to the fingertips. With SpinOcchio, we determined the baseline haptic discrimination threshold for skin-slip, and, using these results, we tested how haptic realism of motion and thickness is perceived with varying visual cues in VR. Surprisingly, the results show that in all cases, visual cues dominate over haptic perception. Based on these results, we suggest applications that leverage skin-slip and grip interaction, contributing further to realistic experiences in VR.
2022 · Myung Jin Kim et al. · KAIST · Haptic Wearables · Brain-Computer Interface (BCI) & Neurofeedback · CHI
Understanding Multi-Device Usage Patterns: Physical Device Configurations and Fragmented Workflows
To better ground technical (systems) investigation and interaction design of cross-device experiences, we contribute an in-depth survey of existing multi-device practices, including fragmented workflows across devices and the ways people physically organize and configure their workspaces to support such activity. Further, this survey documents a historically significant moment of transition to a new future of remote work, an existing trend dramatically accelerated by the abrupt switch to work-from-home (and having to contend with the demands of home-at-work) during the COVID-19 pandemic. We surveyed 97 participants and collected photographs of home setups and open-ended answers to 50 questions categorized into 5 themes. We characterize the wide range of multi-device physical configurations and identify five usage patterns: partitioning tasks, integrating multi-device usage, cloning tasks to other devices, expanding tasks and inputs to multiple devices, and migrating between devices. Our analysis also sheds light on the benefits and challenges people face when their workflow is fragmented across multiple devices. These insights have implications for the design of multi-device experiences that support people's fragmented workflows.
2022 · Ye Yuan et al. · Microsoft Research, University of Minnesota · Remote Work Tools & Experience · Distributed Team Collaboration · Notification & Interruption Management · CHI
AirConstellations: In-Air Device Formations for Cross-Device Interaction via Multiple Spatially-Aware Armatures
AirConstellations supports a unique semi-fixed style of cross-device interaction via multiple self-spatially-aware armatures to which users can easily attach (or detach) tablets and other devices. In particular, AirConstellations affords highly flexible and dynamic device formations where users can bring multiple devices together in-air, with 2-5 armatures poseable in 7DoF within the same workspace, to suit the demands of their current task, social situation, app scenario, or mobility needs. This affords an interaction metaphor where relative orientation, proximity, attaching (or detaching) devices, and continuous movement into and out of ad-hoc ensembles can drive context-sensitive interactions. Yet all devices remain self-stable in useful configurations even when released in mid-air. We explore flexible physical arrangement, feedforward of transition options, and layering of devices in-air across a variety of multi-device app scenarios. These include video conferencing with flexible arrangement of the person-space of multiple remote participants around a shared task-space, layered and tiled device formations with overview+detail and shared-to-personal transitions, and flexible composition of UI panels and tool palettes across devices for productivity applications. A preliminary interview study highlights user reactions to AirConstellations, such as for minimally disruptive device formations, easier physical transitions, and balancing "seeing and being seen" in remote work.
2021 · Nicolai Marquardt et al. · Head-Up Display (HUD) & Advanced Driver Assistance Systems (ADAS) · Knowledge Management & Team Awareness · Ubiquitous Computing · UIST
GamesBond: Bimanual Haptic Illusion of Physically Connected Objects for Immersive VR Using Grip Deformation
Virtual Reality experiences, such as games and simulations, typically support the use of bimanual controllers to interact with virtual objects. To recreate the haptic sensation of holding objects of various shapes and behaviors with both hands, previous researchers have used mechanical linkages between the controllers that render adjustable stiffness. However, the linkage cannot quickly adapt to simulate dynamic objects, nor can it be removed to support free movement. This paper introduces GamesBond, a pair of 4-DoF controllers without a physical linkage but capable of creating the illusion of being connected as a single device, forming a virtual bond. The two controllers work together by dynamically displaying and physically rendering deformations of the hand grips, allowing users to perceive a single connected object between the hands, such as a jumping rope. With a user study and various applications, we show that GamesBond increases the realism, immersion, and enjoyment of bimanual interaction.
2021 · Neung Ryu et al. · KAIST · In-Vehicle Haptic, Audio & Multimodal Feedback · Shape-Changing Interfaces & Soft Robotic Materials · Full-Body Interaction & Embodied Input · CHI
SurfaceFleet: Exploring Distributed Interactions Unbounded from Device, Application, User, and Time
Knowledge work increasingly spans multiple computing surfaces. Yet in status-quo user experiences, content as well as tools, behaviors, and workflows are largely bound to the current device, running the current application, for the current user, and at the current moment in time. SurfaceFleet is a system and toolkit that uses resilient distributed programming techniques to explore cross-device interactions that are unbounded in these four dimensions of device, application, user, and time. As a reference implementation, we describe an interface built using SurfaceFleet that employs lightweight, semi-transparent UI elements known as Applets. Applets appear always on top of the operating system, application windows, and (conceptually) above the device itself, while all connections and synchronized data are virtualized and made resilient through the cloud. For example, a sharing Applet known as a Portfolio allows a user to drag and drop unbound Interaction Promises into a document. Such promises can then be fulfilled with content asynchronously, at a later time (or multiple times), from another device, and by the same or a different user.
2020 · Frederik Brudy et al. · Distributed Team Collaboration · Knowledge Worker Tools & Workflows · UIST
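The "Interaction Promise" notion, a placeholder dropped into a document now and fulfilled with content later, possibly from another device or by another user, can be pictured as a small data structure. The class and fields below are illustrative assumptions, not the SurfaceFleet toolkit's API:

```python
import uuid
from datetime import datetime, timezone

class InteractionPromise:
    """Placeholder for content to arrive later (hypothetical sketch)."""
    def __init__(self, created_by, description):
        self.id = uuid.uuid4()           # stable identity, independent of any one device
        self.created_by = created_by
        self.description = description   # e.g., "insert latest whiteboard photo"
        self.fulfillments = []           # a promise may be fulfilled multiple times

    def fulfill(self, content, device, user):
        """Attach content to the promise; may happen later, elsewhere, by anyone."""
        self.fulfillments.append({
            "content": content,
            "device": device,
            "user": user,
            "at": datetime.now(timezone.utc),
        })

# Example: a promise created on a laptop is fulfilled later from a phone.
p = InteractionPromise(created_by="alice", description="photo of the whiteboard")
p.fulfill(content="whiteboard.jpg", device="alice-phone", user="alice")
```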
Tilt-Responsive Techniques for Digital Drawing Boards
Drawing boards offer a self-stable work surface that is continuously adjustable. On digital displays, such as the Microsoft Surface Studio, these properties open up a class of techniques that sense and respond to tilt adjustments. Each display posture, whether angled high, low, or somewhere in between, affords some activities but not others. Because what is appropriate also depends on the application and task, we explore a range of app-specific transitions between reading vs. writing (annotation), public vs. personal, shared person-space vs. task-space, and other nuances of input and feedback, contingent on display angle. Continuous responses provide interactive transitions tailored to each use case. We show how a variety of knowledge-work scenarios can use sensed display adjustments to drive context-appropriate transitions, as well as technical software details of how to best realize these concepts. A preliminary remote user study suggests that techniques must balance the effort required to adjust tilt against the potential benefits of a sensed transition.
2020 · Hugo Romat et al. · Knowledge Worker Tools & Workflows · Notification & Interruption Management · UIST
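As a hedged illustration of the sensed-tilt-to-transition idea (thresholds and mode names are assumptions for the example, not values from the paper), a minimal sketch might map the sensed display angle to a coarse interaction mode:

```python
def mode_for_tilt(angle_deg):
    """Return a coarse interaction mode for a sensed display angle (0 = flat)."""
    if angle_deg < 15:
        return "writing"       # nearly flat: pen-centric annotation
    elif angle_deg < 50:
        return "reading"       # drafting-table range: personal reading and review
    else:
        return "presenting"    # near-vertical: public, shared person-space

# Example: three sensed angles and the modes they would trigger.
for angle in (5, 30, 75):
    print(angle, "->", mode_for_tilt(angle))
```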
Sensing Posture-Aware Pen+Touch Interaction on Tablets
Many status-quo interfaces for tablets with pen+touch input capabilities force users to reach for device-centric UI widgets at fixed locations, rather than sensing and adapting to user-centric posture. To address this problem, we propose sensing techniques that transition between various nuances of mobile and stationary use via postural awareness. These postural nuances include shifting hand grips, varying screen angle and orientation, planting the palm while writing or sketching, and detecting what direction the hands approach from. To achieve this, our system combines three sensing modalities: (1) raw capacitance touchscreen images, (2) inertial motion, and (3) electric field sensors around the screen bezel for grasp and hand-proximity detection. We show how these sensors enable posture-aware pen+touch techniques that adapt interaction and morph user interface elements to suit fine-grained contexts of body-, arm-, hand-, and grip-centric frames of reference.
2019 · Yang Zhang et al. · Microsoft Research & Carnegie Mellon University · Hand Gesture Recognition · Human Pose & Activity Recognition · CHI
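One way to picture a posture-aware adaptation of this kind, purely as a toy sketch with an assumed sensor interface rather than the paper's implementation, is to place a tool palette on the screen edge opposite the approaching hand as reported by bezel proximity sensing:

```python
def palette_side(bezel_proximity):
    """bezel_proximity: dict of edge -> normalized hand-proximity reading (0..1).
    Returns the screen edge where a palette could appear to avoid occlusion."""
    approaching = max(bezel_proximity, key=bezel_proximity.get)
    opposite = {"left": "right", "right": "left", "top": "bottom", "bottom": "top"}
    return opposite[approaching]

# Example: hand approaching from the right -> place the palette on the left.
print(palette_side({"left": 0.1, "right": 0.8, "top": 0.2, "bottom": 0.0}))
```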
DataToon: Drawing Dynamic Network Comics With Pen + Touch Interaction
Comics are an entertaining and familiar medium for presenting compelling stories about data. However, existing visualization authoring tools do not leverage this expressive medium. In this paper, we seek to incorporate elements of comics into the construction of data-driven stories about dynamic networks. We contribute DataToon, a flexible data-comic storyboarding tool that blends analysis and presentation with pen and touch interactions. A storyteller can use DataToon to rapidly generate visualization panels, annotate them, and position them within a canvas to produce a visually compelling narrative. In a user study, participants quickly learned to use DataToon for producing data comics.
2019 · Nam Wook Kim et al. · Microsoft Research & Harvard University · Interactive Data Visualization · Data Storytelling · Creative Coding & Computational Art · CHI