ImaginationVellum: Generative-AI Ideation Canvas with Spatial Prompts, Generative Strokes, and Ideation History

We introduce ImaginationVellum, a multi-modal spatial canvas for early-stage visual ideation and concept sketching with generative AI. The system supports a unique style of human-AI co-creation where the canvas is the prompt: ImaginationVellum employs the entire 2D canvas as an active prompt space, where the spatial arrangement, proximity, and composition of diverse content elements (inking, text, images, and intermediate results) steer generative visual outcomes. As a technical probe, ImaginationVellum contributes a set of spatially-grounded direct-manipulation tools for iterative visual ideation. In particular, we introduce Generative Strokes: freeform strokes that spatially modulate generation and prompt parameters, articulated along multiple latent semantic or stylistic dimensions. These techniques afford rapid traversal of design spaces via convergence, divergence, re-composition, blending, and remixing of concepts. We detail the system architecture, design rationale, proximity-dependent intent tags for localized control, and methods for spatial prompting and varying output along spatial gradients. Temporal replay and visualization of provenance make ideation trajectories actionable, turning the design process itself into an artifact that supports reflection-in-action and revisitation of design decisions. We report insights from a preliminary study of how users construct, steer, and revisit ideas using spatial prompts, and discuss tradeoffs in modulating spatially-dependent content generation.

2025 · Nicolai Marquardt et al. · Topics: Generative AI (Text, Image, Music, Video); Creative Collaboration & Feedback Systems · UIST
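To make the Generative Strokes idea above concrete, here is a minimal sketch, assuming a stroke carries a parameter gradient from its start to its end and influences nearby canvas content with distance falloff. The GenerativeStroke class and its param_at method are hypothetical illustrations of spatially modulated prompt parameters, not the paper's implementation.

```python
from dataclasses import dataclass
import math

@dataclass
class GenerativeStroke:
    points: list          # freeform polyline of (x, y) canvas points
    param_start: float    # parameter value (e.g., stylization) at the start
    param_end: float      # parameter value at the end

    def param_at(self, x: float, y: float, falloff: float = 200.0):
        """Interpolate the parameter at the nearest stroke point,
        fading influence with distance; None outside the falloff radius."""
        best_dist, best_t = None, 0.0
        n = len(self.points)
        for i, (px, py) in enumerate(self.points):
            d = math.hypot(x - px, y - py)
            if best_dist is None or d < best_dist:
                best_dist, best_t = d, i / max(n - 1, 1)
        if best_dist > falloff:
            return None
        value = (1 - best_t) * self.param_start + best_t * self.param_end
        return value * (1 - best_dist / falloff)

stroke = GenerativeStroke([(0, 0), (100, 0), (200, 0)], 0.1, 0.9)
print(stroke.param_at(10, 5))    # near the start: close to 0.1
print(stroke.param_at(195, 5))   # near the end: close to 0.9
```

Content generated near the start of the stroke would then use a weight close to param_start, with a smooth spatial gradient toward param_end.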
Intent Tagging: Exploring Micro-Prompting Interactions for Supporting Granular Human-GenAI Co-Creation Workflows

Despite Generative AI (GenAI) systems' potential for enhancing content creation, users often struggle to effectively integrate GenAI into their creative workflows. Core challenges include misalignment of AI-generated content with user intentions (intent elicitation and alignment), user uncertainty around how to best communicate their intents to the AI system (prompt formulation), and insufficient flexibility of AI systems to support diverse creative workflows (workflow flexibility). Motivated by these challenges, we created IntentTagger: a system for slide creation based on the notion of Intent Tags (small, atomic conceptual units that encapsulate user intent) for exploring granular and non-linear micro-prompting interactions for Human-GenAI co-creation workflows. Our user study with 12 participants provides insights into the value of flexibly expressing intent across varying levels of ambiguity, meta-intent elicitation, and the benefits and challenges of intent tag-driven workflows. We conclude by discussing the broader implications of our findings and design considerations for GenAI-supported content creation workflows.

2025 · Frederic Gmeiner et al. · Carnegie Mellon University, Human-Computer Interaction Institute; Microsoft Research · Topics: Generative AI (Text, Image, Music, Video); AI-Assisted Creative Writing; Creative Collaboration & Feedback Systems · CHI
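A minimal sketch of the intent-tag notion as a data structure: small atomic units of intent attached to a slide element and folded into a single generation prompt. The field names, the ambiguity score, and compile_prompt are assumptions for illustration, not IntentTagger's actual API.

```python
from dataclasses import dataclass

@dataclass
class IntentTag:
    text: str          # e.g., "playful tone", "audience: executives"
    ambiguity: float   # 0 = fully specified, 1 = left open to AI interpretation

def compile_prompt(element_description: str, tags: list) -> str:
    """Fold atomic intent tags into one generation prompt."""
    hard = [t.text for t in tags if t.ambiguity < 0.5]
    soft = [t.text for t in tags if t.ambiguity >= 0.5]
    prompt = f"Revise this slide element: {element_description}."
    if hard:
        prompt += " Must satisfy: " + "; ".join(hard) + "."
    if soft:
        prompt += " Where possible, lean toward: " + "; ".join(soft) + "."
    return prompt

print(compile_prompt("title block",
                     [IntentTag("audience: executives", 0.1),
                      IntentTag("playful tone", 0.8)]))
```

Because each tag is atomic, tags can be added, removed, or reweighted independently, which is what makes the micro-prompting interaction granular and non-linear.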
AI-Instruments: Embodying Prompts as Instruments to Abstract & Reflect Graphical Interface Commands as General-Purpose Tools

Chat-based prompts respond with verbose linear-sequential texts, making it difficult to explore and refine ambiguous intents, back up and reinterpret, or shift directions in creative AI-assisted design work. AI-Instruments instead embody "prompts" as interface objects via three key principles: (1) Reification of user-intent as reusable direct-manipulation instruments; (2) Reflection of multiple interpretations of ambiguous user-intents (Reflection-in-intent) as well as the range of AI-model responses (Reflection-in-response) to inform design "moves" towards a desired result; and (3) Grounding to instantiate an instrument from an example, result, or extrapolation directly from another instrument. Further, AI-Instruments leverage LLMs to suggest, vary, and refine new instruments, enabling a system that goes beyond hard-coded functionality by generating its own instrumental controls from content. We demonstrate four technology probes, applied to image generation, and qualitative insights from twelve participants, showing how AI-Instruments address challenges of intent formulation, steering via direct manipulation, and non-linear iterative workflows to reflect and resolve ambiguous intents.

2025 · Nathalie Henry Riche et al. · Microsoft Research · Topics: Generative AI (Text, Image, Music, Video); Human-LLM Collaboration · CHI
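As a rough illustration of the reification principle, the sketch below captures an ambiguous intent as a reusable instrument object that exposes multiple interpretations (reflection-in-intent) plus a direct-manipulation strength handle. The Instrument class and its fields are assumptions, not the probes' actual code.

```python
from dataclasses import dataclass, field

@dataclass
class Instrument:
    intent: str                                           # the ambiguous user intent
    interpretations: list = field(default_factory=list)   # reflection-in-intent
    strength: float = 0.5                                 # direct-manipulation handle

    def apply(self, base_prompt: str, chosen: int) -> str:
        """Apply one chosen interpretation of the intent to a prompt."""
        return (f"{base_prompt}, {self.interpretations[chosen]} "
                f"(intensity {self.strength:.1f})")

warmth = Instrument("make it feel warmer",
                    ["warmer color palette", "cozier subject matter",
                     "softer lighting"])
print(warmth.apply("a kitchen interior, photorealistic", chosen=0))
```

Once reified like this, the same instrument can be dragged onto other results or varied, rather than being retyped as a one-off chat message.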
Beyond Audio: Towards a Design Space of Headphones as a Site for Interaction and Sensing

Via Research through Design (RtD), we explore the potential of headphones as a general-purpose input device for both foreground motion gestures and background sensing of user activity. As a familiar wearable device, headphones offer a compelling site for head-situated interaction and sensing. Using emerging sensing modalities such as inertial motion, capacitive touch sensing, and depth cameras, our implemented prototypes explore sensing and interaction techniques that offer a range of compelling capabilities. User scenarios include context-aware privacy, gestural audio-visual control, and co-opting natural body language as context to drive animated avatars for "camera-off" scenarios in remote work, or co-opting (oft-subconscious) head movements, such as dodging attacks in video games, to enhance the gameplay experience. Drawing from literature and other frameworks, we situate our prototypes and related techniques in a design space across the dual dimensions of (1) type of input (touch, mid-air, or head orientation) and (2) context of user action (application, body, or environment). In particular, interactions that combine multiple inputs and contexts at the same time offer a rich design space of headphone-situated wearable interactions and sensing techniques.

2023 · Payod Panda et al. · Topics: Haptic Wearables; Full-Body Interaction & Embodied Input; Context-Aware Computing · DIS
Escapement: A Tool for Interactive Prototyping with Video via Sensor-Mediated Abstraction of Time

We present Escapement, a video prototyping tool that introduces a powerful new concept for prototyping screen-based interfaces by flexibly mapping sensor values to dynamic playback control of videos. This recasts the time dimension of video mock-ups as sensor-mediated interaction. This abstraction of time as interaction, which we dub video-escapement prototyping, empowers designers to rapidly explore and viscerally experience direct touch or sensor-mediated interactions across one or more device displays. Our system affords cross-device and bidirectional remote (tele-present) experiences via cloud-based state sharing across multiple devices. This makes Escapement especially potent for exploring multi-device, dual-screen, or remote-work interactions for screen-based applications. We introduce the core concept of sensor-mediated abstraction of time for quickly generating video-based interactive prototypes of screen-based applications, share the results of observations of long-term usage of video-escapement techniques with experienced interaction designers, and articulate design choices for supporting a reflective, iterative, and open-ended creative design process.

2023 · Molly Jane Nicholas et al. · UC Berkeley · Topics: Teleoperation & Telepresence; Prototyping & User Testing · CHI
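The core mapping is simple to sketch: a live sensor reading is normalized against a calibrated range and converted to a video frame index, so moving the sensor scrubs the mock-up. This is a minimal sketch under that assumption; sensor_to_frame and its calibration parameters are hypothetical names, not Escapement's API.

```python
def sensor_to_frame(value: float, lo: float, hi: float, n_frames: int) -> int:
    """Linearly map a raw sensor reading in [lo, hi] to a video frame index."""
    t = (value - lo) / (hi - lo)
    t = min(max(t, 0.0), 1.0)        # clamp readings outside the calibrated range
    return round(t * (n_frames - 1))

# e.g., a hinge-angle sensor reading 0..180 degrees driving a 240-frame clip:
print(sensor_to_frame(90.0, lo=0.0, hi=180.0, n_frames=240))  # -> 120
```

With the mapping in place, any sensor (touch position, tilt, proximity) can drive the same video, which is what lets designers viscerally try an interaction without building it.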
AdHocProx: Sensing Mobile, Ad-Hoc Collaborative Device Formations using Dual Ultra-Wideband Radios

We present AdHocProx, a system that uses device-relative, inside-out sensing to augment co-located collaboration across multiple devices, without recourse to externally-anchored beacons, or even reliance on WiFi connectivity. AdHocProx achieves this via sensors including dual ultra-wideband (UWB) radios for sensing distance and angle to other devices in dynamic, ad-hoc arrangements, plus capacitive grip to determine where the user's hands hold the device and to partially correct for the resulting UWB signal attenuation. All spatial sensing and communication takes place via the side-channel capability of the UWB radios, suitable for small-group collaboration across up to four devices (eight UWB radios). Together, these sensors detect proximity and natural, socially meaningful device movements to enable contextual interaction techniques. We find that AdHocProx can obtain 95% accuracy recognizing various ad-hoc device arrangements in an offline evaluation, with participants particularly appreciative of interaction techniques that automatically leverage proximity-awareness and relative orientation amongst multiple devices.

2023 · Richard Li et al. · University of Washington · Topics: Context-Aware Computing; Ubiquitous Computing · CHI
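A toy illustration of the sensing idea: given the kind of distance/angle pair a UWB radio link can report, classify a two-device formation. The thresholds and labels below are invented for illustration only; AdHocProx's actual recognizer handles up to four devices and applies capacitive-grip correction.

```python
def classify_formation(distance_m: float, bearing_deg: float) -> str:
    """Toy classifier. `bearing_deg` is the angle to the peer device:
    0 = straight ahead of this device, +/-90 = directly beside it."""
    beside = 60 < abs(bearing_deg) <= 120
    if distance_m < 0.15:
        return "stacked"
    if distance_m < 0.6 and beside:
        return "side-by-side"
    if distance_m < 2.0 and abs(bearing_deg) <= 60:
        return "facing across the table"
    return "independent use"

print(classify_formation(0.4, 90.0))   # -> "side-by-side"
```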
Understanding Multi-Device Usage Patterns: Physical Device Configurations and Fragmented Workflows

To better ground technical (systems) investigation and interaction design of cross-device experiences, we contribute an in-depth survey of existing multi-device practices, including fragmented workflows across devices and the way people physically organize and configure their workspaces to support such activity. Further, this survey documents a historically significant moment of transition to a new future of remote work, an existing trend dramatically accelerated by the abrupt switch to work-from-home (and having to contend with the demands of home-at-work) during the COVID-19 pandemic. We surveyed 97 participants and collected photographs of home setups and open-ended answers to 50 questions categorized in 5 themes. We characterize the wide range of multi-device physical configurations and identify five usage patterns: partitioning tasks, integrating multi-device usage, cloning tasks to other devices, expanding tasks and inputs to multiple devices, and migrating between devices. Our analysis also sheds light on the benefits and challenges people face when their workflow is fragmented across multiple devices. These insights have implications for the design of multi-device experiences that support people's fragmented workflows.

2022 · Ye Yuan et al. · Microsoft Research, University of Minnesota · Topics: Remote Work Tools & Experience; Distributed Team Collaboration; Notification & Interruption Management · CHI
Style Blink: Exploring Digital Inking of Structured Information via Handcrafted Styling as a First-Class Object

Structured note-taking forms such as sketchnoting, self-tracking journals, and bullet journaling go beyond immediate capture of information scraps. Instead, hand-drawn pride-in-craftsmanship increases perceived value for sharing and display. But hand-crafting lists, tables, and calendars is tedious and repetitive. To support these practices digitally, Style Blink ("Style-Blocks+Ink") explores handcrafted styling as a first-class object. Style-blocks encapsulate digital ink, enabling people to craft, modify, and reuse embellishments and decorations for larger structures, and apply custom layouts. For example, we provide interaction instruments that style ink for personal expression, inking palettes that afford creative experimentation, fillable pens that can be "loaded" with commands and actions to replace menu selections, and techniques to customize inked structures post-creation by modifying the underlying handcrafted style-blocks and to re-layout the overall structure to match users' preferred template. In effect, any ink stroke, notation, or sketch can be encapsulated as a style-object and re-purposed as a tool. Feedback from 13 users shows the potential of style adaptation and re-use in individual sketching practices.

2022 · Hugo Romat et al. · Microsoft · Topics: Graphic Design & Typography Tools; Creative Coding & Computational Art · CHI
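One way to picture a style-block as a first-class object: captured ink strokes become a reusable, parameterized embellishment that can be stamped elsewhere in a structure. The StyleBlock class and stamp method are assumptions sketching the concept, not the system's code.

```python
from dataclasses import dataclass

@dataclass
class StyleBlock:
    strokes: list          # captured ink: list of polylines in local coordinates
    color: str = "black"
    width: float = 1.5

    def stamp(self, x: float, y: float) -> list:
        """Instantiate the handcrafted embellishment at a new position."""
        return [[(px + x, py + y) for (px, py) in s] for s in self.strokes]

# A hand-drawn flourish, captured once and reused across a bullet-journal layout:
corner_flourish = StyleBlock([[(0, 0), (4, 2), (8, 0)]], color="teal")
print(corner_flourish.stamp(100, 40))
```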
AirConstellations: In-Air Device Formations for Cross-Device Interaction via Multiple Spatially-Aware Armatures

AirConstellations supports a unique semi-fixed style of cross-device interactions via multiple self-spatially-aware armatures to which users can easily attach (or detach) tablets and other devices. In particular, AirConstellations affords highly flexible and dynamic device formations where users can bring multiple devices together in-air (with 2-5 armatures poseable in 7DoF within the same workspace) to suit the demands of their current task, social situation, app scenario, or mobility needs. This affords an interaction metaphor where relative orientation, proximity, attaching (or detaching) devices, and continuous movement into and out of ad-hoc ensembles can drive context-sensitive interactions. Yet all devices remain self-stable in useful configurations even when released in mid-air. We explore flexible physical arrangement, feedforward of transition options, and layering of devices in-air across a variety of multi-device app scenarios. These include video conferencing with flexible arrangement of the person-space of multiple remote participants around a shared task-space, layered and tiled device formations with overview+detail and shared-to-personal transitions, and flexible composition of UI panels and tool palettes across devices for productivity applications. A preliminary interview study highlights user reactions to AirConstellations, such as for minimally disruptive device formations, easier physical transitions, and balancing "seeing and being seen" in remote work.

2021 · Nicolai Marquardt et al. · Topics: Head-Up Display (HUD) & Advanced Driver Assistance Systems (ADAS); Knowledge Management & Team Awareness; Ubiquitous Computing · UIST
Sketchnote Components, Design Space Dimensions, and Strategies for Effective Visual Note Taking

Sketchnoting is a form of visual note taking where people listen to, synthesize, and visualize ideas from a talk or other event using a combination of pictures, diagrams, and text. Little is known about the design space of this kind of visual note taking. With an eye towards informing the implementation of digital equivalents of sketchnoting, inking, and note taking, we introduce a classification of sketchnote styles and techniques, grounded in a qualitative analysis of 103 sketchnotes and situated in context with six semi-structured follow-up interviews. Our findings distill core sketchnote components (content, layout, structuring elements, and visual styling) and dimensions of the sketchnote design space, classifying levels of conciseness, illustration, structure, personification, cohesion, and craftsmanship. We unpack strategies to address particular note-taking challenges, for example dealing with the constraints of live drawing, and discuss relevance for future digital inking tools, such as recomposition, styling, and design suggestions.

2021 · Rebecca Zheng et al. · University College London, Mumbli · Topics: Interactive Data Visualization; Data Storytelling; User Research Methods (Interviews, Surveys, Observation) · CHI
SurfaceFleet: Exploring Distributed Interactions Unbounded from Device, Application, User, and Time

Knowledge work increasingly spans multiple computing surfaces. Yet in status quo user experiences, content as well as tools, behaviors, and workflows are largely bound to the current device: running the current application, for the current user, and at the current moment in time. SurfaceFleet is a system and toolkit that uses resilient distributed programming techniques to explore cross-device interactions that are unbounded in these four dimensions of device, application, user, and time. As a reference implementation, we describe an interface built using SurfaceFleet that employs lightweight, semi-transparent UI elements known as Applets. Applets appear always-on-top of the operating system, application windows, and (conceptually) above the device itself. But all connections and synchronized data are virtualized and made resilient through the cloud. For example, a sharing Applet known as a Portfolio allows a user to drag and drop unbound Interaction Promises into a document. Such promises can then be fulfilled with content asynchronously, at a later time (or multiple times), from another device, and by the same or a different user.

2020 · Frederik Brudy et al. · Topics: Distributed Team Collaboration; Knowledge Worker Tools & Workflows · UIST
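A minimal sketch of an Interaction Promise, using an asyncio future as a stand-in for SurfaceFleet's cloud-synchronized state: a placeholder is dropped into a document now and fulfilled with content later, possibly from another device or user. Class and method names are illustrative assumptions.

```python
import asyncio

class InteractionPromise:
    """A placeholder that can be fulfilled later, from elsewhere."""
    def __init__(self, label: str):
        self.label = label
        self._future = asyncio.get_running_loop().create_future()

    def fulfill(self, content: str) -> None:   # later: another device or user
        self._future.set_result(content)

    async def resolve(self) -> str:            # the document awaits the content
        return await self._future

async def demo():
    promise = InteractionPromise("figure for section 2")
    # Simulate fulfillment 0.1 s later, e.g., arriving from another device:
    asyncio.get_running_loop().call_later(0.1, promise.fulfill, "chart.png")
    print(await promise.resolve())

asyncio.run(demo())
```

The key design move is that the document does not block on the content or its source: the promise decouples the act of sharing from the device, user, and moment of fulfillment.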
Tilt-Responsive Techniques for Digital Drawing Boards

Drawing boards offer a self-stable work surface that is continuously adjustable. On digital displays, such as the Microsoft Surface Studio, these properties open up a class of techniques that sense and respond to tilt adjustments. Each display posture, whether angled high, low, or somewhere in between, affords some activities but not others. Because what is appropriate also depends on the application and task, we explore a range of app-specific transitions between reading vs. writing (annotation), public vs. personal, shared person-space vs. task-space, and other nuances of input and feedback, contingent on display angle. Continuous responses provide interactive transitions tailored to each use-case. We show how a variety of knowledge work scenarios can use sensed display adjustments to drive context-appropriate transitions, as well as technical software details of how to best realize these concepts. A preliminary remote user study suggests that techniques must balance the effort required to adjust tilt against the potential benefits of a sensed transition.

2020 · Hugo Romat et al. · Topics: Knowledge Worker Tools & Workflows; Notification & Interruption Management · UIST
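A sketch of one software detail such techniques need: mapping the sensed display angle to posture-appropriate modes with hysteresis, so the interface does not flicker when the board rests near a boundary. The angle bands and mode names below are illustrative assumptions, not the paper's values.

```python
def next_mode(angle_deg: float, current: str, margin: float = 5.0) -> str:
    """Return the display mode for a sensed tilt angle, with hysteresis."""
    bands = [("drafting", 0.0, 30.0),    # near-horizontal: pen-centric work
             ("hybrid",   30.0, 70.0),   # mixed reading/annotation
             ("display",  70.0, 95.0)]   # upright: viewing/presenting
    for mode, lo, hi in bands:           # stay put within the widened band
        if mode == current and lo - margin <= angle_deg <= hi + margin:
            return current
    for mode, lo, hi in bands:           # otherwise switch to the matching band
        if lo <= angle_deg <= hi:
            return mode
    return current

print(next_mode(32.0, current="drafting"))  # -> "drafting" (hysteresis holds)
print(next_mode(40.0, current="drafting"))  # -> "hybrid"
```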
InChorus: Designing Consistent Multimodal Interactions for Data Visualization on Tablet Devices

While tablet devices are a promising platform for data visualization, supporting consistent interactions across different types of visualizations on tablets remains an open challenge. In this paper, we present multimodal interactions that function consistently across different visualizations, supporting common operations during visual data analysis. By considering standard interface elements (e.g., axes, marks) and grounding our design in a set of core concepts including operations, parameters, targets, and instruments, we systematically develop interactions applicable to different visualization types. To exemplify how the proposed interactions collectively facilitate data exploration, we employ them in a tablet-based system, InChorus, which supports pen, touch, and speech input. Based on a study with 12 participants performing replication and fact-checking tasks with InChorus, we discuss how participants adapted to using multimodal input and highlight considerations for future multimodal visualization systems.

2020 · Arjun Srinivasan et al. · Microsoft Research & Georgia Institute of Technology · Topics: Voice User Interface (VUI) Design; Interactive Data Visualization; Notification & Interruption Management · CHI
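The grounding concepts lend themselves to a tiny command model: every multimodal action, whether spoken, touched, or penned, resolves to the same (operation, parameters, target, instrument) tuple. The VisCommand structure below is an illustrative sketch of that idea, not InChorus's implementation.

```python
from dataclasses import dataclass

@dataclass
class VisCommand:
    operation: str    # e.g., "filter", "sort", "highlight"
    parameters: dict  # e.g., {"field": "price", "order": "desc"}
    target: str       # interface element acted on: "axis", "mark", "legend"
    instrument: str   # input modality that issued it: "pen", "touch", "speech"

# Saying "sort by price descending" and pen-dragging on the axis can map to
# the same command, which is what keeps interactions consistent across charts:
cmd = VisCommand("sort", {"field": "price", "order": "desc"}, "axis", "speech")
print(cmd)
```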
Dear Pictograph: Investigating the Role of Personalization and Immersion for Consuming and Enjoying Visualizations

Much of the visualization literature focuses on assessment of visual representations with regard to their effectiveness for understanding data. In the present work, we instead focus on making data visualization experiences more enjoyable, to foster deeper engagement with data. We investigate two strategies to make visualization experiences more enjoyable and engaging: personalization and immersion. We selected pictographs (composed of multiple data glyphs) as this representation affords creative freedom, allowing people to craft symbolic or whimsical shapes of personal significance to represent data. We present the results of a qualitative study with 12 participants crafting pictographs using a large pen-enabled device while immersed in a VR environment. Our results indicate that personalization and immersion both have a positive impact on making visualizations more enjoyable experiences.

2020 · Hugo Romat et al. · Université Paris-Saclay, CNRS, Inria, LRI & Microsoft Research · Topics: Immersion & Presence Research; Data Storytelling; Visualization Perception & Cognition · CHI
SpaceInk: Making Space for In-Context Annotations

When editing or reviewing a document, people directly overlay ink marks on content. For instance, they underline words or circle elements in a figure. These overlay marks often accompany in-context annotations in the form of handwritten footnotes and marginalia. People tend to put annotations close to the content that elicited them, but must contend with the often-limited whitespace. We introduce SpaceInk, a design space of pen+touch techniques that make room for in-context annotations by dynamically reflowing documents. We identify representative techniques in this design space, spanning both new and existing ones. We evaluate them in a user study, with results that inform the design of a prototype system. Our system lets users concentrate on capturing fleeting thoughts, streamlining the overall annotation process by enabling the fluid interleaving of space-making gestures with freeform ink.

2019 · Hugo Romat et al. · Topics: Knowledge Worker Tools & Workflows; Prototyping & User Testing · UIST
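A rough sketch of the underlying space-making operation: reflow document lines to open a gap next to the content an annotation refers to. The real SpaceInk techniques are gestural and animated; this only shows the layout adjustment beneath them, with an invented make_space helper.

```python
def make_space(line_tops: list, anchor: int, gap: float) -> list:
    """Shift every line below `anchor` down by `gap` pixels,
    opening whitespace for an in-context annotation."""
    return [top + gap if i > anchor else top
            for i, top in enumerate(line_tops)]

tops = [0.0, 20.0, 40.0, 60.0]                # y-positions of document lines
print(make_space(tops, anchor=1, gap=35.0))   # -> [0.0, 20.0, 75.0, 95.0]
```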
Sensing Posture-Aware Pen+Touch Interaction on Tablets

Many status-quo interfaces for tablets with pen+touch input capabilities force users to reach for device-centric UI widgets at fixed locations, rather than sensing and adapting to the user-centric posture. To address this problem, we propose sensing techniques that transition between various nuances of mobile and stationary use via postural awareness. These postural nuances include shifting hand grips, varying screen angle and orientation, planting the palm while writing or sketching, and detecting what direction the hands approach from. To achieve this, our system combines three sensing modalities: (1) raw capacitance touchscreen images, (2) inertial motion, and (3) electric field sensors around the screen bezel for grasp and hand proximity detection. We show how these sensors enable posture-aware pen+touch techniques that adapt interaction and morph user interface elements to suit fine-grained contexts of body-, arm-, hand-, and grip-centric frames of reference.

2019 · Yang Zhang et al. · Microsoft Research & Carnegie Mellon University · Topics: Hand Gesture Recognition; Human Pose & Activity Recognition · CHI
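A hedged sketch of the three-way sensor fusion described above: combine features from the capacitive image, inertial motion, and bezel e-field readings into a coarse posture label. The feature names and hand-written rules are illustrative stand-ins for the paper's recognizer.

```python
def infer_posture(palm_area_px: int, tilt_deg: float,
                  grip_side: str, hand_near_bezel: bool) -> str:
    """grip_side: 'left', 'right', or '' when no grip is sensed.
    palm_area_px comes from the raw capacitance image; tilt_deg from the IMU;
    hand_near_bezel from the electric field sensors around the screen bezel."""
    if palm_area_px > 4000 and tilt_deg < 20:
        return "planted-palm writing (device flat on desk)"
    if grip_side and tilt_deg > 45:
        return f"handheld reading, gripped on the {grip_side}"
    if hand_near_bezel:
        return "hand approaching: morph UI toward that edge"
    return "stationary viewing"

print(infer_posture(5200, 12.0, "", False))  # -> planted-palm writing (...)
```

A posture label like this is what lets the interface move widgets toward the gripping hand or harden palm rejection while writing, rather than keeping controls at fixed locations.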
HoloDoc: Enabling Mixed Reality Workspaces that Harness Physical and Digital Content

Prior research identified that physical paper documents have many positive attributes, for example natural tangibility and inherent physical flexibility. When documents are presented on digital devices, however, they can provide unique functionality to users, such as the ability to search, view dynamic multimedia content, and make use of indexing. This work explores the fusion of physical and digital paper documents. It first presents the results of a study that probed how users perform document-intensive analytical tasks when both physical and digital versions of documents were available. The study findings then informed the design of HoloDoc, a mixed reality system that augments physical artifacts with rich interaction and dynamic virtual content. Finally, we present the interaction techniques that HoloDoc affords, and the results of a second study that assessed HoloDoc's utility when working with digital and physical copies of academic articles.

2019 · Zhen Li et al. · University of Toronto · Topics: Mixed Reality Workspaces; Interactive Data Visualization · CHI
ActiveInk: (Th)Inking with Data

During sensemaking, people annotate insights: underlining sentences in a document or circling regions on a map. They jot down their hypotheses: drawing correlation lines on scatterplots or creating personal legends to track patterns. We present ActiveInk, a system enabling people to seamlessly transition between exploring data and externalizing their thoughts using pen and touch. ActiveInk enables the natural use of pen for active reading behaviors, while supporting analytic actions by activating any of these ink strokes. Through a qualitative study with eight participants, we contribute observations of active reading behaviors during data exploration and design principles to support sensemaking.

2019 · Hugo Romat et al. · Microsoft Research · Topics: Interactive Data Visualization; Data Storytelling · CHI
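One way to picture "activating" an ink stroke: a region circled on a scatterplot becomes a data selection by testing which points fall inside the closed stroke, here via a standard ray-casting point-in-polygon test. This is an illustrative sketch, not ActiveInk's implementation.

```python
def point_in_stroke(pt: tuple, stroke: list) -> bool:
    """Ray casting: is `pt` inside the polygon formed by closing `stroke`?"""
    x, y = pt
    inside = False
    n = len(stroke)
    for i in range(n):
        x1, y1 = stroke[i]
        x2, y2 = stroke[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray from pt
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

circled = [(0, 0), (10, 0), (10, 10), (0, 10)]   # a "circled" scatterplot region
points = [(5, 5), (12, 3)]
print([p for p in points if point_in_stroke(p, circled)])  # -> [(5, 5)]
```

The same stroke thus serves double duty: first as a passive reading mark, then, once activated, as a filter or selection over the underlying data.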
DataToon: Drawing Dynamic Network Comics With Pen + Touch Interaction

Comics are an entertaining and familiar medium for presenting compelling stories about data. However, existing visualization authoring tools do not leverage this expressive medium. In this paper, we seek to incorporate elements of comics into the construction of data-driven stories about dynamic networks. We contribute DataToon, a flexible data comic storyboarding tool that blends analysis and presentation with pen and touch interactions. A storyteller can use DataToon to rapidly generate visualization panels, annotate them, and position them within a canvas to produce a visually compelling narrative. In a user study, participants quickly learned to use DataToon for producing data comics.

2019 · Nam Wook Kim et al. · Microsoft Research & Harvard University · Topics: Interactive Data Visualization; Data Storytelling; Creative Coding & Computational Art · CHI