Human Delegation Behavior in Human-AI Collaboration: The Effect of Contextual Information
The integration of artificial intelligence (AI) into human decision-making processes at the workplace presents both opportunities and challenges. One promising approach to leverage existing complementary capabilities is allowing humans to delegate individual instances of decision tasks to AI. However, enabling humans to delegate instances effectively requires them to assess several factors. One key factor is the analysis of both their own capabilities and those of the AI in the context of the given task. In this work, we conduct a behavioral study to explore the effects of providing contextual information to support this delegation decision. Specifically, we investigate how contextual information about the AI and the task domain influences humans' delegation decisions to an AI and its impact on human-AI team performance. Our findings reveal that access to contextual information significantly improves human-AI team performance in delegation settings. Finally, we show that the delegation behavior changes with the different types of contextual information. Overall, this research advances the understanding of computer-supported, collaborative work and provides actionable insights for designing more effective collaborative systems.
2025 | Philipp Spitzer et al. | Working with AI | CSCW
Authoring LLM-Based Assistance for Real-World Contexts and Tasks
Advances in AI hold the possibility of assisting users with highly varied and individual needs, but the breadth of assistance that these systems could provide creates a challenge for how users specify their goals to the system. To support the authoring of AI assistance for real-world tasks, we propose the concept of Contextually-Driven Prompts (CDPs) that define how an AI assistant should respond to real-world context. We implemented a prototype system for authoring and executing CDPs, which provides suggestions to assist users with finding the right level of assistance for their goal. We also conducted a user study (N=10) to investigate how participants express and refine their goals for real-world tasks. Results revealed a number of strategies for initiating and refining CDPs with suggestions, and implications for the design of future authoring interfaces.
2025 | Hai Dang et al. | Human-LLM Collaboration; Context-Aware Computing | IUI
Exploring Mobile Touch Interaction with Large Language Models
Interacting with Large Language Models (LLMs) for text editing on mobile devices currently requires users to break out of their writing environment and switch to a conversational AI interface. In this paper, we propose to control the LLM via touch gestures performed directly on the text. We first chart a design space that covers fundamental touch input and text transformations. In this space, we then concretely explore two control mappings: spread-to-generate and pinch-to-shorten, with visual feedback loops. We evaluate this concept in a user study (N=14) that compares three feedback designs: no visualisation, text length indicator, and length + word indicator. The results demonstrate that touch-based control of LLMs is both feasible and user-friendly, with the length + word indicator proving most effective for managing text generation. This work lays the foundation for further research into gesture-based interaction with LLMs on touch devices.
2025 | Tim Zindulka et al. | University of Bayreuth | Hand Gesture Recognition; Human-LLM Collaboration | CHI
You Shall Not Pass: Warning Drivers of Unsafe Overtaking Maneuvers on Country Roads by Predicting Safe Sight Distance
Overtaking on country roads with possible opposing traffic is a dangerous maneuver and many proposed assistant systems assume car-to-car communication and sensors currently unavailable in cars. To overcome this limitation, we develop an assistant that uses simple in-car sensors to predict the required sight distance for safe overtaking. Our models predict this from vehicle speeds, accelerations, and 3D map data. In a user study with a Virtual Reality driving simulator (N=25), we compare two UI variants (monitoring-focused vs scheduling-focused). The results reveal that both UIs enable more patient driving and thus increase overall driving safety. While the monitoring-focused UI achieves higher System Usability Score and distracts drivers less, the preferred UI depends on personal preference. Driving data shows predictions were off at times. We investigate and discuss this in a comparison of our models to actual driving behavior and identify crucial model parameters and assumptions that significantly improve model predictions.
2025 | Adrian Bauske et al. | University of Bayreuth | Automated Driving Interface & Takeover Design; Head-Up Display (HUD) & Advanced Driver Assistance Systems (ADAS) | CHI
CorpusStudio: Surfacing Emergent Patterns In A Corpus Of Prior Work While Writing
Many communities, including the scientific community, develop implicit writing norms. Understanding them is crucial for effective communication with that community. Writers gradually develop an implicit understanding of norms by reading papers and receiving feedback on their writing. However, it is difficult to both externalize this knowledge and apply it to one's own writing. We propose two new writing support concepts that reify document and sentence-level patterns in a given text corpus: (1) an ordered distribution over section titles and (2) given the user's draft and cursor location, many retrieved contextually relevant sentences. Recurring words in the latter are algorithmically highlighted to help users see any emergent norms. Study results (N=16) show that participants revised the structure and content using these concepts, gaining confidence in aligning with or breaking norms after reviewing many examples. These results demonstrate the value of reifying distributions over other authors’ writing choices during the writing process.
2025 | Hai Dang et al. | University of Bayreuth, HCI+AI | AI-Assisted Creative Writing; Creative Collaboration & Feedback Systems | CHI
Content-Driven Local Response: Supporting Sentence-Level and Message-Level Mobile Email Replies With and Without AI
Mobile emailing demands efficiency in diverse situations, which motivates the use of AI. However, generated text does not always reflect how people want to respond. This challenges users with AI involvement tradeoffs not yet considered in email UIs. We address this with a new UI concept called Content-Driven Local Response (CDLR), inspired by microtasking. This allows users to insert responses into the email by selecting sentences, which additionally serves to guide AI suggestions. The concept supports combining AI for local suggestions and message-level improvements. Our user study (N=126) compared CDLR with manual typing and full reply generation. We found that CDLR supports flexible workflows with varying degrees of AI involvement, while retaining the benefits of reduced typing and errors. This work contributes a new approach to integrating AI capabilities: By redesigning the UI for workflows with and without AI, we can empower users to dynamically adjust AI involvement.
2025 | Tim Zindulka et al. | University of Bayreuth | Voice User Interface (VUI) Design; Human-LLM Collaboration | CHI
The Impact of Imperfect XAI on Human-AI Decision-Making
Explainability techniques are rapidly being developed to improve human-AI decision-making across various cooperative work settings. Consequently, previous research has evaluated how decision-makers collaborate with imperfect AI by investigating appropriate reliance and task performance with the aim of designing more human-centered computer-supported collaborative tools. Several human-centered explainable AI (XAI) techniques have been proposed in hopes of improving decision-makers' collaboration with AI; however, these techniques are grounded in findings from previous studies that primarily focus on the impact of incorrect AI advice. Few studies acknowledge the possibility for the explanations to be incorrect even if the AI advice is correct. Thus, it is crucial to understand how imperfect XAI affects human-AI decision-making. In this work, we contribute a robust, mixed-methods user study with 136 participants to evaluate how incorrect explanations influence humans' decision-making behavior in a bird species identification task taking into account their level of expertise and an explanation's level of assertiveness. Our findings reveal the influence of imperfect XAI and humans' level of expertise on their reliance on AI and human-AI team performance. We also discuss how explanations can deceive decision-makers during human-AI collaboration. Hence, we shed light on the impacts of imperfect XAI in the field of computer-supported cooperative work and provide guidelines for designers of human-AI collaboration systems.
2024 | Katelyn Morrison et al. | Session 3e: Trust and Understanding in Explainable AI | CSCW
SIM2VR: Towards Automated Biomechanical Testing in VR
Automated biomechanical testing has great potential for the development of VR applications, as initial insights into user behaviour can be gained in silico early in the design process. In particular, it allows prediction of user movements and ergonomic variables, such as fatigue, prior to conducting user studies. However, there is a fundamental disconnect between simulators hosting state-of-the-art biomechanical user models and simulators used to develop and run VR applications. Existing user simulators often struggle to capture the intricacies of real-world VR applications, reducing ecological validity of user predictions. In this paper, we introduce SIM2VR, a system that aligns user simulation with a given VR application by establishing a continuous closed loop between the two processes. This, for the first time, enables training simulated users directly in the same VR application that real users interact with. We demonstrate that SIM2VR can predict differences in user performance, ergonomics and strategies in a fast-paced, dynamic arcade game. In order to expand the scope of automated biomechanical testing beyond simple visuomotor tasks, advances in cognitive models and reward function design will be needed.
2024 | Florian Fischer et al. | Human Pose & Activity Recognition; VR Medical Training & Rehabilitation | UIST
The AI Ghostwriter Effect: When Users do not Perceive Ownership of AI-Generated Text but Self-Declare as Authors
Human-AI interaction in text production increases complexity in authorship. In two empirical studies (n1 = 30 & n2 = 96), we investigate authorship and ownership in human-AI collaboration for personalized language generation. We show an AI Ghostwriter Effect: Users do not consider themselves the owners and authors of AI-generated text but refrain from publicly declaring AI authorship. Personalization of AI-generated texts did not impact the AI Ghostwriter Effect, and higher levels of participants’ influence on texts increased their sense of ownership. Participants were more likely to attribute ownership to supposedly human ghostwriters than AI ghostwriters, resulting in a higher ownership-authorship discrepancy for human ghostwriters. Rationalizations for authorship in AI ghostwriters and human ghostwriters were similar. We discuss how our findings relate to psychological ownership and human-AI interaction to lay the foundations for adapting authorship frameworks and user interfaces in AI in text-generation tasks.
2024 | Fiona Draxler et al. | Generative AI (Text, Image, Music, Video); AI Ethics, Fairness & Accountability; AI-Assisted Creative Writing | DIS
Collage is the New Writing: Exploring the Fragmentation of Text and User Interfaces in AI Tools
This essay proposes and explores the concept of Collage for the design of AI writing tools, which we transfer from avant-garde literature with four facets: 1) fragmenting text in writing interfaces, 2) juxtaposing voices (content vs command), 3) integrating material from multiple sources (e.g. text suggestions), and 4) shifting from manual writing to editorial and compositional decision-making, such as selecting and arranging snippets. The essay then employs Collage as an analytical lens to analyse the user interface design of recent AI writing tools, and as a constructive lens to inspire new design directions. Finally, a critical perspective relates the concerns that writers historically expressed through literary collage to AI writing tools. In a broad view, this essay explores how literary concepts can help advance design theory around AI writing tools. It encourages creators of future writing tools to engage not only with new technological possibilities, but also with past writing innovations.
2024 | Daniel Buschek | Generative AI (Text, Image, Music, Video); AI-Assisted Creative Writing | DIS
Explanations, Fairness, and Appropriate Reliance in Human-AI Decision-Making
In this work, we study the effects of feature-based explanations on distributive fairness of AI-assisted decisions, specifically focusing on the task of predicting occupations from short textual bios. We also investigate how any effects are mediated by humans' fairness perceptions and their reliance on AI recommendations. Our findings show that explanations influence fairness perceptions, which, in turn, relate to humans' tendency to adhere to AI recommendations. However, we see that such explanations do not enable humans to discern correct and incorrect AI recommendations. Instead, we show that they may affect reliance irrespective of the correctness of AI recommendations. Depending on which features an explanation highlights, this can foster or hinder distributive fairness: when explanations highlight features that are task-irrelevant and evidently associated with the sensitive attribute, this prompts overrides that counter AI recommendations that align with gender stereotypes. Meanwhile, if explanations appear task-relevant, this induces reliance behavior that reinforces stereotype-aligned errors. These results imply that feature-based explanations are not a reliable mechanism to improve distributive fairness.
2024 | Jakob Schoeffer et al. | University of Texas at Austin | Explainable AI (XAI); AI-Assisted Decision-Making & Automation; AI Ethics, Fairness & Accountability | CHI
Writer-Defined AI Personas for On-Demand Feedback Generation
Compelling writing is tailored to its audience. This is challenging, as writers may struggle to empathize with readers, get feedback in time, or gain access to the target group. We propose a concept that generates on-demand feedback, based on writer-defined AI personas of any target audience. We explore this concept with a prototype (using GPT-3.5) in two user studies (N=5 and N=11): Writers appreciated the concept and strategically used personas for getting different perspectives. The feedback was seen as helpful and inspired revisions of text and personas, although it was often verbose and unspecific. We discuss the impact of on-demand feedback, the limited representativity of contemporary AI systems, and further ideas for defining AI personas. This work contributes to the vision of supporting writers with AI by expanding the socio-technical perspective in AI tool design: To empower creators, we also need to keep in mind their relationship to an audience.
2024 | Karim Benharrak et al. | University of Texas, Austin, University of Bayreuth | Generative AI (Text, Image, Music, Video); AI-Assisted Creative Writing | CHI
A Design Space for Intelligent and Interactive Writing Assistants
In our era of rapid technological advancement, the research landscape for writing assistants has become increasingly fragmented across various research communities. We seek to address this challenge by proposing a design space as a structured way to examine and explore the multidimensional space of intelligent and interactive writing assistants. Through community collaboration, we explore five aspects of writing assistants: task, user, technology, interaction, and ecosystem. Within each aspect, we define dimensions and codes by systematically reviewing 115 papers while leveraging the expertise of researchers in various disciplines. Our design space aims to offer researchers and designers a practical tool to navigate, comprehend, and compare the various possibilities of writing assistants, and aid in the design of new writing assistants.
2024 | Mina Lee et al. | Microsoft Research | Human-LLM Collaboration; AI-Assisted Creative Writing; Creative Collaboration & Feedback Systems | CHI
WorldSmith: A Multi-Modal Image Synthesis Tool for Fictional World Building
Crafting a rich and unique environment is crucial for fictional world-building, but can be difficult to achieve since illustrating a world from scratch requires time and significant skill. We investigate the use of recent multi-modal image generation systems to enable users to iteratively visualize and modify elements of their fictional world using a combination of text input, sketching, and region-based filling. WorldSmith enables novice world builders to quickly visualize a fictional world with layered edits and hierarchical compositions. Through a formative study (4 participants) and first-use study (13 participants) we demonstrate that WorldSmith offers more expressive interactions with prompt-based models. With this work, we explore how creatives can be empowered to leverage prompt-based generative AI as a tool in their creative process, beyond current "click-once" prompting UI paradigms.
2023 | Hai Duong Dang et al. | Generative AI (Text, Image, Music, Video); AI-Assisted Creative Writing; Graphic Design & Typography Tools | UIST
Typing Behavior is About More than Speed: Users' Strategies for Choosing Word Suggestions Despite Slower Typing Rates
Mobile word suggestions can slow down typing, yet are still widely used. To investigate the apparent benefits beyond speed, we analyzed typing behavior of 15,162 users of mobile devices. Controlling for natural typing speed (a confounding factor not considered by prior work), we statistically show that slower typists use suggestions more often but are slowed down by doing so. To better understand how these typists leverage suggestions -- if not to improve their speed -- we extract eight usage strategies, including completion, correction, and next-word prediction. We find that word characteristics, such as length or frequency, along with the strategy, are predictive of whether a user will select a suggestion. We show how to operationalize our findings by building and evaluating a predictive model of suggestion selection. Such a model could be used to augment existing suggestion algorithms to consider people's strategic use of word predictions beyond speed and keystroke savings.
2023 | Florian Lehmann et al. | Intelligent Voice Assistants (Alexa, Siri, etc.); Agent Personality & Anthropomorphism | MobileHCI
Point of no Undo: Irreversible Interactions as a Design Strategy
Despite irreversibility being omnipresent in the lifeworld, research on interactions making use of irreversibility in computing systems is still in the early stages. User freedom – provided by the undo functionality – is considered to be a pillar of "usable" computer systems, overcoming irreversibility. Within this paper, we set up a thought experiment, challenging the "undo feature" and instead take advantage of irreversibility in the interaction with physical computing systems (tangibles, robots, etc). First, we present three material speculations, each inherently utilizing irreversibility. Second, we elaborate on the concept of irreversible interactions by contextualizing our work with critical HCI discourses and deducing three design strategies. Finally, we discuss irreversibility as a design element for self-reflection, meaningful acting, and a sustainable relationship with technology. While previously individual aspects of irreversibility have been explored, we contribute a comprehensive discussion of irreversible interactions in HCI presenting artifacts, a conceptualization, design strategies, and application purposes.
2023 | Beat Rossmy et al. | LMU Munich | Privacy by Design & User Control; Design Fiction; Sustainable HCI | CHI
Co-Writing with Opinionated Language Models Affects Users' Views
If large language models like GPT-3 preferably produce a particular point of view, they may influence people's opinions on an unknown scale. This study investigates whether a language-model-powered writing assistant that generates some opinions more often than others impacts what users write -- and what they think. In an online experiment, we asked participants (N=1,506) to write a post discussing whether social media is good for society. Treatment group participants used a language-model-powered writing assistant configured to argue that social media is good or bad for society. Participants then completed a social media attitude survey, and independent judges (N=500) evaluated the opinions expressed in their writing. Using the opinionated language model affected the opinions expressed in participants' writing and shifted their opinions in the subsequent attitude survey. We discuss the wider implications of our results and argue that the opinions built into AI language technologies need to be monitored and engineered more carefully.
2023 | Maurice Jakesch et al. | Cornell University, Cornell Tech | Human-LLM Collaboration; AI Ethics, Fairness & Accountability; Algorithmic Transparency & Auditability | CHI
Choice Over Control: How Users Write with Large Language Models using Diegetic and Non-Diegetic Prompting
We propose a conceptual perspective on prompts for Large Language Models (LLMs) that distinguishes between (1) diegetic prompts (part of the narrative, e.g. “Once upon a time, I saw a fox...”), and (2) non-diegetic prompts (external, e.g. “Write about the adventures of the fox.”). With this lens, we study how 129 crowd workers on Prolific write short texts with different user interfaces (1 vs 3 suggestions, with/out non-diegetic prompts; implemented with GPT-3): When the interface offered multiple suggestions and provided an option for non-diegetic prompting, participants preferred choosing from multiple suggestions over controlling them via non-diegetic prompts. When participants provided non-diegetic prompts it was to ask for inspiration, topics or facts. Single suggestions in particular were guided both with diegetic and non-diegetic information. This work informs human-AI interaction with generative models by revealing that (1) writing non-diegetic prompts requires effort, (2) people combine diegetic and non-diegetic prompting, and (3) they use their draft (i.e. diegetic information) and suggestion timing to strategically guide LLMs.
2023 | Hai Dang et al. | University of Bayreuth | Human-LLM Collaboration; AI-Assisted Creative Writing | CHI
Breathing Life Into Biomechanical User Models
Forward biomechanical simulation in HCI holds great promise as a tool for evaluation, design, and engineering of user interfaces. Although reinforcement learning (RL) has been used to simulate biomechanics in interaction, prior work has relied on unrealistic assumptions about the control problem involved, which limits the plausibility of emerging policies. These assumptions include direct torque actuation as opposed to muscle-based control; direct, privileged access to the external environment, instead of imperfect sensory observations; and lack of interaction with physical input devices. In this paper, we present a new approach for learning muscle-actuated control policies based on perceptual feedback in interaction tasks with physical input devices. This allows modelling of more realistic interaction tasks with cognitively plausible visuomotor control. We show that our simulated user model successfully learns a variety of tasks representing different interaction methods, and that the model exhibits characteristic movement regularities observed in studies of pointing. We provide an open-source implementation which can be extended with further biomechanical models, perception models, and interactive environments.
2022 | Aleksi Ikkala et al. | Human Pose & Activity Recognition; Computational Methods in HCI | UIST
Beyond Text Generation: Supporting Writers with Continuous Automatic Text Summaries
We propose a text editor to help users plan, structure and reflect on their writing process. It provides continuously updated paragraph-wise summaries as margin annotations, using automatic text summarization. Summary levels range from full text, to selected (central) sentences, down to a collection of keywords. To understand how users interact with this system during writing, we conducted two user studies (N=4 and N=8) in which people wrote analytic essays about a given topic and article. As a key finding, the summaries gave users an external perspective on their writing and helped them to revise the content and scope of their drafted paragraphs. People further used the tool to quickly gain an overview of the text and developed strategies to integrate insights from the automated summaries. More broadly, this work explores and highlights the value of designing AI tools for writers, with Natural Language Processing (NLP) capabilities that go beyond direct text generation and correction.
2022 | Hai Dang et al. | Human-LLM Collaboration; AI-Assisted Creative Writing | UIST