Belief Updating and Delegation in Multi-Task Human–AI Interaction: Evidence from Controlled Simulations
Large language models (LLMs) increasingly support heterogeneous tasks within a single interface, requiring users to form, update, and act upon beliefs about one system across domains with different reliability profiles. Understanding how such beliefs transfer across tasks and shape delegation is critical for the design of multipurpose AI systems. We report a preregistered experiment (N = 240, 7,200 trials) in which participants interacted with a controlled AI simulation across grammar checking, travel planning, and visual question answering. Delegation was operationalized as a binary reliance decision (accepting the AI’s output versus acting independently), and belief dynamics were evaluated against Bayesian benchmarks. We find three main results. First, participants do not reset beliefs between tasks, instead carrying expectations from prior interactions. Second, within tasks, belief updating follows the Bayesian direction but is substantially conservative. Third, delegation is driven primarily by subjective beliefs about AI accuracy rather than self-confidence, though confidence independently reduces reliance when beliefs are held constant. Based on these results, we discuss implications for expectation calibration, reliance design, and the risks of belief spillovers in deployed LLM-based interfaces.
2026 · Shreyan Biswas et al. · Technical University of Delft · Human-LLM Collaboration; AI-Assisted Decision-Making & Automation; Explainable AI (XAI) · CHI

When Life Gives You AI, Will You Turn It Into A Market for Lemons? Understanding How Information Asymmetries About AI System Capabilities Affect Market Outcomes and Adoption
AI consumer markets are characterized by severe buyer-supplier market asymmetries. Complex AI systems can appear highly accurate while making costly errors or embedding hidden defects. While there have been regulatory efforts surrounding different forms of disclosure, large information gaps remain. This paper provides the first experimental evidence on the important role of information asymmetries and disclosure designs in shaping user adoption of AI systems. We systematically vary the density of low-quality AI systems and the depth of disclosure requirements in a simulated AI product market to gauge how people react to the risk of accidentally relying on a low-quality AI system. Then, we compare participants’ choices to a rational Bayesian model, analyzing the degree to which partial information disclosure can improve AI adoption. Our results underscore the deleterious effects of information asymmetries on AI adoption, but also highlight the potential of partial disclosure designs to improve the overall efficiency of human decision-making.
2026 · Alexander Erlei et al. · University of Goettingen · Explainable AI (XAI); AI Ethics, Fairness & Accountability; Algorithmic Transparency & Auditability · CHI

The Data-Dollars Tradeoff: Privacy Harms vs. Economic Risk in Personalized AI Adoption
Privacy concerns significantly impact AI adoption, yet little is known about how information environments shape user responses to data leak threats. We conducted a 2 x 3 between-subjects experiment (N = 610) examining how risk versus ambiguity about privacy leaks affects the adoption of AI personalization. Participants chose between standard and AI-personalized product baskets, with personalization requiring data sharing that could leak to pricing algorithms. Under risk (30% leak probability), we found no difference in AI adoption between privacy-threatening and neutral conditions (ca. 50% adoption). Under ambiguity (10-50% range), privacy threats significantly reduced adoption compared to neutral conditions. This effect holds for sensitive demographic data as well as anonymized preference data. Users systematically over-bid for privacy disclosure labels, suggesting strong demand for transparency institutions. Notably, privacy leak threats did not affect subsequent bargaining behavior with algorithms. Our findings indicate that ambiguity over data leaks, rather than only privacy preferences per se, drives avoidance behavior among users towards personalized AI.
2026 · Alexander Erlei et al. · University of Goettingen · Privacy Perception & Decision-Making; Explainable AI (XAI); AI-Assisted Decision-Making & Automation · CHI

Mind the Gap! Choice Independence in Using Multilingual LLMs for Persuasive Co-Writing Tasks in Different Languages
Recent advances in generative AI have precipitated a proliferation of novel writing assistants. These systems typically rely on multilingual large language models (LLMs), providing globalized workers the ability to revise or create diverse forms of content in different languages. However, there is substantial evidence indicating that the performance of multilingual LLMs varies between languages. Users who employ writing assistance for multiple languages are therefore susceptible to disparate output quality. Importantly, recent research has shown that people tend to generalize algorithmic errors across independent tasks, violating the behavioral axiom of choice independence. In this paper, we analyze whether user utilization of novel writing assistants in a charity advertisement writing task is affected by the AI's performance in a second language. Furthermore, we quantify the extent to which these patterns translate into the persuasiveness of generated charity advertisements, as well as the role of people's beliefs about LLM utilization for their donation choices. Our results provide evidence that writers who engage with an LLM-based writing assistant violate choice independence, as prior exposure to a Spanish LLM reduces subsequent utilization of an English LLM. While these patterns do not affect the aggregate persuasiveness of the generated advertisements, people's beliefs about the source of an advertisement (human versus AI) do. In particular, Spanish-speaking female participants who believed that they read an AI-generated advertisement strongly adjusted their donation behavior downwards. Furthermore, people are generally not able to adequately differentiate between human-generated and LLM-generated ads. Our work has important implications for the design, development, integration, and adoption of multilingual LLMs as assistive agents, particularly in writing tasks.
2025 · Shreyan Biswas et al. · Technical University of Delft · Multilingual & Cross-Cultural Voice Interaction; Generative AI (Text, Image, Music, Video); Human-LLM Collaboration · CHI

Understanding Choice Independence and Error Types in Human-AI Collaboration
The ability to make appropriate delegation decisions is an important prerequisite of effective human-AI collaboration. Recent work, however, has shown that people struggle to evaluate AI systems in the presence of forecasting errors, falling well short of relying on AI systems appropriately. We use a pre-registered crowdsourcing study (N = 611) to extend this literature with two underexplored but crucial features of human-AI decision-making: choice independence and error type. Subjects in our study repeatedly complete two prediction tasks and choose which predictions they want to delegate to an AI system. For one task, subjects receive a decision heuristic that allows them to make informed and relatively accurate predictions. The second task is substantially harder to solve, and subjects must come up with their own decision rule. We systematically vary the AI system's performance such that it either provides the best possible prediction for both tasks or only for one of the two. Our results demonstrate that people systematically violate choice independence by taking the AI's performance in an unrelated second task into account. Humans who delegate predictions to a superior AI in their own expertise domain significantly reduce appropriate reliance when the model makes systematic errors in a complementary expertise domain. In contrast, humans who delegate predictions to a superior AI in a complementary expertise domain significantly increase appropriate reliance when the model systematically errs in the human expertise domain. Furthermore, we show that humans differentiate between error types and that this effect is conditional on the considered expertise domain. This is the first empirical exploration of choice independence and error types in the context of human-AI collaboration. Our results have broad and important implications for the future design, deployment, and appropriate application of AI systems.
2024 · Alexander Erlei et al. · University of Goettingen · AI-Assisted Decision-Making & Automation; Algorithmic Transparency & Auditability · CHI

Children’s Word Learning from Socially Contingent Robots Under Active vs. Passive Learning Conditions
Language is learned through social interactions, in which gaze has a special role because it can be used to guide attention and refer to objects easily. Children, starting from very early ages, are also very good at utilizing gaze to map labels to referenced objects. To build language-teaching robots, we need to understand how these functions of gaze can be implemented most efficiently. To this aim, we allowed children to interact with a social robot to learn the labels of several objects in a naturalistic setting. In some trials, the child guided the gaze and chose the object to be learned while the robot followed; in the others, the roles were reversed, and the robot guided the gaze and decided on the object to be learned. We measured how much children actually followed the robot’s gaze and how many words they learned in these two conditions, referred to as active and passive learning conditions, respectively. The results indicate that although children followed the robot's gaze and learned words successfully, there were no meaningful differences in word learning between the two conditions. The rate of gaze following and the time spent looking at the robot did not influence word learning, either. The implications of these results for the use of robots in educational settings are further discussed.
2024 · Fatih Sivridag et al. · Collaborative Learning & Peer Teaching; Special Education Technology; Social Robot Interaction · HRI

For What It's Worth: Humans Overwrite Their Economic Self-Interest to Avoid Bargaining With AI Systems
As algorithms are increasingly augmenting and substituting human decision-making, understanding how the introduction of computational agents changes the fundamentals of human behavior becomes vital. This pertains not only to users, but also to those parties who face the consequences of an algorithmic decision. In a controlled experiment with 480 participants, we exploit an extended version of two-player ultimatum bargaining where responders choose to bargain with either another human, another human with an AI decision aid, or an autonomous AI-system acting on behalf of a passive human proposer. Our results show strong responder preferences against the algorithm, as most responders opt for a human opponent and demand higher compensation to reach a contract with autonomous agents. To map these preferences to economic expectations, we elicit incentivized subject beliefs about their opponent’s behavior. The majority of responders maximize their expected value when this is in line with approaching the human proposer. In contrast, responders predicting income maximization for the autonomous AI-system overwhelmingly override economic self-interest to avoid the algorithm.
2022 · Alexander Erlei et al. · University of Goettingen · AI-Assisted Decision-Making & Automation; AI Ethics, Fairness & Accountability · CHI

Sara, the Lecturer: Improving Learning in Online Education with a Scaffolding-Based Conversational Agent
Enrollment in online courses has sharply increased in higher education. Although online education can be scaled to large audiences, the lack of interaction between educators and learners is difficult to replace and remains a primary challenge in the field. Conversational agents may alleviate this problem by engaging in natural interaction and by scaffolding learners' understanding similarly to educators. However, whether this approach can also be used to enrich online video lectures has largely remained unknown. We developed Sara, a conversational agent that appears during an online video lecture. She provides scaffolds by voice and text when needed and includes a voice-based input mode. An evaluation with 182 learners in a 2 x 2 lab experiment demonstrated that Sara, compared to more traditional conversational agents, significantly improved learning in a programming task. This study highlights the importance of including scaffolding and voice-based conversational agents in online videos to improve meaningful learning.
2020 · Rainer Winkler et al. · University of St. Gallen · Conversational Chatbots; Online Learning & MOOC Platforms; Intelligent Tutoring Systems & Learning Analytics · CHI