Transparent Conversational Agents: The Impact of Capability Communication on User Behavior and Mental Model Alignment
When a user interacts with a conversational agent for the first time, they may not be aware of the agent's capabilities, leading to suboptimal use or interaction breakdowns. To avoid a mismatch with the actual capabilities, the agent's capabilities have to be made transparent to the user. To investigate whether communication of an agent's capabilities during interactions enhances transparency and improves the user's mental model, we conducted a user study with 56 participants. Each participant had three speech-based interactions with an agent that communicated its capabilities or an agent that did not. Our results suggest that the communication led to a change in user behavior with significantly longer utterances. However, the users' mental models of the agent's capabilities were not significantly different between the conditions. Participants were able to significantly improve their knowledge of the agent's capabilities by aligning their mental model over time in both conditions.
2025 | Merle M. Reimann et al. | Agent Personality & Anthropomorphism; Explainable AI (XAI); Privacy by Design & User Control | CUI

Seeing Eye to AI? Applying Deep-Feature-Based Similarity Metrics to Information Visualization
Judging the similarity of visualizations is crucial to various applications, such as visualization-based search and visualization recommendation systems. Recent studies show deep-feature-based similarity metrics correlate well with perceptual judgments of image similarity and serve as effective loss functions for tasks like image super-resolution and style transfer. We explore the application of such metrics to judgments of visualization similarity. We extend a similarity metric using five ML architectures and three pre-trained weight sets. We replicate results from previous crowdsourced studies on scatterplot and visual channel similarity perception. Notably, our metric using pre-trained ImageNet weights outperformed gradient-descent tuned MS-SSIM, a multi-scale similarity metric based on luminance, contrast, and structure. Our work contributes to understanding how deep-feature-based metrics can enhance similarity assessments in visualization, potentially improving visual analysis tools and techniques. Supplementary materials are available at https://osf.io/dj2ms/.
2025 | Sheng Long et al. | Northwestern University, Computer Science | Recommender System UX; Interactive Data Visualization; Visualization Perception & Cognition | CHI

Unhealthy Comparisons to Promote Healthy Behavior? Exploring the Impact of Social Comparison Strategies in Personal Informatics.
Previous work on Social Comparison Theory shows that comparing oneself to others can lead to negative self-perceptions and rumination, reducing self-confidence. Despite these harmful effects, social comparisons are frequently used as engagement strategies in personal informatics systems, such as health and wellness apps. There is limited understanding of how users perceive these comparisons and their impact on wellbeing. To address this, we reviewed the Top 50 Health & Wellness smartphone applications to analyse implemented comparison strategies and the metrics such comparisons are used for. We conducted an online vignette study (n=192) and an interview study (n=12) to further explore the impact of social comparisons on users. Our study shows that comparisons in personal informatics motivate users but simultaneously lead to negative emotions (e.g., inferiority, disappointment), potentially leading to obsessive thoughts and overtraining. Based on our findings, we propose design guidelines for implementing social comparison features that prioritise users’ wellbeing.
2025 | Daphne van Zandvoort et al. | Utrecht University | Fitness Tracking & Physical Activity Monitoring; Sleep & Stress Monitoring | CHI

Progression Balancing × Baldur’s Gate 3: Insights, Terms and Tools for Multi-Dimensional Video Game Balance
Internal game balancing is one of the major components that affect player experience, as it is responsible for a large share of development time, the majority of game update patches and long-term player satisfaction. This makes tools and methodologies of assessing and advancing game balance a valuable endeavor for industry and academia. During the past decades, scientific research produced numerous outputs to inform and enhance game balancing, yet most of them only adhere to a single dimension of balance: fixed (end-game) scenarios. However, games are usually experienced throughout a continuous spectrum of ever-changing constellations, which should be reflected. Using simulation, game-playing AI, visual analytics and informative metrics, we introduce a methodology and implementation of Progression Balancing, incorporating multi-dimensional game aspects. For the sake of exposition and ecological validity, we applied it in one of the most successful recent games (Baldur's Gate 3), and evaluated its efficacy with help of its player community.
2025 | Johannes Pfau | Utrecht University | Game UX & Player Behavior; Serious & Functional Games | CHI

Characterizing Photorealism and Artifacts in Diffusion Model-Generated Images
Diffusion model-generated images can appear indistinguishable from authentic photographs, but these images often contain artifacts and implausibilities that reveal their AI-generated provenance. Given the challenge to public trust in media posed by photorealistic AI-generated images, we conducted a large-scale experiment measuring human detection accuracy on 450 diffusion-model generated images and 149 real images. Based on collecting 749,828 observations and 34,675 comments from 50,444 participants, we find that scene complexity of an image, artifact types within an image, display time of an image, and human curation of AI-generated images all play significant roles in how accurately people distinguish real from AI-generated images. Additionally, we propose a taxonomy characterizing artifacts often appearing in images generated by diffusion models. Our empirical observations and taxonomy offer nuanced insights into the capabilities and limitations of diffusion models to generate photorealistic images in 2024.
2025 | Negar Kamali et al. | Northwestern University, Computer Science | Generative AI (Text, Image, Music, Video); Explainable AI (XAI); Deepfake & Synthetic Media Detection | CHI

Haptic Biosignals Affect Proxemics Toward Virtual Reality Agents
Encounters with virtual agents currently lack the haptic viscerality of human contact. While digital biosignal communication can mediate such virtual social interactions, how artificial haptic biosignals influence users’ personal space during Virtual Reality (VR) experiences is unknown. Designing vibrotactile heartbeats and thermally-actuated body temperature, we ran a within-subjects study (N=31) to investigate feedback (Thermal, Vibration, Thermal+Vibration, None) and agent stories (Negative, Neutral, Positive) on objective and subjective interpersonal distance (IPD), perceived arousal and comfort, presence, and post-experience responses. Findings showed that thermal feedback decreased objective but not subjective IPD, whereas vibrotactile heartbeats (signaling agent's closeness) increased both while heightening arousal and discomfort. Agents' stories did not affect IPD, arousal, or comfort. Our qualitative findings shed light on signal ambiguity and presence constructs within VR-based haptic stimulation. We contribute insights into artificial biosignals and their influence on VR proxemics, with cautionary considerations should the boundaries blur between physical and virtual touch.
2025 | Simone Ooms et al. | Utrecht University, Human-Centred Computing; Centrum Wiskunde & Informatica, Distributed & Interactive Systems | Vibrotactile Feedback & Skin Stimulation; Social & Collaborative VR; Immersion & Presence Research | CHI

Developing a Social Support Framework: Understanding the Reciprocity in Human-Chatbot Relationship
Chatbots are increasingly used to provide social support for individuals with mental health challenges. However, a systematic analysis of the types and directionality of support within chatbot use remains lacking. This study establishes a framework for understanding reciprocal social support exchanges in human-chatbot relationships, focusing on the popular chatbot, Replika. By analyzing 496 posts and 20,494 comments from the largest Replika community on Reddit, we identified 27 support subcategories, organized into five main types (functional, informational, emotional, esteem, and network) and two directions (chatbot-receiving and chatbot-giving). Our findings reveal significant yet controversial issues, such as subscription services and chatbot-displayed affection. Notably, "user teaching chatbot" emerged as a core aspect of the human-chatbot relationship, covering how users actively guide and refine the chatbot’s learning or algorithm. This study constructs a novel social support framework for chatbot use, highlighting the potential for reciprocal support exchanges between users and chatbots.
2025 | Shuyi Pan et al. | Shanghai Jiao Tong University; Utrecht University | Conversational Chatbots; Agent Personality & Anthropomorphism; Mental Health Apps & Online Support Communities | CHI

Changing Lanes Toward Open Science: Openness and Transparency in Automotive User Research
We review the state of open science and the perspectives on open data sharing within the automotive user research community. Openness and transparency are critical not only for judging the quality of empirical research, but also for accelerating scientific progress and promoting an inclusive scientific community. However, there is little documentation of these aspects within the automotive user research community. To address this, we report two studies that identify (1) community perspectives on motivators and barriers to data sharing, and (2) how openness and transparency have changed in papers published at AutomotiveUI over the past 5 years. We show that while open science is valued by the community and openness and transparency have improved, overall compliance is low. The most common barriers are legal constraints and confidentiality concerns. Although research published at AutomotiveUI relies more on quantitative methods than research published at CHI, openness and transparency are not as well established. Based on our findings, we provide suggestions for improving openness and transparency, arguing that the motivators for open science must outweigh the barriers. All supporting materials are freely available at: https://osf.io/zdpek/
2024 | Patrick Ebel et al. | Research Ethics & Open Science | AutoUI

Text a Bit Longer or Drive Now? Resuming Driving after Texting in Conditionally Automated Cars
In this study, we focus on different strategies drivers use in terms of interleaving between driving and non-driving related tasks (NDRT) while taking back control from automated driving. We conducted two driving simulator experiments to examine how different cognitive demands of texting, priorities, and takeover time budgets affect drivers' takeover strategies. We also evaluated how different takeover strategies affect takeover performance. We found that the choice of takeover strategy was influenced by the priority and takeover time budget but not by the cognitive demand of the NDRT. The takeover strategy did not have any effect on takeover quality or NDRT engagement but influenced takeover timing.
2024 | Nabil Al Nahin Ch et al. | Automated Driving Interface & Takeover Design | AutoUI

Requirements and Attitudes towards Explainable AI in Law Enforcement
In high-stakes areas such as law enforcement, where artificial intelligence has the potential to enhance effectiveness and inclusivity, its decisions must be both informed and accountable. Thus, designing explainable artificial intelligence (XAI) for such settings is a key social concern. Yet, explanations in practice are often overly technical or abstract. To address this, our study engaged with police employees in an EU country, who are users of a text classifier. We found that for them, usability and usefulness are paramount in explanation design, whereas interpretability and understandability are less emphasized. Drawing from these insights, we suggest design guidelines centred on clarity and relevance for domain experts. We contribute recommendations which guide XAI system designers to better cater to the specific needs of specialized users and promote the responsible use of AI tools in public service.
2024 | Elize Herrewijnen et al. | Explainable AI (XAI); AI Ethics, Fairness & Accountability | DIS

Effects of a Gaze-Based 2D Platform Game on User Enjoyment, Perceived Competence, and Digital Eye Strain
Gaze interaction is a promising interaction method to increase variety, challenge, and fun in games. We present “Shed Some Fear”, a 2D platform game including numerous eye-gaze-based interactions. Shed Some Fear includes control with eye-gaze and traditional keyboard input. The eye-gaze interactions are partially based on eye exercises reducing digital eye strain but also on employing peripheral vision. By employing eye-gaze as a necessary input mechanism, we explore the effects on and tradeoffs between user enjoyment and digital eye strain in a five-day longitudinal between-subject study (N=17) compared to interaction with a traditional mouse. We found that eye-gaze interaction led to significantly higher perceived competence and significantly higher internal eye strain. With this work, we contribute to the not straightforward inclusion of eye tracking as a useful and fun input method for games.
2024 | Mark Colley et al. | Ulm University, Cornell Tech | Eye Tracking & Gaze Interaction; Game UX & Player Behavior | CHI

“It doesn’t tell me anything about how my data is used”: User Perceptions of Data Collection Purposes
Data collection purposes and their descriptions are presented on almost all privacy notices under the GDPR, yet there is a lack of research focusing on how effective they are at informing users about data practices. We fill this gap by investigating users’ perceptions of data collection purposes and their descriptions, a crucial aspect of informed consent. We conducted 23 semi-structured interviews with European users to investigate user perceptions of six common purposes (Strictly Necessary, Statistics and Analytics, Performance and Functionality, Marketing and Advertising, Personalized Advertising, and Personalized Content) and identified elements of an effective purpose name and description. We found that most purpose descriptions do not contain the information users wish to know, and that participants preferred some purpose names over others due to their perceived transparency or ease of understanding. Based on these findings, we suggest how the framing of purposes can be improved toward meaningful informed consent.
2024 | Lin Kyi et al. | Max Planck Institute for Security and Privacy | Privacy by Design & User Control; Privacy Perception & Decision-Making | CHI

Was it Real or Virtual? Confirming the Occurrence and Explaining Causes of Memory Source Confusion between Reality and Virtual Reality
Source confusion occurs when individuals attribute a memory to the wrong source (e.g., confusing a picture with an experienced event). Virtual Reality (VR) represents a new source of memories particularly prone to being confused with reality. While previous research identified causes of source confusion between reality and other sources (e.g., imagination, pictures), there is currently no understanding of what characteristics specific to VR (e.g., immersion, presence) could influence source confusion. Through a laboratory study (n=29), we 1) confirm the existence of VR source confusion with current technology, and 2) present a quantitative and qualitative exploration of factors influencing VR source confusion. Building on the Source Monitoring Framework, we identify VR characteristics and assumptions about VR capabilities (e.g., poor rendering) that are used to distinguish virtual from real memories. From these insights, we reflect on how the increasing realism of VR could leave users vulnerable to memory errors and perceptual manipulations.
2024 | Elise Bonnail et al. | Institut Polytechnique de Paris | Eye Tracking & Gaze Interaction; Immersion & Presence Research | CHI

An Eye Gaze Heatmap Analysis of Uncertainty Head-Up Display Designs for Conditional Automated Driving
This paper reports results from a high-fidelity driving simulator study (N=215) about a head-up display (HUD) that conveys a conditional automated vehicle’s dynamic “uncertainty” about the current situation while fallback drivers watch entertaining videos. We compared (between-group) three design interventions: display (a bar visualisation of uncertainty close to the video), interruption (interrupting the video during uncertain situations), and combination (a combination of both), against a baseline (video-only). We visualised eye-tracking data to conduct a heatmap analysis of the four groups’ gaze behaviour over time. We found interruptions initiated a phase during which participants interleaved their attention between monitoring and entertainment. This improved monitoring behaviour was more pronounced in combination compared to interruption, suggesting pre-warning interruptions have positive effects. The same addition had negative effects without interruptions (comparing baseline & display). Intermittent interruptions may have safety benefits over placing additional peripheral displays without compromising usability.
2024 | Michael A. Gerber et al. | QUT, msg systems AG | Head-Up Display (HUD) & Advanced Driver Assistance Systems (ADAS); Eye Tracking & Gaze Interaction | CHI

Damage Optimization in Video Games: A Player-Driven Co-Creative Approach
The concept of dealing damage is established and widespread in video games. With growing complexity and countless interactions in modern games, capturing how damage unfolds becomes an intricate problem - for developers just as for players. Misunderstanding how to optimize damage potentials includes risks of game imbalances, game-breaking exploits, mismatches between player skill and challenge (harming flow), and impaired perceived competence. All of these considerably impact player experience, game reception, success, and retention, yet polishing optimal strategies remains often a player community effort. To accelerate, inform and ease this process, we implemented an interactive tool capable of simulating, visualizing, planning and comparing damage strategies in video games. Following a case study within the Guild Wars 2 community, we contribute a player-driven perspective on the problem of damage optimization, as well as an artifact that resulted in empirical improvements – advancing the fields of game analytics, game evaluation methods and self-regulated learning.
2024 | Johannes Pfau et al. | Utrecht University, University of California Santa Cruz | Game UX & Player Behavior; Gamification Design; Multiplayer & Social Games | CHI

Cheat Codes as External Support for Players Navigating Fear of Failure and Self-Regulation Challenges In Digital Games
Failure is an integral element of most games, and while some players may benefit from external support, such as cheat codes, to prompt self-soothing, most games lack supportive elements. We asked participants (N=88) to play Anno 1404 in single-player mode, and presented a money-generating cheat code in a challenging situation, also measuring the personality trait of action-state orientation, which explains differences in self-regulation ability (i.e., self-soothing) in response to threats of failure. Individuals higher in state orientation were more likely to take the offer, and used the cheat code more frequently. The cheat code also acted as an external support, as differences in experienced pressure between action- and state-oriented participants vanished when it was used. We found no negative consequences of using external support in intrinsic motivation, needs satisfaction, flow, or performance. We argue that external support mechanisms can help state-oriented players to self-regulate in gaming, when faced with failure.
2024 | Karla Waldenmeier et al. | University of Trier | Game UX & Player Behavior; Serious & Functional Games; Gamification Design | CHI

An Ontology of Dark Patterns Knowledge: Foundations, Definitions, and a Pathway for Shared Knowledge-Building
Deceptive and coercive design practices are increasingly used by companies to extract profit, harvest data, and limit consumer choice. Dark patterns represent the most common contemporary amalgamation of these problematic practices, connecting designers, technologists, scholars, regulators, and legal professionals in transdisciplinary dialogue. However, a lack of universally accepted definitions across the academic, legislative, practitioner, and regulatory space has likely limited the impact that scholarship on dark patterns might have in supporting sanctions and evolved design practices. In this paper, we seek to support the development of a shared language of dark patterns, harmonizing ten existing regulatory and academic taxonomies of dark patterns and proposing a three-level ontology with standardized definitions for 64 synthesized dark pattern types across low-, meso-, and high-level patterns. We illustrate how this ontology can support translational research and regulatory action, including transdisciplinary pathways to extend our initial types through new empirical work across application and technology domains.
2024 | Colin M. Gray et al. | Indiana University | AI Ethics, Fairness & Accountability; Dark Patterns Recognition | CHI

Toxicity in Online Games: The Prevalence and Efficacy of Coping Strategies
Toxicity is pervasive in online multiplayer games, exposing players to disruptive and harmful behaviours. Players employ various approaches to cope with exposure to toxicity; however, game designers and researchers lack guidance on how to implement coping support within games. In this paper, we first conduct a formative study to collect a comprehensive list of coping approaches from toxicity literature and use affinity mapping to identify overarching game-based coping strategies. Then, we report findings from a survey (n = 85) on players’ experiences with toxicity, how they employ the identified coping strategies, how games support coping, and their general coping styles. Our paper contributes a framework for coping strategies to deal with game-based toxicity and provides insights into the prevalence of these strategies among players and factors that affect their usage and effectiveness. These findings can be used to guide better in-game tools that help players mitigate the harm caused by toxicity.
2024 | Julian Frommel et al. | Utrecht University | Multiplayer & Social Games; Cyberbullying & Online Harassment | CHI

V-FRAMER: Visualization Framework for Mitigating Reasoning Errors in Public Policy
Existing data visualization design guidelines focus primarily on constructing grammatically-correct visualizations that faithfully convey the values and relationships in the underlying data. However, a designer may create a grammatically-correct visualization that still leaves audiences susceptible to reasoning misleaders, e.g. by failing to normalize data or using unrepresentative samples. Reasoning misleaders are especially pernicious when presenting public policy data, where data-driven decisions can affect public health, safety, and economic development. Through textual analysis, a formative evaluation, and iterative design with 19 policy communicators, we construct an actionable visualization design framework, V-FRAMER, that effectively synthesizes ways of mitigating reasoning misleaders. We discuss important design considerations for frameworks like V-FRAMER, including using concrete examples to help designers understand reasoning misleaders, and using a hierarchical structure to support example-based accessing. We further describe V-FRAMER's congruence with current practice and how practitioners might integrate the framework into their existing workflows. Related materials available at: https://osf.io/q3uta/.
2024 | Lily W. Ge et al. | Northwestern University | Explainable AI (XAI); Uncertainty Visualization | CHI

The Effect of Spatial Audio on Curvature Gains in VR Redirected Walking
Redirected walking (RDW) is a technique that allows users to navigate larger physical spaces in virtual reality (VR) environments by manipulating the users' view of the virtual world. In this study, we investigate the effect of adding spatial audio elements to curvature gains in RDW aiming to increase the perceptual threshold for the manipulation and allowing for higher levels of unnoticed redirection. We conducted a user study (n = 18), evaluating perceptual thresholds across conditions with and without spatial audio elements across different curvature gains. We found that spatial audio can significantly increase thresholds with a large effect size. This finding indicates the value of spatial audio for RDW. It could facilitate higher levels of redirection, while maintaining a convincing experience, leading to more freedom to navigate virtual environments in even smaller physical spaces.
2024 | Maarten Gerritse et al. | Utrecht University | Immersion & Presence Research; Context-Aware Computing | CHI