StepMIND: A Visual Framework for Stepwise, Multimodal, and Bidirectional Explanations of AI-Generated Data Analysis Pipeline
Artificial intelligence (AI) enables users to generate data visualizations from natural language descriptions, lowering the barrier to data exploration. However, AI-generated visualizations often present only the final output, lacking transparency and limiting users' ability to verify, interpret, or refine the results. To address this, we introduce StepMIND, a generalizable visual framework that enhances explainability and interactivity in AI-generated data analysis pipelines. StepMIND integrates four dimensions: (1) Stepwise Refinement, allowing users to engage in the AI decision process; (2) Multimodal Explanations, combining natural language, structured notation, direct manipulation, and content visualization for accessible interpretation; (3) Bidirectional Editing, enabling seamless updates across modalities; and (4) Familiar Interaction Models, such as code editor and spreadsheet-based manipulations, to support both technical and non-technical users. To demonstrate its utility, we apply StepMIND in Stage, a case study system for AI-assisted data visualization. A within-subject user study (N=20) shows that Stage significantly improves user confidence and trust, reduces cognitive load, and facilitates both exploratory and corrective refinements. Our findings further suggest that StepMIND can generalize to broader AI-assisted workflows, offering a visible and interactive approach to explainable AI.
2026. Yang Wu et al., ETH Zurich. Topics: Explainable AI (XAI); Interactive Data Visualization; AI-Assisted Decision-Making & Automation. Venue: IUI.
NetworkCanvas: Supporting Progressive Network Visualization Exploration via Adaptive Recommendations
Network visualization has become essential for understanding complex relationships across domains, yet network complexity creates an overwhelming exploration space where users frequently miss critical patterns. Existing tools often require predetermined analysis goals and manual workflow construction, limiting accessibility for non-experts. We present NetworkCanvas, a progressive network visualization system that guides users through personalized exploration via adaptive recommendations. Our approach combines a learning mechanism that adapts to user feedback, an analytic state graph preserving exploration provenance with branching paths, and a context-aware feedback interpreter that suggests analytical continuations based on selection patterns. Controlled studies demonstrate that NetworkCanvas users identified more noteworthy observations, reported higher confidence, and exhibited more systematic exploration compared to a baseline without recommendations. These results demonstrate that recommendation-guided exploration improves outcomes over unguided manual analysis; however, because our baseline lacked recommendation functionality entirely, the specific contribution of adaptive personalization versus static guidance remains an open question. Qualitative findings suggest that recommendations reduce analysis paralysis and support systematic exploration.
2026. Wenchao Li et al., HUAWEI TECHNOLOGIES CO., LTD. Topics: Interactive Data Visualization; Exploratory Search & Information Seeking; Knowledge Graph & Semantic Search. Venue: CHI.
Dark Patterns Meet GUI Agents: LLM Agent Susceptibility to Manipulative Interfaces and the Role of Human Oversight
Dark patterns, deceptive interface designs that manipulate user behavior, have been extensively studied for their effects on human decision-making and autonomy. Yet, with the rising prominence of LLM-powered GUI agents that automate tasks from high-level intents, understanding how dark patterns affect agents is increasingly important. We present a two-phase empirical study examining how agents, human participants, and human-AI teams respond to 16 types of dark patterns across diverse scenarios. Phase 1 shows that agents often fail to recognize dark patterns and, even when aware of them, prioritize task completion over protective action. Phase 2 reveals divergent failure modes: humans succumb due to cognitive shortcuts and habitual compliance, while agents falter from procedural blind spots. Human oversight improved avoidance but introduced costs such as attentional tunneling and cognitive load. Our findings show that neither humans nor agents are uniformly resilient and that collaboration introduces new vulnerabilities, suggesting design needs for transparency, adjustable autonomy, and oversight.
2026. Jingyu Tang et al., University of Notre Dame. Topics: Dark Patterns Recognition; Human-LLM Collaboration; AI-Assisted Decision-Making & Automation. Venue: CHI.
Dango: A Mixed-Initiative Data Wrangling System using Large Language Model
Data wrangling is a time-consuming and challenging task in the early stages of a data science pipeline. However, existing tools often fail to effectively interpret user intent. We propose Dango, a mixed-initiative multi-agent system that helps users generate data wrangling scripts. Compared to existing tools, Dango enhances user communication of intent by: (1) allowing users to demonstrate on multiple tables and use natural language prompts in a conversation interface, (2) enabling users to clarify their intent by answering LLM-posed multiple-choice clarification questions, and (3) providing multiple forms of feedback such as step-by-step NL explanations and data provenance to help users evaluate the data wrangling scripts. In a within-subjects, think-aloud study (n=38), the results show that Dango's features can significantly improve intent clarification, accuracy, and efficiency in data wrangling tasks.
2025. Wei-Hao Chen et al., Purdue University, Computer Science. Topics: Human-LLM Collaboration; AI-Assisted Decision-Making & Automation; Interactive Data Visualization. Venue: CHI.
Vision-Based Multimodal Interfaces: A Survey and Taxonomy for Enhanced Context-Aware System Design
The recent surge in artificial intelligence, particularly in multimodal processing technology, has advanced human-computer interaction by altering how intelligent systems perceive, understand, and respond to contextual information (i.e., context awareness). Despite such advancements, there is a significant gap in comprehensive reviews examining these advances, especially from a multimodal data perspective, which is crucial for refining system design. This paper addresses a key aspect of this gap by conducting a systematic survey of data modality-driven Vision-based Multimodal Interfaces (VMIs). VMIs are essential for integrating multimodal data, enabling more precise interpretation of user intentions and complex interactions across physical and digital environments. Unlike previous task- or scenario-driven surveys, this study highlights the critical role of the visual modality in processing contextual information and facilitating multimodal interaction. Adopting a design framework moving from the whole to the details and back, it classifies VMIs across dimensions, providing insights for developing effective, context-aware systems.
2025. Yongquan 'Owen' Hu et al., University of New South Wales. Topics: Context-Aware Computing; Ubiquitous Computing. Venue: CHI.
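Reenvisioning Patient Education with Smart Hospital Patient Rooms
Dawson et al. propose a new patient education system for smart hospital patient rooms that uses interactive interfaces and real-time data displays to improve patients' health literacy and treatment adherence and to enhance the care experience.
2024. Joshua Dawson et al. Topics: Intelligent Tutoring Systems & Learning Analytics; Mental Health Apps & Online Support Communities; Telemedicine & Remote Patient Monitoring. Venue: UbiComp.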
Echo: Reverberation-based Fast Black-Box Adversarial Attacks on Intelligent Audio Systems
Intelligent audio systems, such as speech command recognition and speaker recognition, are ubiquitous in our lives. However, deep learning-based intelligent audio systems have been shown to be vulnerable to adversarial attacks. In this paper, we propose a physical adversarial attack that exploits reverberation, a natural indoor acoustic effect, to realize imperceptible, fast, and targeted black-box attacks. Unlike existing attacks that constrain the magnitude of adversarial perturbations within a fixed radius, we generate reverberation-like perturbations that blend naturally with the original voice sample. Additionally, we can generate more robust adversarial examples even under over-the-air propagation by considering distortions in the physical environment. Extensive experiments are conducted using two popular intelligent audio systems in various situations, such as different room sizes, distances, and ambient noise levels. The results show that Echo can attack intelligent audio systems in both digital and physical over-the-air environments. https://doi.org/10.1145/3610874
2023. Meng Xue et al. Topics: Privacy by Design & User Control. Venue: UbiComp.
Phrase-Gesture Typing on Smartphones
We study phrase-gesture typing, a gesture typing method that allows users to type short phrases by swiping through all the letters of the words in a phrase using a single, continuous gesture. Unlike word-gesture typing, where text needs to be entered word by word, phrase-gesture typing enters text phrase by phrase. To demonstrate the usability of phrase-gesture typing, we implemented a prototype called PhraseSwipe. Our system is composed of a frontend interface designed specifically for typing through phrases and a backend phrase-level gesture decoder developed based on a transformer-based neural language model. Our decoder was trained using five million phrases of varying lengths of up to five words, chosen randomly from the Yelp Review Dataset. Through a user study with 12 participants, we demonstrate that participants could type using PhraseSwipe at an average speed of 34.5 WPM with a Word Error Rate of 1.1%.
2022. Zheer Xu et al. Topics: Voice User Interface (VUI) Design; Generative AI (Text, Image, Music, Video). Venue: UIST.