"I Need to Find That One Chart": How Data Workers Navigate, Summarize and Communicate Analytical ConversationsConversational interfaces are increasingly used for data analysis, enabling data workers to express complex analytical intents in natural language. Yet, these interactions unfold as long, linear transcripts that are misaligned with the iterative, nonlinear nature of real-world analyses. Revisiting and summarizing conversations for different contexts is therefore challenging. This paper investigates how data workers navigate, make sense of, and communicate prior analytical conversations. To study behaviors beyond those supported by standard interfaces (i.e., scrolling and keyword search), we develop a design probe that supplements analytical conversations with structured elements and affordances (e.g., filtering, multi-level navigation and detail-on-demand). In a user study ($n = 10$), participants used the probe to navigate and communicate past analyses, fulfilling information needs (recall, reorient, prioritize) through navigation strategies (visual recall, sequential and abstractive) and summarization practices (adding process details and context). Based on these findings, we discuss design implications to support re-visitation and communication of analytical conversations.2026KGKen Gu et al.University of WashingtonUser Research Methods (Interviews, Surveys, Observation)Prototyping & User TestingInteractive Data VisualizationCHI
Lexara: A User-Centered Toolkit for Evaluating Large Language Models for Conversational Visual AnalyticsLarge Language Models (LLMs) are transforming Conversational Visual Analytics (CVA) by enabling data analysis through natural language. However, evaluating LLMs for CVA remains a challenge: requiring programming expertise, overlooking real-world complexity, and lacking interpretable metrics for multi-format (visualizations and text) outputs. Through interviews with 22 CVA developers and 16 end-users, we identified use-cases, evaluation criteria and workflows. We present Lexara, a user-centered evaluation toolkit for CVA that operationalizes these insights into: (i) test-cases spanning real-world scenarios; (ii) interpretable metrics covering visualization quality (data fidelity, semantic alignment, functional correctness, design clarity) and language quality (factual grounding, analytical reasoning, conversational coherence) using rule-based and LLM-as-a-judge methods; and (iii) an interactive toolkit enabling experimental setup and multi-format and multi-level exploration of results without programming expertise. We conducted a two-week diary study with six CVA developers, drawn from our initial cohort of 22. Their feedback demonstrated Lexara's effectiveness for guiding appropriate model and prompt selection.2026SPSrishti Palani et al.Tableau ResearchHuman-LLM CollaborationInteractive Data VisualizationExplainable AI (XAI)CHI
Pluto: Authoring Semantically Aligned Text and Charts for Data-Driven CommunicationTextual content (including titles, annotations, and captions) plays a central role in helping readers understand a visualization by emphasizing, contextualizing, or summarizing the depicted data. Yet, existing visualization tools provide limited support for jointly authoring the two modalities of text and visuals such that both convey semantically-rich information and are cohesively integrated. In response, we introduce Pluto, a mixed-initiative authoring system that uses features of a chart's construction (e.g., visual encodings) as well as any textual descriptions a user may have drafted to make suggestions about the content and presentation of the two modalities. For instance, a user can begin to type out a description and interactively brush a region of interest in the chart, and Pluto will generate a relevant auto-completion of the sentence. Similarly, based on a written description, Pluto may suggest lifting a sentence out as an annotation or the visualization's title, or may suggest applying a data transformation (e.g., sort) to better align the two modalities. A preliminary user study revealed that Pluto's recommendations were particularly useful for bootstrapping the authoring process and helped identify different strategies participants adopt when jointly authoring text and charts. Based on study feedback, we discuss design implications for integrating interactive verification features between charts and text, offering control over text verbosity and tone, and enhancing the bidirectional flow in unified text and chart authoring tools.2025ASArjun Srinivasan et al.Interactive Data VisualizationData StorytellingIUI
Plume: Scaffolding Text Composition in DashboardsText in dashboards plays multiple critical roles, including providing context, offering insights, guiding interactions, and summarizing key information. Despite its importance, most dashboarding tools focus on visualizations and offer limited support for text authoring. To address this gap, we developed Plume, a system to help authors craft effective dashboard text. Through a formative review of exemplar dashboards, we created a typology of text parameters and articulated the relationship between visual placement and semantic connections, which informed Plume’s design. Plume employs large language models (LLMs) to generate contextually appropriate content and provides guidelines for writing clear, readable text. A preliminary evaluation with 12 dashboard authors explored how assisted text authoring integrates into workflows, revealing strengths and limitations of LLM-generated text and the value of our human-in-the-loop approach. Our findings suggest opportunities to improve dashboard authoring tools by better supporting the diverse roles that text plays in conveying insights.2025MLMaxim Lisnic et al.University of UtahHuman-LLM CollaborationData StorytellingCHI
AI-Enabled Conversational Journaling for Advancing Parkinson's Disease Symptom TrackingJournaling plays a crucial role in managing chronic conditions by allowing patients to document symptoms and medication intake, providing essential data for long-term care. While valuable, traditional journaling methods often rely on static, self-directed entries, lacking interactive feedback and real-time guidance. This gap can result in incomplete or imprecise information, limiting its usefulness for effective treatment. To address this gap, we introduce PATRIKA, an AI-enabled prototype designed specifically for people with Parkinson's disease (PwPD). The system incorporates cooperative conversation principles, clinical interview simulations, and personalization to create a more effective and user-friendly journaling experience. Through two user studies with PwPD and iterative refinement of PATRIKA, we demonstrate conversational journaling's significant potential in patient engagement and collecting clinically valuable information. Our results showed that generating probing questions PATRIKA turned journaling into a bi-directional interaction. Additionally, we offer insights for designing journaling systems for healthcare and future directions for promoting sustained journaling.2025MRMashrur Rashik et al.University of Massachusetts AmherstHuman-LLM CollaborationChronic Disease Self-Management (Diabetes, Hypertension, etc.)Diet Tracking & Nutrition ManagementCHI
Jupybara: Operationalizing a Design Space for Actionable Data Analysis and Storytelling with LLMsMining and conveying actionable insights from complex data is a key challenge of exploratory data analysis (EDA) and storytelling. To address this challenge, we present a design space for actionable EDA and storytelling. Synthesizing theory and expert interviews, we highlight how semantic precision, rhetorical persuasion, and pragmatic relevance underpin effective EDA and storytelling. We also show how this design space subsumes common challenges in actionable EDA and storytelling, such as identifying appropriate analytical strategies and leveraging relevant domain knowledge. Building on the potential of LLMs to generate coherent narratives with commonsense reasoning, we contribute Jupybara, an AI-enabled assistant for actionable EDA and storytelling implemented as a Jupyter Notebook extension. Jupybara employs two strategies—design-space-aware prompting and multi-agent architectures—to operationalize our design space. An expert evaluation confirms Jupybara’s usability, steerability, explainability, and reparability, as well as the effectiveness of our strategies in operationalizing the design space framework with LLMs.2025HWHuichen Will Wang et al.University of Washington, Paul G. Allen School of Computer Science & Engineering; Tableau ResearchHuman-LLM CollaborationInteractive Data VisualizationData StorytellingCHI
SlopeSeeker: A Search Tool for Exploring a Dataset of Quantifiable TrendsNatural language and search interfaces intuitively facilitate data exploration and provide visualization responses to diverse analytical queries based on the underlying datasets. However, these interfaces often fail to interpret more complex analytical intents, such as discerning subtleties and quantifiable differences between terms like “bump” and “spike” in the context of COVID cases, for example. We address this gap by extending the capabilities of a data exploration search interface for interpreting semantic concepts in time series trends. We first create a comprehensive dataset of semantic concepts by mapping quantifiable univariate data trends such as slope and angle to crowdsourced, semantically meaningful trend labels. The dataset contains quantifiable properties that capture the slope-scalar effect of semantic modifiers like “sharply” and “gradually”, as well as multi-line trends (e.g., “peak,” “valley”). We demonstrate the utility of this dataset in SlopeSeeker, a tool that supports natural language querying of quantifiable trends, such as “show me stocks that tanked in 2010.” The tool incorporates novel scoring and ranking techniques based on semantic relevance and visual prominence to present relevant trend chart responses containing these semantic trend concepts. In addition, SlopeSeeker provides a faceted search interface for users to navigate a semantic hierarchy of concepts from general trends (e.g., “increase”) to more specific ones (e.g., “sharp increase”). A preliminary user evaluation of the tool demonstrates that the search interface supports greater expressivity of queries containing concepts that describe data trends. We identify potential future directions for leveraging our publicly available quantitative semantics dataset in other data domains and for novel visual analytics interfaces.2024ABAlexander Bendeck et al.Time-Series & Network Graph VisualizationData StorytellingVisualization Perception & CognitionIUI
Olio: A Semantic Search Interface for Data RepositoriesSearch and information retrieval systems are becoming more expressive in interpreting user queries beyond the traditional weighted bag-of-words model of document retrieval. For example, searching for a flight status or a game score returns a dynamically generated response along with supporting, pre-authored documents contextually relevant to the query. In this paper, we extend this hybrid search paradigm to data repositories that contain curated data sources and visualization content. We introduce a semantic search interface, OLIO, that provides a hybrid set of results comprising both auto-generated visualization responses and pre-authored charts to blend analytical question-answering with content discovery search goals. We specifically explore three search scenarios - question-and-answering, exploratory search, and design search over data repositories. The interface also provides faceted search support for users to refine and filter the conventional best-first search results based on parameters such as author name, time, and chart type. A preliminary user evaluation of the system demonstrates that OLIO's interface and the hybrid search paradigm collectively afford greater expressivity in how users discover insights and visualization content in data repositories.2023VSVidya Setlur et al.Interactive Data VisualizationData StorytellingUIST
How do you Converse with an Analytical Chatbot? Revisiting Gricean Maxims for Designing Analytical Conversational BehaviorChatbots have garnered interest as conversational interfaces for a variety of tasks. While general design guidelines exist for chatbot interfaces, little work explores analytical chatbots that support conversing with data. We explore Gricean Maxims to help inform the basic design of effective conversational interaction. We also draw inspiration from natural language interfaces for data exploration to support ambiguity and intent handling. We ran Wizard of Oz studies with 30 participants to evaluate user expectations for text and voice chatbot design variants. Results identified preferences for intent interpretation and revealed variations in user expectations based on the interface affordances. We subsequently conducted an exploratory analysis of three analytical chatbot systems (text + chart, voice + chart, voice-only) that implement these preferred design variants. Empirical evidence from a second 30-participant study informs implications specific to data-driven conversation such as interpreting intent, data orientation, and establishing trust through appropriate system responses.2022VSVidya Setlur et al.Tableau ResearchConversational ChatbotsHuman-LLM CollaborationVisualization Perception & CognitionCHI
Snowy: Recommending Utterances for Conversational Visual AnalysisNatural language interfaces (NLIs) have become a prevalent medium for conducting visual data analysis, enabling people with varying levels of analytic experience to ask questions of and interact with their data. While there have been notable improvements with respect to language understanding capabilities in these systems, fundamental user experience and interaction challenges including the lack of analytic guidance (i.e., knowing what aspects of the data to consider) and discoverability of natural language input (i.e., knowing how to phrase input utterances) persist. To address these challenges, we investigate utterance recommendations that contextually provide analytic guidance by suggesting data features (e.g., attributes, values, trends) while implicitly making users aware of the types of phrasings that an NLI supports. We present SNOWY, a prototype system that generates and recommends utterances for visual analysis based on a combination of data interestingness metrics and language pragmatics. Through a preliminary user study, we found that utterance recommendations in SNOWY support conversational visual analysis by guiding the participants' analytic workflows and making them aware of the system's language interpretation capabilities. Based on the feedback and observations from the study, we discuss potential implications and considerations for incorporating recommendations in future NLIs for visual analysis.2021ASArjun Srinivasan et al.Human-LLM CollaborationInteractive Data VisualizationUIST
Towards Understanding How Readers Integrate Charts and Captions: A Case Study with Line ChartsCharts often contain visually prominent features that draw attention to aspects of the data and include text captions that emphasize aspects of the data. Through a crowdsourced study, we explore how readers gather takeaways when considering charts and captions together. We first ask participants to mark visually prominent regions in a set of line charts. We then generate text captions based on the prominent features and ask participants to report their takeaways after observing chart-caption pairs. We find that when both the chart and caption describe a high-prominence feature, readers treat the doubly emphasized high-prominence feature as the takeaway; when the caption describes a low-prominence chart feature, readers rely on the chart and report a higher-prominence feature as the takeaway. We also find that external information that provides context, helps further convey the caption’s message to the reader. We use these findings to provide guidelines for authoring effective chart-caption pairs.2021DKDae Hyun Kim et al.Stanford University, Tableau ResearchInteractive Data VisualizationVisualization Perception & CognitionCHI
Sneak Pique: Exploring Autocompletion as a Data Discovery Scaffold for Supporting Visual AnalysisNatural language interaction has evolved as a useful modality to help users explore and interact with their data during visual analysis. Little work has been done to explore how autocompletion can help with data discovery while helping users formulate analytical questions. We developed a system called Sneak Pique as a design probe to better understand the usefulness of autocompletion for visual analysis. We ran three Mechanical Turk studies to evaluate user preferences for various text- and visualization widget-based autocompletion design variants for helping with partial search queries. Our findings indicate that users found data previews to be useful in the suggestions. Widgets were preferred for previewing temporal, geospatial, and numerical data while text autocompletion was preferred for categorical and hierarchical data. We conducted an exploratory analysis of our system implementing this specific subset of preferred autocompletion variants. Our insights regarding the efficacy of these autocompletion suggestions can inform the future design of natural language interfaces supporting visual analysis.2020VSVidya Setlur et al.Interactive Data VisualizationVisualization Perception & CognitionUIST
Inferencing Underspecified Natural Language Utterances in Visual AnalysisHandling ambiguity and underspecification of users’ utterances is challenging, particularly for natural language inter- faces that help with visual analytical tasks. Constraints in the underlying analytical platform and the users’ expectations of high precision and recall require thoughtful inferencing to help generate useful responses. In this paper, we introduce a system to resolve partial utterances based on syntactic and semantic constraints of the underlying analytical expressions. We extend inferencing based on best practices in information visualization to generate useful visualization responses. We employ heuristics to help constrain the solution space of possible inferences, and apply ranking logic to the interpretations based on relevancy. We evaluate the quality of inferred interpretations based on relevancy and analytical usefulness.2019VSVidya Setlur et al.Explainable AI (XAI)Interactive Data VisualizationIUI