Sculpin: Direct-Manipulation Transformation of JSON
Joshua Horowitz et al. UIST 2025.
Many end-user programming tasks require programmatically processing JSON, wrangling it from one format to another or building interactive applications atop it. But end users are impeded by the indirectness and steep learning curve of textual code. We present Sculpin, a direct-manipulation environment supporting a broad range of JSON-transformation tasks. A user of Sculpin transforms JSON data step by step, recording a program in the process. Sculpin makes three design commitments to ensure directness and versatility: (1) steps are small and precise, not inferred; (2) steps are general-purpose and open to re-appropriation; (3) steps operate on JSON itself, rather than on a limited intermediate representation. To support these commitments, Sculpin introduces a mechanism of sculptable selections: the user can direct their action by guiding a selection on top of the data through small steps like generalization and hierarchical navigation. Sculpin also extends JSON with embedded interface elements like form inputs and buttons, allowing applications to be sculpted incrementally from source data. We demonstrate the breadth and directness of Sculpin in use cases ranging from wrangling data to building applications. We evaluate Sculpin through a heuristic analysis, situating it in a broad space of programming systems and surfacing limitations such as difficulties editing preexisting programs.

From Pen to Prompt: How Creative Writers Integrate AI into their Writing Practice
Alicia Guo et al. C&C 2025.
Creative writing is a deeply human craft, yet AI systems using large language models (LLMs) offer the automation of significant parts of the writing process. So why do some creative writers choose to use AI? Through interviews and observed writing sessions with 18 creative writers who already use AI regularly in their writing practice, we find that creative writers are intentional about how they incorporate AI, making many deliberate decisions about when and how to engage AI based on their core values, such as authenticity and craftsmanship. We characterize the interplay between writers' values, their fluid relationships with AI, and specific integration strategies---ultimately enabling writers to create new AI workflows without compromising their creative values. We provide insight for writing communities, AI developers, and future researchers on the importance of supporting transparency of these emerging writing processes and rethinking what AI features can best serve writers.

rTisane: Externalizing conceptual models for data analysis increases engagement with domain knowledge and improves statistical model quality
Eunice Jun et al., University of California, Los Angeles. CHI 2024.
Statistical models should accurately reflect analysts’ domain knowledge about variables and their relationships. While recent tools let analysts express these assumptions and use them to produce a resulting statistical model, it remains unclear what analysts want to express and how externalization impacts statistical model quality. This paper addresses these gaps. We first conduct an exploratory study of analysts using a domain-specific language (DSL) to express conceptual models. We observe a preference for detailing how variables relate and a desire to allow, and then later resolve, ambiguity in their conceptual models. We leverage these findings to develop rTisane, a DSL for expressing conceptual models augmented with an interactive disambiguation process. In a controlled evaluation, we find that analysts reconsidered their assumptions, self-reported externalizing their assumptions accurately, and maintained analysis intent with rTisane. Additionally, rTisane enabled some analysts to author statistical models they were unable to specify manually. For others, rTisane resulted in models that better fit the data or enabled iterative improvement.

Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM
Michelle S. Lam et al., Stanford University. CHI 2024.
Data analysts have long sought to turn unstructured text data into meaningful concepts. Though common, topic modeling and clustering focus on lower-level keywords and require significant interpretative work. We introduce concept induction, a computational process that instead produces high-level concepts, defined by explicit inclusion criteria, from unstructured text. For a dataset of toxic online comments, where a state-of-the-art BERTopic model outputs “women, power, female,” concept induction produces high-level concepts such as “Criticism of traditional gender roles” and “Dismissal of women's concerns.” We present LLooM, a concept induction algorithm that leverages large language models to iteratively synthesize sampled text and propose human-interpretable concepts of increasing generality. We then instantiate LLooM in a mixed-initiative text analysis tool, enabling analysts to shift their attention from interpreting topics to engaging in theory-driven analysis. Through technical evaluations and four analysis scenarios ranging from literature review to content moderation, we find that LLooM’s concepts improve upon the prior art of topic models in terms of quality and data coverage. In expert case studies, LLooM helped researchers to uncover new insights even from familiar datasets, for example by suggesting a previously unnoticed concept of attacks on out-party stances in a political social media dataset.

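The loop the abstract describes — sample texts, ask an LLM to propose concepts with explicit inclusion criteria, then score those criteria over the full dataset — can be sketched as follows. This is an illustrative sketch, not LLooM's actual API; `propose_concepts` is a hypothetical stand-in for the LLM call, and all names are invented for this example.

```python
import random

def propose_concepts(texts):
    """Stub standing in for an LLM call that reads sampled texts and
    proposes high-level concepts, each with an explicit inclusion
    criterion (here, a simple keyword predicate)."""
    concepts = []
    if any("gender" in t for t in texts):
        concepts.append(("Criticism of traditional gender roles",
                         lambda t: "gender" in t))
    if any("dismiss" in t for t in texts):
        concepts.append(("Dismissal of women's concerns",
                         lambda t: "dismiss" in t))
    return concepts

def induce(dataset, sample_size=4, rounds=3, seed=0):
    """Iteratively sample texts, propose concepts, and score each
    concept's coverage over the full dataset."""
    rng = random.Random(seed)
    concepts = {}
    for _ in range(rounds):
        sample = rng.sample(dataset, min(sample_size, len(dataset)))
        for name, criterion in propose_concepts(sample):
            matched = [t for t in dataset if criterion(t)]
            concepts[name] = len(matched) / len(dataset)  # coverage score
    return concepts

comments = [
    "gender roles are outdated",
    "they always dismiss her point",
    "nice weather today",
    "gender stereotypes again",
]
scores = induce(comments)
```

The real system replaces the keyword stub with LLM synthesis and applies the inclusion criteria with an LLM as well; the sketch only shows the sample–propose–score control flow.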
How Do Data Analysts Respond to AI Assistance? A Wizard-of-Oz Study
Ken Gu et al., Paul G. Allen School of Computer Science & Engineering, University of Washington. CHI 2024.
Data analysis is challenging as analysts must navigate nuanced decisions that may yield divergent conclusions. AI assistants have the potential to support analysts in planning their analyses, enabling more robust decision making. Though AI-based assistants that target code execution (e.g., GitHub Copilot) have received significant attention, limited research addresses assistance for both analysis execution and planning. In this work, we characterize helpful planning suggestions and their impacts on analysts’ workflows. We first review the analysis planning literature and crowd-sourced analysis studies to categorize suggestion content. We then conduct a Wizard-of-Oz study (n=13) to observe analysts’ preferences and reactions to planning assistance in a realistic scenario. Our findings highlight subtleties in contextual factors that impact suggestion helpfulness, emphasizing design implications for supporting different abstractions of assistance, forms of initiative, increased engagement, and alignment of goals between analysts and assistants.

Live, Rich, and Composable Programming with Engraft
Joshua Horowitz et al. UIST 2023.
Live & rich tools can support a diversity of domain-specific programming tasks, from visualization authoring to data wrangling. Real-world programming, however, requires performing multiple tasks in concert, calling for the use of multiple tools alongside conventional code. Programmers lack environments capable of composing live & rich tools to support these situations. To enable this composition, we contribute Engraft, a component-based API that allows live & rich tools to be embedded within larger environments like computational notebooks. Through recursive embedding of components, Engraft enables several new forms of composition: not only embedding tools inside environments, but also embedding environments within each other and embedding tools and environments in the outside world, including conventional codebases. We demonstrate Engraft with examples from diverse domains, including web-application development, command-line scripting, and physics education. By providing composability, Engraft can help cultivate a cycle of use and innovation in live & rich programming.

Living Papers: A Language Toolkit for Augmented Scholarly Communication
Jeffrey Heer et al. UIST 2023.
Computing technology has deeply shaped how academic articles are written and produced, yet article formats and affordances have changed little over centuries. The status quo consists of digital files optimized for printed paper—ill-suited to interactive reading aids, accessibility, dynamic figures, or easy information extraction and reuse. Guided by formative discussions with scholarly communication researchers and publishing tool developers, we present Living Papers, a language toolkit for producing augmented academic articles that span print, interactive, and computational media. Living Papers articles may include formatted text, references, executable code, and interactive components. Articles are parsed into a standardized document format from which a variety of outputs are generated, including static PDFs, dynamic web pages, and extraction APIs for paper content and metadata. We describe Living Papers' architecture, document model, and reactive runtime, and detail key aspects such as citation processing and conversion of interactive components to static content. We demonstrate the use and extension of Living Papers through examples spanning traditional research papers, explorable explanations, information extraction, and reading aids such as enhanced citations, cross-references, and equations. Living Papers is available as an extensible, open source platform intended to support both article authors and researchers of augmented reading and writing experiences.

Visualizing Urban Accessibility: Investigating Multi-Stakeholder Perspectives through a Map-based Design Probe Study
Manaswi Saha et al., University of Washington. CHI 2022.
Urban accessibility assessments are challenging: they involve varied stakeholders across decision-making contexts while serving a diverse population of people with disabilities. To better support urban accessibility assessment using data visualizations, we conducted a three-part interview study with 25 participants across five stakeholder groups using map visualization probes. We present a multi-stakeholder analysis of visualization needs and sensemaking processes to explore how interactive visualizations can support stakeholder decision making. In particular, we elaborate how stakeholders’ varying levels of familiarity with accessibility, geospatial analysis, and specific geographic locations influences their sensemaking needs. We then contribute 10 design considerations for geovisual analytic tools for urban accessibility communication, planning, policymaking, and advocacy.

Tisane: Authoring Statistical Models via Formal Reasoning from Conceptual and Data Relationships
Eunice Jun et al., University of Washington. CHI 2022.
Proper statistical modeling incorporates domain theory about how concepts relate and details of how data were measured. However, data analysts currently lack tool support for recording and reasoning about domain assumptions, data collection, and modeling choices in an integrated manner, leading to mistakes that can compromise scientific validity. For instance, generalized linear mixed-effects models (GLMMs) help answer complex research questions, but omitting random effects impairs the generalizability of results. To address this need, we present Tisane, a mixed-initiative system for authoring generalized linear models with and without mixed effects. Tisane introduces a study design specification language for expressing and asking questions about relationships between variables. Tisane contributes an interactive compilation process that represents relationships in a graph, infers candidate statistical models, and asks follow-up questions to disambiguate user queries to construct a valid model. In case studies with three researchers, we find that Tisane helps them focus on their goals and assumptions while avoiding past mistakes.

Idyll Studio: A Structured Editor for Authoring Interactive & Data-Driven Articles
Matthew Conlen et al. UIST 2021.
Interactive articles are an effective medium of communication in education, journalism, and scientific publishing, yet are created using complex general-purpose programming tools. We present Idyll Studio, a structured editor for authoring and publishing interactive and data-driven articles. We extend the Idyll framework to support reflective documents, which can inspect and modify their underlying program at runtime, and show how this functionality can be used to reify the constituent parts of a reactive document model---components, text, state, and styles---in an expressive, interoperable, and easy-to-learn graphical interface. In a study with 18 diverse participants, all could perform basic editing and composition, use datasets and variables, and specify relationships between components. Most could choreograph interactive visualizations and dynamic text, although some struggled with advanced uses requiring unstructured code editing. Our findings suggest Idyll Studio lowers the threshold for non-experts to create interactive articles and allows experts to rapidly specify a wide range of article designs.

Urban Accessibility as a Socio-Political Problem: A Multi-Stakeholder Analysis
Manaswi Saha et al. CSCW 2020.
Traditionally, urban accessibility is defined as the ease of reaching destinations. Studies on urban accessibility for pedestrians with mobility disabilities (e.g., wheelchair users) have primarily focused on understanding the challenges that the built environment imposes and how they overcome them. In this paper, we move beyond physical barriers and focus on socio-political challenges in the civic ecosystem that impede accessible infrastructure development. Using a multi-stakeholder approach, we interviewed five primary stakeholder groups (N=25): (1) people with mobility disabilities, (2) caregivers, (3) accessibility advocates, (4) department officials, and (5) policymakers. We discussed their current accessibility assessment and decision-making practices. We identified the key needs and desires of each group, how they differed, and how they interacted with each other in the civic ecosystem to bring about change. We found that people, politics, and money were intrinsically tied to underfunded accessibility improvement projects—without continued support from the public and the political leadership, existing funding may also disappear. Using the insights from these interviews, we explore how technology may enhance our stakeholders’ decision-making processes and facilitate accessible infrastructure development.

Paths Explored, Paths Omitted, Paths Obscured: Decision Points & Selective Reporting in End-to-End Data Analysis
Yang Liu et al., University of Washington. CHI 2020.
Drawing reliable inferences from data involves many, sometimes arbitrary, decisions across phases of data collection, wrangling, and modeling. As different choices can lead to diverging conclusions, understanding how researchers make analytic decisions is important for supporting robust and replicable analysis. In this study, we pore over nine published research studies and conduct semi-structured interviews with their authors. We observe that researchers often base their decisions on methodological or theoretical concerns, but subject to constraints arising from the data, expertise, or perceived interpretability. We confirm that researchers may experiment with choices in search of desirable results, but also identify other reasons why researchers explore alternatives yet omit findings. In concert with our interviews, we also contribute visualizations for communicating decision processes throughout an analysis. Based on our results, we identify design opportunities for strengthening end-to-end analysis, for instance via tracking and meta-analysis of multiple decision paths.

Dziban: Balancing Agency & Automation in Visualization Design via Anchored Recommendations
Halden Lin et al., University of Washington. CHI 2020.
Visualization recommender systems attempt to automate design decisions spanning choices of selected data, transformations, and visual encodings. However, across invocations such recommenders may lack the context of prior results, producing unstable outputs that override earlier design choices. To better balance automated suggestions with user intent, we contribute Dziban, a visualization API that supports both ambiguous specification and a novel anchoring mechanism for conveying desired context. Dziban uses the Draco knowledge base to automatically complete partial specifications and suggest appropriate visualizations. In addition, it extends Draco with chart similarity logic, enabling recommendations that also remain perceptually similar to a provided "anchor" chart. Existing APIs for exploratory visualization, such as ggplot2 and Vega-Lite, require fully specified chart definitions. In contrast, Dziban provides a more concise and flexible authoring experience through automated design, while preserving predictability and control through anchored recommendations.

Falcon: Balancing Interactive Latency and Resolution Sensitivity for Scalable Linked Visualizations
Dominik Moritz et al., University of Washington. CHI 2019.
We contribute user-centered prefetching and indexing methods that provide low-latency interactions across linked visualizations, enabling cold-start exploration of billion-record datasets. We implement our methods in Falcon, a web-based system that makes principled trade-offs between latency and resolution to optimize brushing and view switching times. To optimize latency-sensitive brushing actions, Falcon reindexes data upon changes to the active view a user is brushing in. To limit view switching times, Falcon initially loads reduced interactive resolutions, then progressively improves them. Benchmarks show that Falcon sustains real-time interactivity of 50fps for pixel-level brushing and linking across multiple visualizations with no costly precomputation. We show constant brushing performance regardless of data size on datasets ranging from millions of records in the browser to billions when connected to a backing database system.

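The core indexing idea — precompute counts over the active view's bins so that brush updates never rescan the raw records — can be illustrated with a simple prefix-sum index. This is a minimal sketch of the underlying principle, not Falcon's implementation; the function names are invented for this example.

```python
from itertools import accumulate

def build_index(values, n_bins, lo, hi):
    """Histogram the brushed dimension, then take prefix sums.
    Built once per active view; the O(n) scan happens here, not per brush."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        b = min(int((v - lo) / width), n_bins - 1)
        counts[b] += 1
    return [0] + list(accumulate(counts))  # prefix[i] = records in bins < i

def brush_count(prefix, bin_start, bin_end):
    """Records inside the brush [bin_start, bin_end): two array lookups."""
    return prefix[bin_end] - prefix[bin_start]

data = [0.1, 0.2, 0.25, 0.5, 0.9]
prefix = build_index(data, n_bins=10, lo=0.0, hi=1.0)
```

Because each brush move costs only two lookups per linked view, interaction latency stays flat as the dataset grows; Falcon additionally manages such indexes per linked view and rebuilds them when the user switches which view is being brushed.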
Augmenting Code with In Situ Visualizations to Aid Program Understanding
Jane Hoffswell et al., University of Washington. CHI 2018.
Programmers must draw explicit connections between their code and runtime state to properly assess the correctness of their programs. However, debugging tools often decouple the program state from the source code and require explicitly invoked views to bridge the rift between program editing and program understanding. To unobtrusively reveal runtime behavior during both normal execution and debugging, we contribute techniques for visualizing program variables directly within the source code. We describe a design space and placement criteria for embedded visualizations. We evaluate our in situ visualizations in an editor for the Vega visualization grammar. Compared to a baseline development environment, novice Vega users improve their overall task grade by about 2 points when using the in situ visualizations and exhibit significant positive effects on their self-reported speed and accuracy.

Idyll: A Markup Language for Authoring and Publishing Interactive Articles on the Web
Matthew Conlen et al. UIST 2018.
The web has matured as a publishing platform: news outlets regularly publish rich, interactive stories while technical writers use animation and interaction to communicate complex ideas. This style of interactive media has the potential to engage a large audience and more clearly explain concepts, but is expensive and time consuming to produce. Drawing on industry experience and interviews with domain experts, we contribute design tools to make it easier to author and publish interactive articles. We introduce Idyll, a novel "compile-to-the-web" language for web-based interactive narratives. Idyll implements a flexible article model, allowing authors control over document style and layout, reader-driven events (such as button clicks and scroll triggers), and a structured interface to JavaScript components. Through both examples and first-use results from undergraduate computer science students, we show how Idyll reduces the amount of effort and custom code required to create interactive articles.

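To give a flavor of the structured interface to components the abstract describes, here is a short article fragment in the style of Idyll's documented markup: variables declared with [var], components written in bracket syntax, and markdown-style prose in between. Component names and attributes follow Idyll's published examples, but the exact syntax may differ across Idyll versions.

```
[var name:"temperature" value:20 /]

# How Warm Is It?

Drag the slider to change the temperature:
[Range value:temperature min:0 max:40 /]

The current temperature is [Display value:temperature /] degrees.
```

When compiled, the [Range] slider and the [Display] readout are bound to the same reactive variable, so moving the slider updates the prose in place.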
Value-Suppressing Uncertainty Palettes
Michael Correll et al., Tableau Research. CHI 2018.
Understanding uncertainty is critical for many analytical tasks. One common approach is to encode data values and uncertainty values independently, using two visual variables. The resulting bivariate maps can be difficult to interpret, and interference between visual channels can reduce the discriminability of marks. To address this issue, we contribute Value-Suppressing Uncertainty Palettes (VSUPs). VSUPs allocate larger ranges of a visual channel to data when uncertainty is low, and smaller ranges when uncertainty is high. This non-uniform budgeting of the visual channels makes more economical use of the limited visual encoding space when uncertainty is low, and encourages more cautious decision-making when uncertainty is high. We demonstrate several examples of VSUPs, and present a crowdsourced evaluation showing that, compared to traditional bivariate maps, VSUPs encourage people to more heavily weight uncertainty information in decision-making tasks.

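The non-uniform budgeting can be sketched as a quantization scheme: as uncertainty rises, the value scale collapses into fewer bins, so uncertain marks can only take a few distinct colors. The following is an illustrative sketch of that idea, not the paper's implementation; the function name and binning scheme are invented for this example.

```python
def vsup_bin(value, uncertainty, levels=4):
    """Map (value, uncertainty), each in [0, 1), to a discrete palette cell.

    u_level = levels-1 means lowest uncertainty (most value bins);
    u_level = 0 means highest uncertainty (a single bin). Each step up
    in uncertainty halves the number of distinguishable value bins.
    """
    u_level = levels - 1 - min(int(uncertainty * levels), levels - 1)
    n_bins = 2 ** u_level                     # 1, 2, 4, ... value bins
    value_bin = min(int(value * n_bins), n_bins - 1)
    return u_level, value_bin

# The same value lands in progressively coarser bins as uncertainty grows:
low  = vsup_bin(0.7, 0.05)   # 8 value bins available
mid  = vsup_bin(0.7, 0.5)    # 2 value bins available
high = vsup_bin(0.7, 0.95)   # a single catch-all bin
```

Each (u_level, value_bin) pair would then index into a precomputed wedge-shaped palette; the suppression comes entirely from the shrinking bin count at high uncertainty.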
Somewhere Over the Rainbow: An Empirical Assessment of Quantitative Colormaps
Yang Liu et al., University of Washington. CHI 2018.
An essential goal of quantitative color encoding is the accurate mapping of perceptual dimensions of color to the logical structure of data. Prior research identifies weaknesses of "rainbow" colormaps and advocates for ramping in luminance, while recent work contributes multi-hue colormaps generated using perceptually-uniform color models. We contribute a comparative analysis of different colormap types, with a focus on comparing single- and multi-hue schemes. We present a suite of experiments in which subjects perform relative distance judgments among color triplets drawn systematically from each of four single-hue and five multi-hue colormaps. We characterize speed and accuracy across each colormap, and identify conditions that degrade performance. We also find that a combination of perceptual color space and color naming measures more accurately predict user performance than either alone, though the overall accuracy is poor. Based on these results, we distill recommendations on how to design more effective color encodings for scalar data.