Answering Developer Questions with Annotated Agent-Discovered Program Traces
Litao Yan et al. · UIST 2025

Developers often find themselves asking questions that cut across a code base. Answering these questions requires gathering relevant facts and tracing flow through the program. Yet today's tools offer limited support for this: developers can either use imprecise AI tools that ignore flow, or flow-tracing tools that impose a great number of choices. In this paper, we introduce a new kind of tool that answers questions better by bringing together elements of both AI and flow. We instantiate this idea in Trailblazer, a system underpinned by an AI agent that simulates an information forager, iteratively tracing program dependencies in search of answers. Trailblazer then packages the information it finds into an answer digest, which includes interactive, annotated traces of its exploration. These traces can be stepped through to help developers orient to the code and find where the answer is distributed within it. In a lab study, Trailblazer helped participants answer questions more efficiently and gain greater familiarity with program flow than an AI question-answering baseline. This shows how AI agents can leverage program flow to bring additional structure and clarity to their answers.

Topics: Identity & Avatars in XR · Human-LLM Collaboration · Computational Methods in HCI

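The forager loop described here can be pictured as a guided walk over a dependency graph. Below is a minimal sketch, assuming a toy call graph and a relevance filter in place of the LLM judgments the real agent would make; every name in it is hypothetical.

    # Forager-style trace: starting from a seed symbol, repeatedly follow
    # program dependencies that look relevant to the question. The call
    # graph and relevance test stand in for a code index and LLM judgments.

    CALL_GRAPH = {  # hypothetical: symbol -> symbols it depends on
        "handle_request": ["parse_input", "route"],
        "route": ["render_page", "log_event"],
        "render_page": ["load_template"],
    }

    def trace(seed: str, is_relevant) -> list[str]:
        """Breadth-first walk over dependencies, recording the visited trail."""
        trail, frontier, seen = [], [seed], {seed}
        while frontier:
            symbol = frontier.pop(0)
            trail.append(symbol)
            for dep in CALL_GRAPH.get(symbol, []):
                if dep not in seen and is_relevant(dep):
                    seen.add(dep)
                    frontier.append(dep)
        return trail

    # e.g. "where does a request end up on screen?"
    print(trace("handle_request", is_relevant=lambda s: "log" not in s))
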
Simulating Cooperative Prosocial Behavior with Multi-Agent LLMs: Evidence and Mechanisms for AI Agents to Inform Policy Decisions
Karthik Sreedhar et al. · IUI 2025

Human prosocial cooperation is essential for our collective health, education, and welfare. However, designing social systems to maintain or incentivize prosocial behavior is challenging because people can act selfishly to maximize personal gain. This complex and unpredictable aspect of human behavior makes it difficult for policymakers to foresee the implications of their designs. Recently, multi-agent LLM systems have shown remarkable capabilities in simulating human-like behavior and replicating some human lab experiments. This paper studies how well multi-agent systems can simulate prosocial human behavior, such as that seen in the public goods game (PGG), and whether multi-agent systems can exhibit "unbounded actions" seen outside the lab in real-world scenarios. We find that multi-agent LLM systems successfully replicate human behavior from lab experiments of the public goods game with three experimental treatments: priming, transparency, and varying endowments. Beyond replicating existing experiments, we find that multi-agent LLM systems can replicate the expected human behavior when combining experimental treatments, even if no previous study combined those specific treatments. Lastly, we find that multi-agent systems can exhibit a rich set of unbounded actions that people take in the real world outside the lab, such as collaborating and even cheating. In sum, these studies are steps towards a future where LLMs can be used to inform policy decisions that encourage people to act in a prosocial manner.

Topics: Human-LLM Collaboration · AI-Assisted Decision-Making & Automation · Algorithmic Fairness & Bias

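As a point of reference, the public goods game itself has simple mechanics. The sketch below shows one round with a heuristic stub standing in for the LLM call that would decide each agent's contribution; the multiplier value and the prompt-free decision rule are assumptions, not the paper's setup.

    # One round of a public goods game with a pluggable agent policy.
    # `decide_contribution` is a stub for the LLM call the paper's agents
    # would make; the 1.6 multiplier is a common lab value, assumed here.

    MULTIPLIER = 1.6

    def decide_contribution(agent_id: int, endowment: float, history: list) -> float:
        # A real agent would prompt an LLM with the game state (endowment,
        # past rounds, any priming text) and parse a number from the reply.
        return endowment * 0.5  # placeholder heuristic

    def play_round(endowments: list[float], history: list) -> list[float]:
        contributions = [
            min(decide_contribution(i, e, history), e)  # can't give more than you hold
            for i, e in enumerate(endowments)
        ]
        share = sum(contributions) * MULTIPLIER / len(endowments)
        # Standard PGG payoff: keep what you didn't contribute, plus an equal share.
        payoffs = [e - c + share for e, c in zip(endowments, contributions)]
        history.append({"contributions": contributions, "payoffs": payoffs})
        return payoffs

    history: list = []
    print(play_round([20.0, 20.0, 10.0], history))  # the varying-endowments treatment
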
Not Just Novelty: A Longitudinal Study on Utility and Customization of an AI Workflow
Tao Long et al. · DIS 2024

Generative AI brings novel and impressive abilities to help people in everyday tasks. There are many AI workflows that solve real and complex problems by chaining AI outputs together with human interaction. Although there is an undeniable lure of AI, it is uncertain how useful generative AI workflows are after the novelty wears off. Additionally, workflows built with generative AI have the potential to be easily customized to fit users' individual needs, but do users take advantage of this? We conducted a three-week longitudinal study with 12 users to understand the familiarization and customization of generative AI tools for science communication. Our study revealed that there exists a familiarization phase, during which users were exploring the novel capabilities of the workflow and discovering which aspects they found useful. After this phase, users understood the workflow and were able to anticipate the outputs. Surprisingly, after familiarization the perceived utility of the system was rated higher than before, indicating that the perceived utility of AI is not just a novelty effect. The increase in benefits mainly comes from end-users' ability to customize prompts, and thus potentially appropriate the system to their own needs. This points to a future where generative AI systems can allow us to design for appropriation.

Topics: Generative AI (Text, Image, Music, Video) · Human-LLM Collaboration · Sustainable HCI

PodReels: Human-AI Co-Creation of Video Podcast Teasers
Sitong Wang et al. · DIS 2024

Video podcast teasers are short videos that can be shared on social media platforms to capture interest in full episodes of a video podcast. These teasers enable long-form podcasters to reach new audiences and gain more followers. However, creating a compelling teaser from an hour-long episode can be challenging. Selecting interesting clips requires significant mental effort; editing the chosen clips into a cohesive, well-produced teaser is time-consuming. To support the creation of video podcast teasers, we first investigated what makes a good teaser. We combined insights from audience comments and creator interviews to identify key ingredients. We also identified a common workflow used by creators during this process. Based on these findings, we developed a human-AI co-creative tool called PodReels to assist video podcasters in crafting teasers. Our user study demonstrated that PodReels significantly reduces creators' mental demand and improves their efficiency in producing video podcast teasers.

Topics: Video Production & Editing · Creative Collaboration & Feedback Systems

ReelFramer: Human-AI Co-Creation for News-to-Video Translation
Sitong Wang et al. (Columbia University) · CHI 2024

Short videos on social media are the dominant way young people consume content. News outlets aim to reach audiences through news reels (short videos conveying news) but struggle to translate traditional journalistic formats into short, entertaining videos. To translate news into social media reels, we support journalists in reframing the narrative. In literature, narrative framing is a high-level structure that shapes the overall presentation of a story. We identified three narrative framings for reels that adapt social media norms but preserve news value, each with a different balance of information and entertainment. We introduce ReelFramer, a human-AI co-creative system that helps journalists translate print articles into scripts and storyboards. ReelFramer supports exploring multiple narrative framings to find one appropriate to the story. AI suggests foundational narrative details, including characters, plot, setting, and key information. ReelFramer also supports visual framing; AI suggests character and visual detail designs before generating a full storyboard. Our studies show that narrative framing introduces the necessary diversity to translate various articles into reels, and establishing foundational details helps generate scripts that are more relevant and coherent. We also discuss the benefits of using narrative framing and foundational details in content retargeting.

Topics: AI-Assisted Creative Writing · Video Production & Editing

Writing out the Storm: Designing and Evaluating Tools for Weather Risk Messaging
Sophia S Jit et al. (University of Toronto) · CHI 2024

Communicating risk to the public in the lead-up to and during severe weather events has the potential to reduce the impacts of these events on lives and property. Globally, these events are anticipated to increase due to climate change, rendering effective risk communication an integral component of climate adaptation policies. Research in the risk communications literature has developed substantial knowledge and best practices for the design of risk messaging. This study considers the potential for quantifying the compliance of severe weather risk messages with these best practices, individually and at scale, and developing tools to improve risk communication messaging. The current work makes two contributions. First, we develop a string-matching approach to evaluate whether messaging complies with best practices and suggest areas for improvement. Second, we conduct an interview study with risk communication professionals to inform the design space of authoring tools and other technologies to support severe weather risk communicators.

Topics: Context-Aware Computing · Climate Change Communication Tools

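A string-matching compliance check of this kind can be sketched in a few lines. The rules and patterns below are invented for illustration; the actual best practices come from the risk-communication literature and are not reproduced here.

    import re

    # Hypothetical best-practice checks; the real rules and their phrasing
    # are drawn from the literature, not this hand-written dictionary.
    BEST_PRACTICES = {
        "names the hazard": re.compile(r"\b(tornado|flood|blizzard|hurricane|storm)\b", re.I),
        "states a time window": re.compile(r"\b(until|through|by)\s+\d{1,2}(:\d{2})?\s*(am|pm)\b", re.I),
        "gives a protective action": re.compile(r"\b(shelter|evacuate|avoid|stay indoors)\b", re.I),
    }

    def audit_message(message: str) -> dict[str, bool]:
        """Return which best practices a weather risk message satisfies."""
        return {rule: bool(pattern.search(message)) for rule, pattern in BEST_PRACTICES.items()}

    report = audit_message("Tornado warning: take shelter now and stay indoors until 9 PM.")
    for rule, ok in report.items():
        print(("PASS" if ok else "MISS"), rule)
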
PopBlends: Strategies for Conceptual Blending with Large Language Models
Sitong Wang et al. (Columbia University) · CHI 2023

Pop culture is an important aspect of communication. On social media, people often post pop culture reference images that connect an event, product, or other entity to a pop culture domain. Creating these images is a creative challenge that requires finding a conceptual connection between the user's topic and a pop culture domain. In cognitive theory, this task is called conceptual blending. We present a system called PopBlends that automatically suggests conceptual blends. The system explores three approaches that involve both traditional knowledge extraction methods and large language models. Our annotation study shows that all three methods provide connections with similar accuracy, but with very different characteristics. Our user study shows that with the system people found twice as many blend suggestions as without it, and with half the mental demand. We discuss the advantages of combining large language models with knowledge bases for supporting divergent and convergent thinking.

Topics: Generative AI (Text, Image, Music, Video) · Human-LLM Collaboration

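The knowledge-base side of this approach can be pictured as intersecting association sets. A toy sketch follows, with hand-written association lists standing in for the knowledge extraction and LLM prompting PopBlends actually uses.

    # Shared associations between a topic and a pop culture domain are
    # candidate conceptual connections. The lists here are placeholders.

    ASSOCIATIONS = {
        "coffee": {"morning", "energy", "dark", "beans", "awakening"},
        "star wars": {"force", "dark", "lightsaber", "awakening", "space"},
    }

    def suggest_blends(topic: str, domain: str) -> set[str]:
        """Set intersection as a stand-in for finding conceptual connections."""
        return ASSOCIATIONS[topic] & ASSOCIATIONS[domain]

    print(suggest_blends("coffee", "star wars"))  # e.g. {'dark', 'awakening'}
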
OPAL: Multimodal Image Generation for News Illustrations
Vivian Liu et al. · UIST 2022

Advances in multimodal AI have presented people with powerful ways to create images from text. Recent work has shown that text-to-image generations are able to represent a broad range of subjects and artistic styles. However, finding the right visual language for text prompts is difficult. In this paper, we address this challenge with Opal, a system that produces text-to-image generations for news illustration. Given an article, Opal guides users through a structured search for visual concepts and provides a pipeline allowing users to generate illustrations based on an article's tone, keywords, and related artistic styles. Our evaluation shows that Opal efficiently generates diverse sets of news illustrations, visual assets, and concept ideas. Users with Opal generated two times more usable results than users without. We discuss how structured exploration can help users better understand the capabilities of human-AI co-creative systems.

Topics: Generative AI (Text, Image, Music, Video) · AI-Assisted Creative Writing

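One way to picture the final step of such a pipeline is simple prompt assembly from the article-derived ingredients. The template and field names below are assumptions for illustration, not Opal's actual prompt format.

    # Combine article keywords, tone, and artistic styles into candidate
    # text-to-image prompts: one prompt per (keyword, style) pairing.

    def build_illustration_prompts(keywords: list[str], tone: str, styles: list[str]) -> list[str]:
        return [
            f"An illustration of {kw}, {tone} mood, in the style of {style}"
            for kw in keywords
            for style in styles
        ]

    prompts = build_illustration_prompts(
        keywords=["a crowded subway platform", "commuters"],
        tone="somber",
        styles=["editorial line art", "muted watercolor"],
    )
    for p in prompts:
        print(p)
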
Improving Subject Representation in AI Generated Art: Design Guidelines for Using Image Prompts with Text-to-Image Generative Models
Han Qiao et al. · C&C 2022

Advances in text-to-image generative models have made it easier for people to create art by just prompting models with text. However, creating through text leaves users with limited control over the final composition or the way the subject is represented. A potential solution is to use image prompts alongside text prompts to condition the model. To better understand how and when image prompts can improve subject representation in generations, we conduct an annotation experiment to quantify their effect on generations of abstract, concrete plural, and concrete singular subjects. We find that initial images improved subject representation across all subject types, with the most noticeable improvement in concrete singular subjects. In an analysis of different types of initial images, we find that icons and photos produced high quality generations of different aesthetics. We conclude with design guidelines for how initial images can improve subject representation in AI art.

Topics: Generative AI (Text, Image, Music, Video) · AI-Assisted Creative Writing · Graphic Design & Typography Tools

SymbolFinder: Brainstorming Diverse Symbols Using Local Semantic Networks
Savvas Petridis et al. · UIST 2021

Visual symbols are the building blocks for visual communication. They convey abstract concepts like reform and participation quickly and effectively. When creating graphics with symbols, novice designers often struggle to brainstorm multiple, diverse symbols because they fixate on a few associations instead of broadly exploring different aspects of the concept. We present SymbolFinder, an interactive tool for finding visual symbols for abstract concepts. SymbolFinder molds symbol-finding into a recognition rather than recall task by introducing the user to diverse clusters of words associated with the concept. Users can dive into these clusters to find related, concrete objects that symbolize the concept. We evaluate SymbolFinder with two studies: a comparative user study demonstrating that SymbolFinder helps novices find more unique symbols for abstract concepts with significantly less effort than a popular image database, and a case study demonstrating how SymbolFinder helped design students create visual metaphors for three cover illustrations of news articles.

Topics: Data Storytelling · Graphic Design & Typography Tools

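The clustering step can be sketched with an off-the-shelf community detection routine over a small word-association graph. The edges below are invented; SymbolFinder builds its networks from real word-association data.

    import networkx as nx
    from networkx.algorithms.community import greedy_modularity_communities

    # Toy local semantic network around the abstract concept "reform".
    G = nx.Graph()
    G.add_edges_from([
        ("reform", "law"), ("law", "gavel"), ("law", "scales"),
        ("reform", "change"), ("change", "butterfly"), ("change", "arrow"),
        ("reform", "repair"), ("repair", "wrench"), ("repair", "hammer"),
    ])

    # Each community is one "aspect" of the concept the user can dive into
    # to find concrete, symbolizable objects.
    for i, community in enumerate(greedy_modularity_communities(G)):
        print(f"cluster {i}: {sorted(community)}")
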
VisiFit: Structuring Iterative Improvement for Novice Designers
Lydia B. Chilton et al. (Columbia University) · CHI 2021

Visual blends are an advanced graphic design technique to seamlessly integrate two objects into one. Existing tools help novices create prototypes of blends, but it is unclear how they would improve them to be higher fidelity. To help novices, we aim to add structure to the iterative improvement process. We introduce a method for improving prototypes that uses secondary design dimensions to explore a structured design space. This method is grounded in the cognitive principles of human visual object recognition. We present VisiFit, a computational design system that uses this method to enable novice graphic designers to improve blends with computationally generated options they can select, adjust, and chain together. Our evaluation shows novices can substantially improve 76% of blends in under 4 minutes. We discuss how the method can be generalized to other blending problems, and how computational tools can support novices by enabling them to explore a structured design space quickly and efficiently.

Topics: Graphic Design & Typography Tools · Creative Collaboration & Feedback Systems

Cicero: Multi-Turn, Contextual Argumentation for Accurate Crowdsourcing
Quanze Chen et al. (University of Washington) · CHI 2019

Traditional approaches for ensuring high quality crowdwork have failed to achieve high accuracy on difficult problems. Aggregating redundant answers often fails on the hardest problems, when the majority is confused. Argumentation has been shown to be effective in mitigating these drawbacks. However, existing argumentation systems only support limited interactions and show workers general justifications, not context-specific arguments targeted to their reasoning. This paper presents Cicero, a new workflow that improves crowd accuracy on difficult tasks by engaging workers in multi-turn, contextual discussions through real-time, synchronous argumentation. Our experiments show that compared to previous argumentation systems, which only improve average individual worker accuracy by 6.8 percentage points on the Relation Extraction domain, our workflow achieves a 16.7 percentage point improvement. Furthermore, previous argumentation approaches don't apply to tasks with many possible answers; in contrast, Cicero works well in these cases, raising accuracy from 66.7% to 98.8% on the Codenames domain.

Topics: Human-LLM Collaboration · Crowdsourcing Task Design & Quality Control

VisiBlends: A Flexible Workflow for Visual Blends
Lydia B. Chilton et al. (Columbia University) · CHI 2019

Visual blends are an advanced graphic design technique to draw attention to a message. They combine two objects in a way that is novel and useful in conveying a message symbolically. This paper presents VisiBlends, a flexible workflow for creating visual blends that follows the iterative design process. We introduce a design pattern for blending symbols based on principles of human visual object recognition. Our workflow decomposes the process into both computational techniques and human microtasks. It allows users to collaboratively generate visual blends with steps involving brainstorming, synthesis, and iteration. An evaluation of the workflow shows that decentralized groups can generate blends in independent microtasks, co-located groups can collaboratively make visual blends for their own messages, and VisiBlends improves novices' ability to make visual blends.

Topics: Graphic Design & Typography Tools · Creative Collaboration & Feedback Systems