Prototyping with Prompts: Emerging Approaches and Challenges in Generative AI Design for Collaborative Software Teams
Generative AI models are increasingly being integrated into human task workflows, enabling the production of expressive content across a wide range of contexts. Unlike traditional human-AI design methods, the new approach to designing generative capabilities focuses heavily on prompt engineering strategies. This shift requires a deeper understanding of how collaborative software teams establish and apply design guidelines, iteratively prototype prompts, and evaluate them to achieve specific outcomes. To explore these dynamics, we conducted design studies with 39 industry professionals, including UX designers, AI engineers, and product managers. Our findings highlight emerging practices and role shifts in AI system prototyping among multistakeholder teams. We observe various prompting and prototyping strategies, highlighting the pivotal role that the characteristics of the to-be-generated content play in enabling rapid, iterative prototyping with generative AI. By identifying associated challenges, such as limited model interpretability and overfitting designs to specific example content, we outline considerations for generative AI prototyping.
Hariharan Subramonyam et al. (Stanford University). CHI 2025. Tags: Generative AI (Text, Image, Music, Video); Human-LLM Collaboration; Prototyping & User Testing.

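The prompt prototyping workflow this abstract describes (draft a prompt template, generate outputs over example content, judge them against design guidelines, revise) can be pictured as a small harness. Below is a minimal sketch under assumed names: call_model stands in for any LLM API, and the length-only evaluate check stands in for the richer content-characteristic judgments the teams in the study make; none of this is the authors' tooling.

```python
from dataclasses import dataclass, field

@dataclass
class PromptVersion:
    template: str                       # prompt with a {content} placeholder
    outputs: list = field(default_factory=list)

def call_model(prompt: str) -> str:
    """Placeholder for any LLM completion API."""
    raise NotImplementedError

def evaluate(output: str, guidelines: dict) -> bool:
    # Stand-in for judging outputs against content characteristics
    # (tone, structure, length); only length is checked here.
    return guidelines["min_words"] <= len(output.split()) <= guidelines["max_words"]

def prototype_round(template: str, examples: list, guidelines: dict) -> PromptVersion:
    """One iteration: generate over example content, record pass/fail."""
    version = PromptVersion(template)
    for content in examples:
        out = call_model(template.format(content=content))
        version.outputs.append((content, out, evaluate(out, guidelines)))
    return version  # inspect failures, revise the template, run again
```
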
PlanTogether: Facilitating AI Application Planning Using Information Graphs and Large Language Models
In client-AI expert collaborations, the planning stage of AI application development begins with the client: the client outlines their needs and expectations while assessing available resources (pre-collaboration planning). Although pre-collaboration plans anchor the discussions with AI experts that drive iteration and development, clients often fail to translate their needs and expectations into a concrete, actionable plan. To facilitate pre-collaboration planning, we introduce PlanTogether, a system that generates tailored client support using large language models and a Planning Information Graph, whose nodes represent pieces of information in the plan and whose edges represent dependencies between them. Using the graph, the system links and presents information that guides the client's reasoning; it provides tips and suggestions based on relevant information and displays an overview to help clients understand their progression through the plan. A user study validates the effectiveness of PlanTogether in helping clients navigate information dependencies and write actionable plans that reflect their domain expertise.
Dae Hyun Kim et al. (Yonsei University, Department of Computer Science and Engineering; KAIST, Information & Electronics Research Institute). CHI 2025. Tags: Human-LLM Collaboration; Data Storytelling.

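The Planning Information Graph described above, with nodes for pieces of plan information and edges for dependencies, maps directly onto a directed graph. The following minimal sketch uses networkx; the node names, the text attribute, and the two helper functions are assumptions for exposition, not PlanTogether's implementation.

```python
import networkx as nx

# Nodes hold pieces of the plan; edges point from prerequisite to dependent.
graph = nx.DiGraph()
graph.add_node("problem", text="")         # what the client wants solved
graph.add_node("data", text="")            # what data the client can provide
graph.add_node("success_metric", text="")  # how success will be judged
graph.add_edge("problem", "data")
graph.add_edge("problem", "success_metric")

def ready_nodes(g: nx.DiGraph) -> list:
    """Unfilled nodes whose prerequisites are complete: what to guide next."""
    return [n for n in g.nodes
            if not g.nodes[n]["text"]
            and all(g.nodes[p]["text"] for p in g.predecessors(n))]

def tip_context(g: nx.DiGraph, node: str) -> dict:
    """Upstream answers an LLM could use to ground a tip for `node`."""
    return {p: g.nodes[p]["text"] for p in g.predecessors(node)}
```
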
Script&Shift: A Layered Interface Paradigm for Integrating Content Development and Rhetorical Strategy with LLM Writing Assistants
Good writing is a dynamic process of knowledge transformation, where writers refine and evolve ideas through planning, translating, and reviewing. Generative AI-powered writing tools can enhance this process but may also disrupt the natural flow of writing, such as when using LLMs for complex tasks like restructuring content across different sections or creating smooth transitions. We introduce Script&Shift, a layered interface paradigm designed to minimize these disruptions by aligning writing intents with LLM capabilities to support diverse content development and rhetorical strategies. By bridging envisioning, semantic, and articulatory distances, Script&Shift interactions allow writers to leverage LLMs for various content development tasks (scripting) and experiment with diverse organization strategies while tailoring their writing for different audiences (shifting). This approach preserves creative control while encouraging divergent and iterative writing. Our evaluation shows that Script&Shift enables writers to creatively and efficiently incorporate LLMs while preserving a natural flow of composition.
Momin Naushad Siddiqui et al. (Georgia Institute of Technology). CHI 2025. Tags: Generative AI (Text, Image, Music, Video); Human-LLM Collaboration; AI-Assisted Decision-Making & Automation.

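The scripting/shifting split can be pictured as operations over named content layers, where a "shift" asks an LLM to reorganize the layers under a rhetorical strategy for a target audience. This is a speculative sketch, not Script&Shift's code: the llm placeholder, the prompt wording, and the layer format are all assumptions.

```python
def llm(prompt: str) -> str:
    """Placeholder for any chat-completion API."""
    raise NotImplementedError

def shift(layers: dict, strategy: str, audience: str) -> str:
    """Reorganize named content layers under a rhetorical strategy."""
    body = "\n\n".join(f"[{name}]\n{text}" for name, text in layers.items())
    prompt = (
        f"Reorganize the following labeled content layers using a "
        f"'{strategy}' structure for an audience of {audience}. "
        f"Preserve the writer's wording wherever possible.\n\n{body}"
    )
    return llm(prompt)

# e.g. shift({"background": "...", "findings": "..."},
#            strategy="problem-solution", audience="policy makers")
```
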
Promoting Comprehension and Engagement in Introductory Data and Statistics for Blind and Low-Vision Students: A Co-Design Study
Statistical literacy involves understanding, interpreting, and critically evaluating statistical information in a contextually grounded way. Current instructional practices rely heavily on visual techniques, which render them inaccessible to students who are blind or have low vision (BLV). To bridge this gap, we formed an extended co-design partnership with a statistics teacher, a teacher for students with visual impairments (TVI), and two BLV students to develop accessibility-first practices for building statistical literacy. Through several months of collaboration that included discussion, exploration, design, and evaluation, we identified specific approaches to promote comprehension and engagement. The enactive approaches we designed, using scaffolding and timely feedback, fostered insights through pattern recognition and analogical reasoning. Additionally, inquiry-based methods promoted contextually situated reasoning and reflection on how statistics can improve students' lives and communities. We present these findings alongside participants' experiences and discuss their implications for inclusive learning frameworks and tools.
Danyang Fan et al. (Stanford University, Mechanical Engineering). CHI 2025. Tags: Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille); Special Education Technology.

Is a Seat at the Table Enough? Engaging Teachers and Students in Dataset Specification for ML in Education
Despite the promises of ML in education, its adoption in the classroom has surfaced numerous issues regarding fairness, accountability, and transparency, as well as concerns about data privacy and student consent. A root cause of these issues is the lack of understanding of the complex dynamics of education, including teacher-student interactions, collaborative learning, and classroom environment. To overcome these challenges and fully utilize the potential of ML in education, software practitioners need to work closely with educators and students to fully understand the context of the data (the backbone of ML applications) and collaboratively define the ML data specifications. To gain a deeper understanding of such a collaborative process, we conduct ten co-design sessions with ML software practitioners, educators, and students. In the sessions, teachers and students work with ML engineers, UX designers, and legal practitioners to define dataset characteristics for a given ML application. We find that stakeholders contextualize data based on their domain and procedural knowledge, proactively design data requirements to mitigate downstream harms and data reliability concerns, and exhibit role-based collaborative strategies and contribution patterns. Further, we find that beyond a seat at the table, meaningful stakeholder participation in ML requires structured supports: defined processes for continuous iteration and co-evaluation, shared contextual data quality standards, and information scaffolds for both technical and non-technical stakeholders to traverse expertise boundaries.
Mei Tan et al. CSCW 2024 (Session 2d: Interaction, Engagement, and Support in Educational Environments).

AINeedsPlanner: A Workbook to Support Effective Collaboration Between AI Experts and Clients
Clients often partner with AI experts to develop AI applications tailored to their needs. In these partnerships, careful planning and clear communication are critical, as inaccurate or incomplete specifications can result in misaligned model characteristics, expensive reworks, and potential friction between collaborators. Unfortunately, given the complexity of requirements, which range across functionality, data, and governance, effective guidelines for the collaborative specification of requirements in client-AI expert collaborations are missing. In this work, we introduce AINeedsPlanner, a workbook that AI experts and clients can use to facilitate the effective interchange of clear specifications. The workbook is based on (1) interviews with 10 teams that completed AI application projects, which identify and characterize the steps in AI application planning, and (2) a study with 12 AI experts, which defines a taxonomy of AI experts' information needs and the dimensions that affect those needs. Finally, we demonstrate the workbook's utility with two case studies in real-world settings.
Dae Hyun Kim et al. DIS 2024. Tags: Human-LLM Collaboration; Participatory Design.

Design Space of Visual Feedforward and Corrective Feedback in XR-Based Motion Guidance Systems
Extended reality (XR) technologies are well suited to assisting individuals in learning motor skills and movements, a task referred to as motion guidance. In motion guidance, "feedforward" provides instructional cues for the motions to be performed, whereas "feedback" provides cues that help correct mistakes and minimize errors. Designing synergistic feedforward and feedback is vital to providing an effective learning experience, but the interplay between the two has not yet been adequately explored. Based on a survey of the literature, we propose a design space for both motion feedforward and corrective feedback in XR, and describe the interaction effects between them. We identify common design approaches of XR-based motion guidance found in our literature corpus, and discuss them through the lens of our design dimensions. We then discuss additional contextual factors and considerations that influence this design, together with future research opportunities for motion guidance in XR.
Xingyao Yu et al. (University of Stuttgart). CHI 2024. Tags: Full-Body Interaction & Embodied Input; VR Medical Training & Rehabilitation; Interactive Narrative & Immersive Storytelling.

Bridging the Gulf of Envisioning: Cognitive Challenges in Prompt-Based Interactions with LLMs
Large language models (LLMs) exhibit dynamic capabilities and appear to comprehend complex and ambiguous natural language prompts. However, calibrating LLM interactions is challenging for interface designers and end-users alike. A central issue is our limited grasp of how human cognitive processes begin with a goal and form intentions for executing actions, a blindspot even in established interaction models such as Norman's gulfs of execution and evaluation. To address this gap, we theorize how end-users "envision" translating their goals into clear intentions and craft prompts to obtain the desired LLM response. We define a process of Envisioning by highlighting three misalignments: not knowing (1) what the task should be, (2) how to instruct the LLM to do the task, and (3) what to expect from the LLM's output in meeting the goal. Finally, we make recommendations to narrow the envisioning gulf in human-LLM interactions.
Hariharan Subramonyam et al. (Stanford University). CHI 2024. Tags: Human-LLM Collaboration; User Research Methods (Interviews, Surveys, Observation).

More than Model Documentation: Uncovering Teachers' Bespoke Information Needs for Informed Classroom Integration of ChatGPT
ChatGPT has entered classrooms, circumventing typical training and vetting procedures. Unlike other educational technologies, it has placed teachers in direct contact with the versatility of generative AI. Consequently, teachers are urgently tasked with assessing its capabilities to inform their use of ChatGPT. However, it is unclear what support teachers have and need, and whether existing documentation, such as model cards, provides adequate direction for educators in this new paradigm. By interviewing 22 middle- and high-school ELA and Social Studies teachers, we connect the discourse on AI transparency and documentation with educational technology integration, highlighting the information needs of teachers. Our findings reveal that teachers confront significant information gaps: they lack clarity on how to explore ChatGPT's capabilities for bespoke learning tasks and how to ensure its fit with the needs of diverse learners. As a solution, we propose a framework for interactive model documentation that empowers teachers to navigate the interplay between pedagogical and technical knowledge.
Mei Tan et al. (Stanford University). CHI 2024. Tags: Generative AI (Text, Image, Music, Video); Human-LLM Collaboration; K-12 Digital Education Tools.

Why and When LLM-Based Assistants Can Go Wrong: Investigating the Effectiveness of Prompt-Based Interactions for Software Help-Seeking
Large Language Model (LLM) assistants, such as ChatGPT, have emerged as potential alternatives to search methods for helping users navigate complex, feature-rich software. LLMs use vast training data from domain-specific texts, software manuals, and code repositories to mimic human-like interactions, offering tailored assistance, including step-by-step instructions. In this work, we investigated LLM-generated software guidance through a within-subject experiment with 16 participants and follow-up interviews. We compared a baseline LLM assistant with an LLM optimized for particular software contexts, SoftAIBot, which also offered guidelines for constructing appropriate prompts. We assessed task completion, perceived accuracy, relevance, and trust. Surprisingly, although SoftAIBot outperformed the baseline LLM, our results revealed no significant difference in LLM usage and user perceptions with or without prompt guidelines and the integration of domain context. Most users struggled to understand how the prompt's text related to the LLM's responses and often followed the LLM's suggestions verbatim, even if they were incorrect. This resulted in difficulties when using the LLM's advice for software tasks, leading to low task completion rates. Our detailed analysis also revealed that users remained unaware of inaccuracies in the LLM's responses, indicating a gap between their lack of software expertise and their ability to evaluate the LLM's assistance. With the growing push for designing domain-specific LLM assistants, we emphasize the importance of incorporating explainable, context-aware cues into LLMs to help users understand prompt-based interactions, identify biases, and maximize the utility of LLM assistants.
Anjali Khurana et al. IUI 2024. Tags: Human-LLM Collaboration; Explainable AI (XAI).

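The contrast the study draws, a baseline LLM versus SoftAIBot with domain context and prompt guidance, amounts to wrapping the user's question before the model call. The sketch below is an assumed reconstruction for illustration only; retrieve_docs, the prompt wording, and llm are placeholders, not SoftAIBot's published implementation.

```python
def llm(prompt: str) -> str:
    """Placeholder for any chat-completion API."""
    raise NotImplementedError

def retrieve_docs(question: str, manual_index) -> list:
    """Placeholder for retrieving relevant software-manual passages."""
    raise NotImplementedError

def ask(question: str, software: str, manual_index) -> str:
    """Context-optimized variant: wrap the question with docs and guidance."""
    snippets = "\n".join(retrieve_docs(question, manual_index))
    prompt = (
        f"You are a help assistant for {software}. Answer using only the "
        f"documentation below, give numbered step-by-step instructions, and "
        f"say explicitly if the documentation does not cover the task.\n\n"
        f"Documentation:\n{snippets}\n\nQuestion: {question}"
    )
    return llm(prompt)  # the baseline condition would send the bare question
```
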
Spellburst: A Node-Based Interface for Exploratory Creative Coding with Natural Language Prompts
Creative coding tasks are often exploratory in nature. When producing digital artwork, artists usually begin with a high-level semantic construct such as a "stained glass filter" and programmatically implement it by varying code parameters such as shape, color, lines, and opacity to produce visually appealing results. Interviews with artists show that translating semantic constructs into program syntax can be effortful, and that current programming tools do not lend themselves well to rapid creative exploration. To address these challenges, we introduce Spellburst, a large language model (LLM) powered creative-coding environment. Spellburst provides (1) a node-based interface that allows artists to create generative art and explore variations through branching and merging operations, (2) expressive prompt-based interactions to engage in semantic programming, and (3) dynamic prompt-driven interfaces and direct code editing to seamlessly switch between semantic and syntactic exploration. Our evaluation with artists demonstrates Spellburst's potential to enhance creative coding practices and inform the design of computational creativity tools that bridge semantic and syntactic spaces.
Tyler Angert et al. UIST 2023. Tags: Generative AI (Text, Image, Music, Video); Creative Coding & Computational Art.

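Spellburst's node-based exploration can be summarized in two operations over nodes that pair a semantic prompt with generated sketch code: branching derives a variation, and merging combines two parents. The class and function names below, and the llm placeholder, are illustrative assumptions rather than Spellburst's source.

```python
from dataclasses import dataclass, field

def llm(prompt: str) -> str:
    """Placeholder for a code-generating LLM call."""
    raise NotImplementedError

@dataclass
class ArtNode:
    prompt: str   # semantic construct, e.g. "stained glass filter"
    code: str     # the generative-art program (e.g. p5.js source)
    parents: list = field(default_factory=list)

def branch(node: ArtNode, edit: str) -> ArtNode:
    """Derive a variation by applying a semantic edit to an existing sketch."""
    code = llm(f"Modify this sketch so that: {edit}\n\n{node.code}")
    return ArtNode(prompt=edit, code=code, parents=[node])

def merge(a: ArtNode, b: ArtNode, instruction: str) -> ArtNode:
    """Combine two sketches under a natural-language instruction."""
    code = llm(f"Combine the two sketches below. {instruction}\n\n"
               f"--- A ---\n{a.code}\n\n--- B ---\n{b.code}")
    return ArtNode(prompt=instruction, code=code, parents=[a, b])
```
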
fAIlureNotes: Supporting Designers in Understanding the Limits of AI Models for Computer Vision Tasks
To design with AI models, user experience (UX) designers must assess the fit between the model and user needs. Based on user research, they need to contextualize the model's behavior and potential failures within their product-specific data instances and user scenarios. However, our formative interviews with ten UX professionals revealed that such proactive discovery of model limitations is challenging and time-intensive. Furthermore, designers often lack technical knowledge of AI and accessible exploration tools, which challenges their understanding of model capabilities and limitations. In this work, we introduce a failure-driven design approach to AI: a workflow that encourages designers to explore model behavior and failure patterns early in the design process. Our implementation, fAIlureNotes, is a designer-centered failure exploration and analysis tool that supports designers in evaluating models and identifying failures across diverse user groups and scenarios. Our evaluation with UX practitioners shows that fAIlureNotes outperforms today's interactive model cards in assessing context-specific model performance.
Steven Moore et al. (Technical University of Munich (TUM)). CHI 2023. Tags: Explainable AI (XAI); AI-Assisted Decision-Making & Automation; Knowledge Worker Tools & Workflows.

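Failure-driven design, as framed above, boils down to running a model over scenario-grouped, product-specific data and tabulating where it breaks for each user group. Here is a minimal sketch under assumed data shapes; predict, the scenario dictionary, and the failure-record fields are illustrations, not fAIlureNotes' implementation.

```python
from collections import defaultdict

def predict(model, image):
    """Placeholder for a computer-vision inference call."""
    raise NotImplementedError

def explore_failures(model, scenarios) -> dict:
    """scenarios: {scenario_name: [(image, expected_label, user_group), ...]}"""
    failures = defaultdict(list)
    for name, cases in scenarios.items():
        for image, expected, group in cases:
            got = predict(model, image)
            if got != expected:
                failures[name].append(
                    {"user_group": group, "expected": expected, "got": got})
    return failures  # review failure patterns per scenario and user group
```
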
Designerly Understanding: Information Needs for Model Transparency to Support Design Ideation for AI-Powered User Experience
Despite the widespread use of artificial intelligence (AI), designing user experiences (UX) for AI-powered systems remains challenging. UX designers face hurdles understanding AI technologies, such as pre-trained language models, as design materials. This limits their ability to ideate and make decisions about whether, where, and how to use AI. To address this problem, we bridge the literature on AI design and AI transparency to explore whether and how frameworks for transparent model reporting can support design ideation with pre-trained models. By interviewing 23 UX practitioners, we find that practitioners frequently work with pre-trained models, but lack support for UX-led ideation. Through a scenario-based design task, we identify common goals that designers seek model understanding for and pinpoint their model transparency information needs. Our study highlights the pivotal role that UX designers can play in Responsible AI and calls for supporting their understanding of AI limitations through model transparency and interrogation.
Q. Vera Liao et al. (Microsoft Research). CHI 2023. Tags: Human-LLM Collaboration; Explainable AI (XAI); AI-Assisted Decision-Making & Automation.

VideoSticker: A Tool for Active Viewing and Visual Note-taking from Videos
Video is an effective medium for knowledge communication and learning. Yet active viewing and note-taking from videos remain a challenge. Specifically, during note-taking, viewers find it difficult to extract essential information such as the representation, composition, motion, and interactions of graphical objects and narration. Current approaches rely on creating static screenshots, manual clipping, and manual annotation/transcription. Additionally, note-takers may need to repeatedly pause and rewind the video, disrupting their active viewing process. We propose VideoSticker, a tool designed to support visual note-taking by extracting expressive content from videos as 'motion stickers'. VideoSticker implements automated object detection and tracking, linking objects to the transcript, and rapid extraction of stickers across space, time, and events of interest. VideoSticker's two-pass approach allows viewers to capture high-level information uninterrupted and later extract specific details. We demonstrate the usability of VideoSticker for a variety of videos and note-taking needs.
Yining Cao et al. IUI 2022. Tags: Recommender System UX; Data Storytelling.

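VideoSticker's two core steps, tracking objects across frames and linking each track to overlapping narration, can be outlined compactly. The sketch assumes hypothetical shapes: detect_and_track is a placeholder for the tool's detection and tracking models, and the transcript triples are an assumed format, not VideoSticker's actual code.

```python
def detect_and_track(frames) -> dict:
    """Placeholder for object detection + tracking.
    Returns {object_id: [(t_seconds, bounding_box), ...]}."""
    raise NotImplementedError

def link_transcript(tracks: dict, transcript: list) -> list:
    """transcript: [(t_start, t_end, text), ...] from the video's narration."""
    stickers = []
    for obj_id, path in tracks.items():
        if not path:
            continue
        t0, t1 = path[0][0], path[-1][0]  # the track's time span
        lines = [txt for s, e, txt in transcript if s < t1 and e > t0]
        stickers.append({"object": obj_id, "path": path, "narration": lines})
    return stickers  # each sticker pairs an object's motion with its narration
```
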
Solving Separation-of-Concerns Problems in Collaborative Design of Human-AI Systems through Leaky Abstractions
In conventional software development, user experience (UX) designers and engineers collaborate through separation of concerns (SoC): designers create human interface specifications, and engineers build to those specifications. However, we argue that Human-AI systems thwart SoC because human needs must shape the design of the AI interface, the underlying AI sub-components, and training data. How do designers and engineers currently collaborate on AI and UX design? To find out, we interviewed 21 industry professionals (UX researchers, AI engineers, data scientists, and managers) across 14 organizations about their collaborative work practices and associated challenges. We find that hidden information encapsulated by SoC challenges collaboration across design and engineering concerns. Practitioners describe inventing ad-hoc representations exposing low-level design and implementation details (which we characterize as leaky abstractions) to "puncture" SoC and share information across expertise boundaries. We identify how leaky abstractions are employed to collaborate at the AI-UX boundary and formalize a process of creating and using leaky abstractions.
Hariharan Subramonyam et al. (Stanford University). CHI 2022. Tags: Human-LLM Collaboration; Explainable AI (XAI); AI-Assisted Decision-Making & Automation.