CoSight: Exploring Viewer Contributions to Online Video Accessibility Through Descriptive Commenting
The rapid growth of online video content has outpaced efforts to make visual information accessible to blind and low vision (BLV) audiences. While professional Audio Description (AD) remains the gold standard, it is costly and difficult to scale across the vast volume of online media. In this work, we explore a complementary approach to broadening participation in video accessibility: engaging everyday video viewers while they watch and comment. We introduce CoSight, a Chrome extension that augments YouTube with lightweight, in-situ nudges to support descriptive commenting. Drawing on Fogg's Behavior Model, CoSight provides visual indicators of accessibility gaps, pop-up hints for what to describe, reminders to clarify vague comments, and related captions and comments as references. In an exploratory study with 48 sighted users, CoSight helped integrate accessibility contributions into natural viewing and commenting practices, with 89% of resulting comments including grounded visual descriptions. Follow-up interviews with four BLV viewers and four professional AD writers suggest that while such comments do not match the rigor of professional AD, they offer complementary value by conveying visual context and emotional nuance that aid understanding of the videos.
2025 · Ruolin Wang et al. · UIST · Topics: Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille); Augmentative & Alternative Communication (AAC); Universal & Inclusive Design

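The paper does not publish CoSight's detection rules, but a minimal sketch of the kind of vagueness check that could trigger its "clarify your comment" reminder might look like the following; the word lists and logic are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical vagueness check for a CoSight-style nudge: flag draft
# comments that point at the video without describing what is on screen.
# Both vocabularies below are illustrative assumptions.
VAGUE_MARKERS = {"this", "that", "it", "there"}
VISUAL_CUES = {"red", "blue", "left", "right", "wearing", "background",
               "gesture", "smiles", "holds", "color"}

def needs_clarification_nudge(comment: str) -> bool:
    """True if a comment references the video ("this part...") without
    any grounded visual description, i.e., a candidate for a nudge."""
    words = set(comment.lower().split())
    refers_to_video = bool(words & VAGUE_MARKERS)
    describes_visuals = bool(words & VISUAL_CUES)
    return refers_to_video and not describes_visuals

print(needs_clarification_nudge("this part is hilarious"))      # True -> nudge
print(needs_clarification_nudge("the dog on the left smiles"))  # False
```
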
The GenUI Study: Exploring the Design of Generative UI Tools to Support UX Practitioners and Beyond
AI can now generate high-fidelity UI mock-up screens from a high-level textual description, promising to support UX practitioners' work. However, it remains unclear how UX practitioners would adopt such Generative UI (GenUI) models in a way that is integral and beneficial to their work. To answer this question, we conducted a formative study with 37 UX-related professionals spanning four roles: UX designers, UX researchers, software engineers, and product managers. Using a state-of-the-art GenUI tool, each participant completed a week-long, individual mini-project with role-specific tasks, keeping a daily journal of their usage and experiences with GenUI, followed by a semi-structured interview. We report findings on participants' workflows with the GenUI tool, how GenUI can support all roles collectively and each role specifically, and the gaps that remain between GenUI and users' needs and expectations, leading to design implications for future work on GenUI development.
2025 · Xiang 'Anthony' Chen et al. · DIS · Topics: Generative AI (Text, Image, Music, Video); Human-LLM Collaboration; AI-Assisted Creative Writing

From Text to Pixels: Enhancing User Understanding through Text-to-Image Model Explanations
Recent progress in Text-to-Image (T2I) models promises transformative applications in art, design, education, medicine, and entertainment. These models, exemplified by DALL-E, Imagen, and Stable Diffusion, have the potential to revolutionize various industries. However, a primary concern is that they operate as a 'black box' for many users. Without understanding the underlying mechanics, users cannot harness the full potential of these models. This study bridges that gap by developing and evaluating explanation techniques for T2I models, targeting inexperienced end users. While prior work has explored Explainable AI (XAI) methods for classification or regression tasks, T2I generation poses distinct challenges. Through formative studies with experts, we identified unique explanation goals and designed tailored explanation strategies. We then empirically evaluated these methods with 473 participants from Amazon Mechanical Turk (AMT) across three tasks. Our results highlight users' ability to learn new keywords through explanations, a preference for example-based explanations, and difficulty comprehending explanations that significantly shift the image's theme. Findings also suggest users benefit from a limited set of concurrent explanations. Our main contributions include a curated dataset for evaluating T2I explainability techniques, insights from a comprehensive AMT user study, and observations critical for future T2I model explainability research.
2024 · Noyan Evirgen et al. · IUI · Topics: Generative AI (Text, Image, Music, Video); Explainable AI (XAI)

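As a concrete illustration of the example-based explanations participants preferred, one simple strategy is to hold the random seed fixed and regenerate with a single keyword added, so the resulting pair isolates that keyword's effect. A minimal sketch, assuming a generic generate(prompt, seed) T2I backend rather than the study's actual pipeline:

```python
from typing import Any, Callable

def keyword_explanation(generate: Callable[[str, int], Any],
                        prompt: str, keyword: str, seed: int = 0):
    """Return a (baseline, modified) image pair isolating one keyword's effect."""
    baseline = generate(prompt, seed)                  # same seed, same noise
    modified = generate(f"{prompt}, {keyword}", seed)  # only the prompt changes
    return baseline, modified

# Usage (with any T2I wrapper of this shape):
# before, after = keyword_explanation(sd_pipeline, "a cottage in a forest", "chiaroscuro")
```
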
XCreation: A Graph-Based Crossmodal Generative Creativity Support Tool
Creativity Support Tools (CSTs) aid in the efficient and effective composition of creative content, such as picture books. However, many existing CSTs support only mono-modal creation, even though prior research is now theoretically and technically mature enough to support multimodal creation. To overcome this limitation, we introduce XCreation, a novel CST that leverages generative AI to support cross-modal storybook creation. Directly deploying AI models in CSTs can still be problematic, however, as most are black-box architectures that are not comprehensible to human users. We therefore integrate an interpretable entity-relation graph that intuitively represents picture elements and their relations, improving the usability of the underlying generative structures. Our between-subjects user study demonstrates that XCreation supports continuous plot creation with increased creativity, controllability, usability, and interpretability. Thanks to its multimodal nature, XCreation is applicable to various scenarios, including interactive storytelling and picture book creation.
2023 · Zihan Yan et al. · UIST · Topics: Generative AI (Text, Image, Music, Video); Human-LLM Collaboration; Explainable AI (XAI)

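To make the idea concrete, a toy version of such an entity-relation graph can be built with an off-the-shelf graph library; the node, edge, and relation labels below are illustrative choices, not XCreation's API:

```python
import networkx as nx

# Nodes are picture entities, edges are human-readable relations the user
# can inspect and edit before the image is (re)generated.
scene = nx.DiGraph()
scene.add_node("girl", mood="curious")
scene.add_node("fox", color="orange")
scene.add_edge("girl", "fox", relation="feeds")

for u, v, data in scene.edges(data=True):
    print(f"{u} --{data['relation']}--> {v}")   # girl --feeds--> fox
```

Editing this symbolic layer, rather than raw pixels or free-form prompts, is what keeps the generation controllable and interpretable for the user.
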
NaCanva: Exploring and Enabling the Nature-Inspired Creativity for Children
Nature has been a bountiful source of materials, replenishment, inspiration, and creativity. Nature collage, as a crafting technique, offers children a fun and educational way to explore nature and express their creativity. However, raw-material collection has been limited to static objects like leaves, ignoring inspiration from nature's sounds and dynamic elements such as babbling creeks. To address this limitation, we developed a mobile application aimed at encouraging children's creativity through renewed material collection and careful observation in nature. To explore the possibility of this approach, we conducted a formative study with children (N=20) and a design workshop with experts (N=6). Based on the results of these studies, we formulated NaCanva, an AI-assisted multi-modal collage creation system for children. Drawing on the interactive relationship between children and nature, NaCanva facilitates multi-modal material collection, including images, sound, and video, which distinguishes our system from traditional collage. We validated the system with a between-subjects user study (N=30), and the results indicate that NaCanva enhances children's multidimensional observation of and engagement with nature, thereby unleashing their creativity in creating nature collages.
2023 · Zihan Yan et al. · MobileHCI · Topics: Generative AI (Text, Image, Music, Video); Digital Art Installations & Interactive Performance; Food Culture & Food Interaction

Designing and Evaluating Interfaces that Highlight News Coverage Diversity Using Discord Questions
Modern news aggregators do the hard work of organizing a large news stream, creating collections with tens of source options for a given story. This paper shows that navigating large source collections for a news story can be challenging without further guidance. We design three interfaces, the Annotated Article, the Recomposed Article, and the Question Grid, aimed at accompanying news readers in discovering coverage diversity while they read. A first usability study with 10 journalism experts confirms that all three interfaces reveal coverage diversity and identifies each interface's potential use cases and audiences. In a second study, we implemented a reading exercise with 95 novice news readers to measure exposure to coverage diversity. Results show that Annotated Article users answer questions 34% more completely than with two existing interfaces while finding the interface equally easy to use.
2023 · Philippe Laban et al. (Salesforce Research) · CHI · Topics: Misinformation & Fact-Checking; User Research Methods (Interviews, Surveys, Observation)

CrossA11y: Identifying Video Accessibility Issues via Cross-modal Grounding
Authors make their videos visually accessible by adding audio descriptions (AD) and auditorily accessible by adding closed captions (CC). However, creating AD and CC is challenging and tedious, especially for non-professional describers and captioners, because of the difficulty of identifying accessibility problems in videos: a video author has to watch the entire video and manually check for inaccessible information frame by frame, in both the visual and the auditory modality. In this paper, we present CrossA11y, a system that helps authors efficiently detect and address visual and auditory accessibility issues in videos. Using cross-modal grounding analysis, CrossA11y automatically measures the accessibility of visual and audio segments in a video by checking for modality asymmetries. CrossA11y then displays these segments and surfaces visual and audio accessibility issues in a unified interface, making it intuitive to locate and review them, script AD/CC in place, and immediately preview the described and captioned video. We demonstrate the effectiveness of CrossA11y through a lab study with 11 participants, comparing it to an existing baseline.
2022 · Xingyu "Bruce" Liu et al. · UIST · Topics: Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille); Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration); Universal & Inclusive Design

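The paper's cross-modal grounding analysis is more involved, but its core intuition, scoring how well aligned visual and audio segments "cover" each other in a shared embedding space, can be sketched as follows; the embeddings and flagging threshold are assumptions, not the paper's models:

```python
import numpy as np

def alignment_scores(visual_emb: np.ndarray, audio_emb: np.ndarray) -> np.ndarray:
    """Cosine similarity per aligned segment; low alignment suggests
    information present in one modality but missing from the other."""
    v = visual_emb / np.linalg.norm(visual_emb, axis=1, keepdims=True)
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)
    return (v * a).sum(axis=1)   # one score per segment

# 10 segments with 512-dim embeddings from some shared visual/audio encoder.
sims = alignment_scores(np.random.rand(10, 512), np.random.rand(10, 512))
candidates = np.where(sims < 0.3)[0]   # threshold is an illustrative choice
```
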
GANzilla: User-Driven Direction Discovery in Generative Adversarial Networks
Generative Adversarial Networks (GANs) are widely adopted in numerous application areas, such as data preprocessing, image editing, and creativity support. However, a GAN's 'black box' nature prevents non-expert users from controlling what data the model generates, spawning a plethora of prior work focused on algorithm-driven approaches that automatically extract editing directions to control GANs. Complementarily, we propose GANzilla, a user-driven tool that empowers users with the classic scatter/gather technique to iteratively discover directions that meet their editing intents. In a work session with 12 participants, GANzilla users were able to discover directions that (i) edited images to match provided examples (closed-ended tasks) and (ii) met a high-level goal, e.g., making the face happier, while showing diversity across individuals (open-ended tasks).
2022 · Noyan Evirgen et al. · UIST · Topics: Generative AI (Text, Image, Music, Video); Human-LLM Collaboration

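One scatter/gather iteration, as the abstract describes it at a high level, can be sketched like this: scatter candidate latent edits into clusters for the user to browse, then gather the user's chosen clusters into a single editing direction. The cluster count and perturbation scale below are illustrative assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

def scatter(offsets: np.ndarray, k: int = 8) -> np.ndarray:
    """Group candidate latent offsets into k clusters for the user to browse."""
    return KMeans(n_clusters=k, n_init=10).fit_predict(offsets)

def gather(offsets: np.ndarray, labels: np.ndarray, chosen: list[int]) -> np.ndarray:
    """Average the offsets in the user-chosen clusters into one direction."""
    mask = np.isin(labels, chosen)
    return offsets[mask].mean(axis=0)

offsets = np.random.randn(200, 512) * 0.5        # candidate latent perturbations
labels = scatter(offsets)
direction = gather(offsets, labels, chosen=[2, 5])  # user picked clusters 2 and 5
# Applying `direction` to a latent code would then realize the intended edit.
```
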
Mobiot: Augmenting Everyday Objects into Moving IoT Devices Using 3D Printed Attachments Generated by Demonstration
Recent advancements in personal fabrication have brought novices closer to a reality where they can automate routine tasks by mobilizing everyday objects. However, the overall process remains challenging because it demands expertise at every step: capturing design requirements, planning motion, creating 3D models of mechanical parts, and programming electronics. We introduce Mobiot, an end-user toolkit that helps non-experts capture the design and motion requirements of legacy objects by demonstration. It then automatically generates 3D-printable attachments, programs to operate the assembled modules, a list of off-the-shelf electronics, and assembly tutorials. An authoring feature further assists users in fine-tuning, and in reusing existing motion libraries and 3D-printed mechanisms to adapt them to other real-world objects with different motions. We validate Mobiot through application examples covering 8 everyday objects with various motions, and through a technical evaluation measuring the accuracy of motion reconstruction.
2022 · Abul Al Arabi et al. (Texas A&M University) · CHI · Topics: Desktop 3D Printing & Personal Fabrication; Circuit Making & Hardware Prototyping; Customizable & Personalized Objects

Roman: Making Everyday Objects Robotically Manipulable with 3D-Printable Add-on Mechanisms
One important vision of robotics is to provide physical assistance by manipulating everyday objects, e.g., hand tools and kitchen utensils. However, many objects designed for dexterous hand control are not easily manipulable by a single robotic arm with a generic parallel gripper. Complementary to existing research on grippers and control algorithms, we present Roman, a suite of hardware designs and software tool support that lets robotic engineers create 3D-printable mechanisms attached to everyday handheld objects, making them easier for conventional robotic arms to manipulate. The Roman hardware includes a versatile magnetic gripper that can snap on and off handheld objects and drive the add-on mechanisms to perform tasks. Roman also provides software support to register objects and author control programs. To validate our approach, we designed and fabricated Roman mechanisms for 14 everyday objects/tasks situated within a design space, and conducted expert interviews with robotic engineers, which indicated that Roman is a practical alternative for enabling robotic manipulation of everyday objects.
2022 · Jiahao Li et al. (UCLA) · CHI · Topics: Desktop 3D Printing & Personal Fabrication

Lessons Learned from Designing an AI-Enabled Diagnosis Tool for Pathologists
Despite the promises of data-driven artificial intelligence (AI), little is known about how we can bridge the gulf between traditional physician-driven diagnosis and a plausible future of medicine automated by AI. Specifically, how can we usefully involve AI in physicians' diagnosis workflows given that most AI is still nascent and error-prone (e.g., in digital pathology)? To explore this question, we first propose a series of collaborative techniques to engage human pathologists with AI given AI's capabilities and limitations, based on which we prototype Impetus, a tool in which an AI takes varying degrees of initiative to provide various forms of assistance to a pathologist detecting tumors in histological slides. We summarize observations and lessons learned from a study with eight pathologists and discuss recommendations for future work on human-centered medical AI systems.
2021 · Hongyan Gu et al. · CSCW · Topics: Human-AI Collaboration

XAlgo: a Design Probe of Explaining Algorithms' Internal States via Question-Answering
Algorithms often appear as 'black boxes' to non-expert users. While prior work focuses on explainable representations and expert-oriented exploration, we propose and study an interactive question-answering approach to explaining deterministic algorithms to non-expert users who need to understand the algorithms' internal states (e.g., students learning algorithms, operators monitoring robots, admins troubleshooting network routing). We construct XAlgo, a formal model that first classifies the type of question based on a taxonomy and then generates an answer based on a set of rules that extract information from representations of an algorithm's internal states, e.g., the pseudocode. A design probe in an algorithm-learning scenario with 18 participants (9 using a Wizard-of-Oz XAlgo and 9 as a control group) reports findings and design implications based on what kinds of questions people ask, how well XAlgo responds, and what challenges remain in bridging users' gulf of understanding of algorithms.
2021 · Juan Rebanal et al. · IUI · Topics: Explainable AI (XAI); Algorithmic Transparency & Auditability; User Research Methods (Interviews, Surveys, Observation)

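A heavily simplified sketch of the loop the abstract describes: classify an incoming question against a small taxonomy, then answer "state" questions by rule-based lookup in a snapshot of the algorithm's internal state. The categories and rules here are stand-ins for the paper's formal model:

```python
import re

# Toy taxonomy: pattern -> question category (illustrative, not XAlgo's).
TAXONOMY = [
    (r"^what (is|are)\b", "state"),          # "What is the current pivot?"
    (r"^why\b",           "justification"),  # "Why did it swap those two?"
    (r"^when\b",          "temporal"),       # "When does the loop stop?"
]

def classify(question: str) -> str:
    q = question.lower().strip()
    for pattern, category in TAXONOMY:
        if re.search(pattern, q):
            return category
    return "unknown"

def answer(question: str, state: dict) -> str:
    if classify(question) == "state":
        # Rule: look up any state variable mentioned in the question.
        for name, value in state.items():
            if name in question.lower():
                return f"{name} = {value}"
    return "I cannot answer that from the current internal state."

print(answer("What is the value of pivot?", {"pivot": 7, "i": 3}))  # pivot = 7
```
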
Romeo: A Design Tool for Embedding Transformable Parts in 3D Models to Robotically Augment Default Functionalities
Reconfiguring the shapes of objects can endow existing passive objects with robotic functionality, e.g., a transformable coffee-cup holder can attach to a chair's armrest, and a piggy bank can reach out an arm to 'steal' coins. Despite advances in end-user 3D design and fabrication, it remains challenging for non-experts to create such 'transformables' with existing tools, because doing so requires specific engineering knowledge such as mechanism and robot design. We present Romeo, a design tool for creating transformables embedded in a 3D model that robotically augment the object's default functionality. Romeo allows users to express, at a high level, (1) which part of the object to transform, (2) how it moves by following motion points in space, and (3) the corresponding action to be taken. Romeo then automatically generates a robotic arm embedded in the transformable part, ready for fabrication. We validated Romeo with a design session in which 8 participants designed and created custom transformables using 3D objects of their own choice.
2020 · Jiahao Li et al. · UIST · Topics: Desktop 3D Printing & Personal Fabrication; Customizable & Personalized Objects; Makerspace Culture

Robiot: A Design Tool for Actuating Everyday Objects with Automatically Generated 3D Printable Mechanisms
Users can now easily communicate digital information through the Internet of Things; in contrast, there remains little support for automating physical tasks that involve legacy static objects, e.g., adjusting a desk lamp's angle for optimal brightness, turning a manual faucet on and off while washing dishes, or sliding a window to maintain a preferred indoor temperature. Automating these simple physical tasks has the potential to improve people's quality of life, which is particularly important for people with disabilities or under situational impairments. We present Robiot, a design tool for generating mechanisms that can be attached to legacy static objects and motorized to actuate them for simple physical tasks. Users only need to take a short video of themselves manipulating an object to demonstrate the intended physical behavior. Robiot then extracts the requisite parameters and automatically generates 3D models of the enabling actuation mechanisms by performing a scene and motion analysis of the 2D video in alignment with the object's 3D model. In an hour-long design session, six participants used Robiot to actuate seven everyday objects, imbuing them with the robotic capability to automate various physical tasks.
2019 · Jiahao Li et al. · UIST · Topics: Desktop 3D Printing & Personal Fabrication; Circuit Making & Hardware Prototyping

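The abstract's scene-and-motion analysis of a 2D demonstration video could start, for example, from classic feature tracking plus a rigid-transform fit; the OpenCV-based sketch below is an assumption about one plausible step, not the paper's actual pipeline:

```python
import cv2
import numpy as np

def estimate_rigid_motion(frame_a: np.ndarray, frame_b: np.ndarray):
    """Return (rotation_degrees, translation_px) between two grayscale frames,
    as a rough parameterization of the demonstrated motion."""
    # Track salient feature points from frame_a into frame_b.
    pts_a = cv2.goodFeaturesToTrack(frame_a, maxCorners=200,
                                    qualityLevel=0.01, minDistance=7)
    pts_b, status, _ = cv2.calcOpticalFlowPyrLK(frame_a, frame_b, pts_a, None)
    good_a = pts_a[status.flatten() == 1]
    good_b = pts_b[status.flatten() == 1]
    # Fit a partial affine (rotation + translation + uniform scale) transform.
    M, _ = cv2.estimateAffinePartial2D(good_a, good_b)
    angle = np.degrees(np.arctan2(M[1, 0], M[0, 0]))
    return angle, (M[0, 2], M[1, 2])
```

The recovered rotation and translation would then be aligned with the object's 3D model to dimension the generated actuation mechanism.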