TeachTune: Reviewing Pedagogical Agents Against Diverse Student Profiles with Simulated Students
Large language models (LLMs) can empower teachers to build pedagogical conversational agents (PCAs) customized for their students. As students have different prior knowledge and motivation levels, teachers must review the adaptivity of their PCAs to diverse students. Existing chatbot reviewing methods (e.g., direct chat and benchmarks) are either manually intensive across multiple iterations or limited to testing only single-turn interactions. We present TeachTune, where teachers can create simulated students and review PCAs by observing automated chats between PCAs and simulated students. Our technical pipeline instructs an LLM-based student to simulate prescribed knowledge levels and traits, helping teachers explore diverse conversation patterns. Our pipeline could produce simulated students whose behaviors correlate highly with their input knowledge and motivation levels, within 5% and 10% accuracy gaps. Thirty science teachers designed PCAs in a between-subjects study, and using TeachTune resulted in a lower task load and higher student profile coverage compared to a baseline.
2025 · Hyoungwook Jin et al. · KAIST, School of Computing · Generative AI (Text, Image, Music, Video); Human-LLM Collaboration; Intelligent Tutoring Systems & Learning Analytics · CHI
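The pipeline described above pairs two LLM agents in an automated conversation that the teacher can review. Below is a minimal sketch of such a PCA-vs-simulated-student loop, assuming an OpenAI-style chat API; the model name, persona wording, and turn count are illustrative assumptions, not details from the paper.

```python
# Hedged sketch: automated chat between a pedagogical agent (PCA) and an
# LLM-based simulated student with a prescribed knowledge/motivation profile.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
MODEL = "gpt-4o-mini"  # hypothetical model choice, not from the paper

STUDENT_PERSONA = (
    "You are a middle-school student. Prior knowledge: low "
    "(you have not yet learned photosynthesis). Motivation: low "
    "(you give short answers and get distracted easily). "
    "Stay in character; never reveal you are simulated."
)
PCA_PROMPT = "You are a patient science tutor teaching photosynthesis."

def reply(system_prompt: str, transcript: list[tuple[str, str]], speaker: str) -> str:
    """Generate the next turn for `speaker`, presenting the transcript from its side."""
    messages = [{"role": "system", "content": system_prompt}]
    for who, text in transcript:
        role = "assistant" if who == speaker else "user"
        messages.append({"role": role, "content": text})
    out = client.chat.completions.create(model=MODEL, messages=messages)
    return out.choices[0].message.content

transcript = [("pca", "Hi! Today we'll explore how plants make food. Ready?")]
for _ in range(4):  # a few automated turns for the teacher to review
    transcript.append(("student", reply(STUDENT_PERSONA, transcript, "student")))
    transcript.append(("pca", reply(PCA_PROMPT, transcript, "pca")))

for who, text in transcript:
    print(f"{who.upper()}: {text}\n")
```

Varying the persona string (e.g., high prior knowledge, high motivation) is one plausible way to cover different student profiles, which is the kind of exploration the abstract describes.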
Open Sesame? Open Salami! Personalizing Vocabulary Assessment-Intervention for Children via Pervasive Profiling and Bespoke Storybook Generation
Children acquire language by interacting with their surroundings. Because each child is exposed to a different language environment, the words they encounter and need in their lives vary. Despite standard tools for assessment and intervention based on predefined vocabulary sets, speech-language pathologists and parents struggle with the absence of systematic tools for child-specific custom vocabulary, i.e., words that are out-of-standard but personally more important. We propose "Open Sesame? Open Salami! (OSOS)", a personalized vocabulary assessment and intervention system with pervasive language profiling and targeted storybook generation, collaboratively developed with speech-language pathologists. Melded into a child's daily life and powered by large language models (LLMs), OSOS profiles the child's language environment, extracts priority words therein, and generates bespoke storybooks that naturally incorporate those words. We evaluated OSOS through 4-week-long deployments with 9 families. We report their experiences with OSOS and its implications for supporting personalization outside standards.
2024 · Jungeun Lee et al. · POSTECH · Multilingual & Cross-Cultural Voice Interaction; Generative AI (Text, Image, Music, Video); Special Education Technology · CHI
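The abstract outlines a three-step pipeline: profile the child's language environment, extract priority words, and generate a storybook around them. A rough sketch follows; the word-scoring rule, stopword list, standard vocabulary set, and model are all assumptions, since the abstract does not specify them.

```python
# Illustrative OSOS-style pipeline: count words heard around the child,
# pick frequent words outside a standard vocabulary, and prompt an LLM
# for a storybook page that uses them.
from collections import Counter
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

# Stand-in for pervasive profiling: transcribed speech around the child.
transcripts = [
    "grandma planted tomato seedlings with me today",
    "we watered the tomato plants before dinner",
    "grandma is bringing dumplings for dinner",
]
STANDARD_VOCAB = {"plants", "dinner", "water"}  # hypothetical standard word set
STOPWORDS = {"the", "we", "is", "for", "with", "me", "today", "before"}

counts = Counter(w for t in transcripts for w in t.split())
# Priority words: frequent in the child's environment yet outside the standard set.
priority = [
    w for w, _ in counts.most_common()
    if w not in STANDARD_VOCAB and w not in STOPWORDS
][:3]

prompt = (
    "Write a short, friendly storybook page for a 4-year-old that "
    f"naturally uses these words: {', '.join(priority)}."
)
story = client.chat.completions.create(
    model="gpt-4o-mini",  # hypothetical model choice
    messages=[{"role": "user", "content": prompt}],
)
print(story.choices[0].message.content)
```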
Utilizing a Dense Video Captioning Technique for Generating Image Descriptions of Comics for People with Visual Impairments
To improve the accessibility of visual figures, the auto-generation of text descriptions for individual images has been studied. However, it cannot be directly applied to comics, as the descriptions become redundant when similar scenes appear in a row. To address this issue, we propose generating descriptions per group of related images, and we demonstrate how a dense video captioning technique can be utilized for this purpose and how its performance can be improved. To assess the effectiveness of our approach and to identify factors affecting the quality of text descriptions of comics, we conducted a preliminary study with 3 sighted evaluators and a main user study with 12 participants with visual impairments. The results show that, when the annotator is a human, sighted evaluators perceive text descriptions generated per group of images as better than those generated per image in terms of accuracy, clarity, understandability, length, informativeness, and preference. Under the same conditions, when the annotator is an AI, group descriptions exhibited better performance in terms of length. Also, people with visual impairments preferred group descriptions because of their conciseness, the smooth connectivity of sentences, and their non-repetitive nature. Based on these findings, we provide design recommendations for generating accessible comic descriptions at scale for blind users.
2024 · Suhyun Kim et al. · Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille); Visualization Perception & Cognition · IUI
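The core idea of describing a group of related panels rather than each panel can be sketched as a grouping step over visual similarity. The snippet below uses CLIP embeddings and a fixed 0.8 cosine threshold as stand-ins; the paper's dense-video-captioning model and actual grouping criterion may differ.

```python
# Hedged sketch: merge adjacent comic panels whose embeddings are similar,
# so one description can cover a whole scene instead of repeating per panel.
from PIL import Image
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("clip-ViT-B-32")  # assumed encoder choice

panel_paths = ["panel_01.png", "panel_02.png", "panel_03.png"]  # hypothetical files
embeddings = model.encode([Image.open(p) for p in panel_paths])

groups, current = [], [panel_paths[0]]
for i in range(1, len(panel_paths)):
    # Consecutive panels depicting a similar scene join the current group.
    if util.cos_sim(embeddings[i - 1], embeddings[i]).item() > 0.8:
        current.append(panel_paths[i])
    else:
        groups.append(current)
        current = [panel_paths[i]]
groups.append(current)
print(groups)  # each group then receives a single generated description
```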
Understanding Novice's Annotation Process For 3D Semantic Segmentation Task With Human-In-The-Loop
Large-scale 3D point clouds are often used as training data for 3D semantic segmentation, but the labor-intensive nature of the annotation process makes it challenging to acquire sufficient labeled data. Meanwhile, there has been limited research on introducing novice annotators to acquire labeled data by enhancing their annotation performance and user experience. Therefore, in this study, we explored solutions along two dimensions: the presence of AI assistance, and the number of classes visualized simultaneously in the model's segmentation results in a human-in-the-loop (HITL) setting. We conducted a user study with 16 novice annotators who had no prior experience in 3D semantic segmentation, asking them to perform annotation tasks. The results revealed an interaction effect between the two dimensions on annotation accuracy and labeling efficiency. We also found that displaying multiple classes at once reduced the time taken for annotation. Moreover, visualizing multiple classes at once, or the absence of AI assistance, led to a greater increase in model accuracy compared to our baselines. The best user experience was observed when the visualization showed a single class at a time with AI assistance. Based on these findings, we discuss which environments can enhance novice annotators' annotation performance and user experience in 3D semantic segmentation tasks within HITL contexts.
2024 · Yujin Kim et al. · AI-Assisted Decision-Making & Automation; Crowdsourcing Task Design & Quality Control · IUI
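To make the HITL setup concrete, here is a toy sketch of its skeleton: a model pre-labels a point cloud, the novice corrects some points, and the merged labels are scored. All data, error rates, and the correction model are synthetic assumptions; the study's actual segmentation model and interface are not described at this level.

```python
# Synthetic sketch of the human-in-the-loop annotation loop for a point cloud.
import numpy as np

rng = np.random.default_rng(0)
N_POINTS, N_CLASSES = 1000, 4
ground_truth = rng.integers(0, N_CLASSES, N_POINTS)

# "AI assistance": pre-labels that are right about 80% of the time.
ai_labels = ground_truth.copy()
noise = rng.random(N_POINTS) < 0.2
ai_labels[noise] = rng.integers(0, N_CLASSES, noise.sum())

def annotate(ai_assist: bool, corrections: dict[int, int]) -> np.ndarray:
    """Start from AI pre-labels (or blanks without assistance) and apply fixes."""
    labels = ai_labels.copy() if ai_assist else np.full(N_POINTS, -1)
    for idx, cls in corrections.items():
        labels[idx] = cls
    return labels

# The novice fixes 50 points they noticed were wrong (simulated as perfect fixes).
wrong = np.flatnonzero(ai_labels != ground_truth)[:50]
fixed = annotate(True, {int(i): int(ground_truth[i]) for i in wrong})
print("accuracy with AI assistance:", (fixed == ground_truth).mean())
```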
Understanding tensions in music accessibility through song signing for and with d/Deaf and Non-d/Deaf persons
Song signing is a method practiced by both d/Deaf and non-d/Deaf individuals to visually represent music and make it accessible through sign language and body movements. Although there is growing interest in song signing, there is a lack of understanding of what d/Deaf people value about song signing and how to create song signing productions they would consider acceptable. We conducted semi-structured interviews with 12 d/Deaf participants to gain a deeper understanding of what they value in music and song signing. We then interviewed 14 song signers to understand their experiences and processes in creating song signing performances. From this study, we identify three complex, interrelated layers of the song signing creation process and discuss how they can be supported and completed to potentially bridge the cultural divide between d/Deaf and non-d/Deaf audiences and guide more culturally responsive creation of music.
2023 · Suhyeon Yoo et al. · University of Toronto · Deaf & Hard-of-Hearing Support (Captions, Sign Language, Vibration); Programming Education & Computational Thinking · CHI
"A Voice that Suits the Situation": Understanding the Needs and Challenges for Supporting End-User Voice CustomizationAlthough there is a potential demand for customizing voices, most customization is limited to the visual appearance of a figure (e.g., avatars). To better understand the users' needs, we first conducted an online survey with 104 participants. Then we conducted a semi-structured interview with a prototype with 14 participants to identify design considerations for supporting voice customization. The results show that there is a desire for voice customization especially for non-face-to-face conversations with someone unfamiliar. In addition, the findings revealed that different voices are favored for different contexts from a better version of one's own voice for improving delivery to a completely different voice for securing identity. As future work, we plan to extend this study by investigating voice synthesis techniques for end-users who wish to design their own voices for various contexts.2022HBHyeon Jeong Byeon et al.Ewha Womans UniversityAgent Personality & AnthropomorphismGenerative AI (Text, Image, Music, Video)CHI
Cocomix: Utilizing Comments to Improve Non-Visual Webtoon Accessibility
Webtoons are a type of digital comics read online, where readers can leave comments to share their thoughts on the story. While webtoons have experienced a surge in popularity internationally, people with visual impairments cannot enjoy them due to the lack of an accessible format. While traditional image description practices can be adopted, the resulting descriptions cannot preserve webtoons' unique values, such as control over the reading pace and social engagement through comments. To improve the webtoon reading experience for blind and low-vision (BLV) users, we propose Cocomix, an interactive webtoon reader that leverages comments in the design of novel webtoon interactions. Since comments can identify story highlights and provide additional context, we designed a system that provides 1) comments-based adaptive descriptions with selective access to details and 2) panel-anchored comments for easy access to relevant descriptive comments. Our evaluation (N=12) showed that Cocomix users could adapt descriptions to various needs and better utilize comments.
2022 · Mina Huh et al. · KAIST · Conversational Chatbots; Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille) · CHI
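One of the two mechanisms above, panel-anchored comments, can be illustrated with a very naive matcher: score each comment's word overlap against each panel's description and attach it to the best match. Cocomix's actual anchoring method is not detailed in the abstract; a real system would at least filter stopwords or use semantic embeddings.

```python
# Hedged sketch: anchor reader comments to the webtoon panel they discuss
# via word overlap with per-panel descriptions (all example data invented).
panel_descriptions = {
    1: "hero stands on the rainy rooftop at night",
    2: "villain reveals the stolen amulet",
}
comments = [
    "that rooftop scene in the rain gave me chills",
    "I did NOT expect the amulet twist!",
]

def anchor(comment: str) -> int:
    """Return the panel whose description shares the most words with the comment."""
    words = set(comment.lower().split())
    return max(
        panel_descriptions,
        key=lambda p: len(words & set(panel_descriptions[p].split())),
    )

for c in comments:
    print(f"panel {anchor(c)} <- {c!r}")
```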
ReCog: Supporting Blind People in Recognizing Personal Objects
We present ReCog, a mobile app that enables blind users to recognize objects by training a deep network with their own photos of those objects. This functionality is useful for differentiating personal objects, which cannot be recognized by pre-trained recognizers and may lack distinguishing tactile features. To ensure that objects are well framed in the captured photos, ReCog integrates camera-aiming guidance that tracks target objects and instructs the user through verbal and sonification feedback to frame them appropriately.
We report a two-session study with 10 blind participants using ReCog for object training and recognition, with and without guidance. We show that ReCog enables blind users to train and recognize their personal objects, and that camera-aiming guidance helps novice users increase their confidence, achieve better accuracy, and learn strategies to capture better photos.
2020 · Dragan Ahmetovic et al. · Università degli Studi di Milano · Vibrotactile Feedback & Skin Stimulation; Voice User Interface (VUI) Design; Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille) · CHI
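Training a recognizer on a handful of user photos is typically done by fine-tuning a small pretrained network. The sketch below uses MobileNetV2 transfer learning as a plausible stand-in; the folder layout, model choice, and hyperparameters are assumptions, since ReCog's actual architecture is not given in the abstract.

```python
# Hedged sketch: fine-tune a pretrained MobileNetV2 head on a user's own
# photos of personal objects (one folder per object).
import torch
from torch import nn, optim
from torchvision import datasets, models, transforms

# Photos organized as my_objects/<object_name>/*.jpg (hypothetical path).
tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
data = datasets.ImageFolder("my_objects", transform=tfm)
loader = torch.utils.data.DataLoader(data, batch_size=8, shuffle=True)

model = models.mobilenet_v2(weights="IMAGENET1K_V1")
for p in model.parameters():        # freeze the pretrained backbone
    p.requires_grad = False
model.classifier[1] = nn.Linear(model.last_channel, len(data.classes))

opt = optim.Adam(model.classifier[1].parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for epoch in range(5):              # a few epochs suffice for a handful of objects
    for x, y in loader:
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
```

Freezing the backbone and training only the final layer keeps the computation small enough to be plausible on a phone, which matches the mobile setting described above.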
BebeCODE: Collaborative Child Development Tracking System
Continuously tracking young children's development is important for parents, because early detection of developmental delay can lead to better treatment through early intervention. Screening tests, often based on questions answered by a parent, are used to assess children's development, but responses from only one parent can be subjective and even inaccurate due to limited memory and observations. In this work, we propose a collaborative child development tracking system in which screening test responses are collected through collaboration between parents or caregivers. We implement BebeCODE, a mobile system that encourages parents to independently answer all developmental questions for a given age and resolve disagreements through chatting, image/video sharing, or asking a third person. A 4-week deployment study of BebeCODE with 12 families found that parents disagreed on approximately 22% of the questions about their children's development, and BebeCODE helped them reach a consensus. Parents also reported that their awareness of their child's development increased with BebeCODE.
2018 · Seokwoo Song et al. · KAIST (Korea Advanced Institute of Science and Technology) · Early Childhood Education Technology; Special Education Technology · CHI
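The disagreement detection at the heart of this design is simple to state in code: compare the two caregivers' independent answers and surface mismatched items for joint resolution. The question IDs and answers below are made up for illustration; only the compare-and-flag logic reflects the abstract.

```python
# Toy sketch: flag screening items where two caregivers' independent
# answers differ, so the app can prompt them to resolve those items.
mom = {"q1": "yes", "q2": "no", "q3": "yes", "q4": "yes"}
dad = {"q1": "yes", "q2": "yes", "q3": "yes", "q4": "no"}

disagreements = [q for q in mom if mom[q] != dad[q]]
rate = len(disagreements) / len(mom)
print(f"resolve via chat/photos: {disagreements} ({rate:.0%} of items)")
```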