Black Older Adults’ Perception of Using Voice Assistants to Enact a Medical Recovery Curriculum
The use of interactive voice assistants (IVAs) in healthcare provides an avenue to address diverse health needs, such as gaps in the medical recovery period for older adult patients who have recently experienced serious illness. By using a voice-assisted medical recovery curriculum, discharged patients can receive ongoing support as they recover. However, significant medical and technology disparities exist among older adults, particularly among Black older adults. We recruited 26 Black older adults to participate in the design process of an IVA-enacted medical recovery curriculum by providing feedback during the early stages of design. Lack of cultural relevancy, accountability, privacy concerns, and stigmas associated with aging and disability made participants reluctant to engage with the technology unless in a position of extreme need. This study underscored the need for Black cultural representation, whether in the IVA's accent, the types of media featured, or race-specific medical advice, and the need for strategies to address participants' concerns and stigmas. Participants saw the value in the curriculum for those who did not have caregivers and deliberated about the trade-offs the technology presented. We discuss tensions surrounding inclusion and representation and conclude by showing how we enacted the lessons from this study in future design plans.
2025 · Andrea Green et al. · Supporting Older Adults’ Care · CSCW

Ontologies in Design: How Imagining a Tree Reveals Possibilities and Assumptions in Large Language Models
Amid the recent uptake of Generative AI, sociotechnical scholars and critics have traced a multitude of resulting harms, with analyses largely focused on values and axiology (e.g., bias). While value-based analyses are crucial, we argue that ontologies—concerning what we allow ourselves to think or talk about—are a vital but under-recognized dimension in analyzing these systems. Proposing a need for a practice-based engagement with ontologies, we offer four orientations for considering ontologies in design: pluralism, groundedness, liveliness, and enactment. We share examples of potentialities that are opened up through these orientations across the entire LLM development pipeline by conducting two ontological analyses: examining the responses of four LLM-based chatbots in a prompting exercise, and analyzing the architecture of an LLM-based agent simulation. We conclude by sharing opportunities and limitations of working with ontologies in the design and development of sociotechnical systems.
2025 · Nava Haghighi et al. · Stanford University, Computer Science · Human-LLM Collaboration · Technology Ethics & Critical HCI · CHI

GenieWizard: Multimodal App Feature Discovery with Large Language Models
Multimodal interactions are more flexible, efficient, and adaptable than graphical interactions, allowing users to execute commands beyond simply tapping GUI buttons. However, the flexibility of multimodal commands makes it hard for designers to prototype and provide design specifications for developers. It is also hard for developers to anticipate what actions users may want. We present GenieWizard, a tool to aid developers in discovering potential features to implement in multimodal interfaces. GenieWizard supports user-desired command discovery early in the implementation process, streamlining the development process. GenieWizard uses an LLM to generate potential user interactions and parse these interactions into a form that can be used to discover the missing features for developers. Our evaluations showed that GenieWizard can reliably simulate user interactions and identify missing features. Also, in a study (N = 12), we demonstrated that developers using GenieWizard can identify and implement 42% of the missing features of multimodal apps compared to only 10% without GenieWizard.
2025 · Jackie (Junrui) Yang et al. · Stanford University, Computer Science · Full-Body Interaction & Embodied Input · Human-LLM Collaboration · Prototyping & User Testing · CHI

On Stress: Combining Human Factors and Biosignals to Inform the Placement and Design of a Skin-like Stress Sensor
With advances in electronic-skin and wearable technologies, it is possible to continuously measure stress markers from the skin and sweat to monitor and improve wellbeing and health. Understandably, the sensor's engineering and resolution are important to its function. However, we find that people looking for an e-skin stress sensor may look beyond measurement precision, demanding a private and stealthy design to reduce, for example, social stigmatization. We introduce the idea of a stress sensing "wear index," created from the combination of human-centered design (n=24), physiological (n=10), and biochemical (n=16) data. This wear index can inform the design of stress wearables to fit specific applications; e.g., human factors may be relevant for a wellbeing application, versus a relapse prevention application that may require more sensing precision. Our wear index idea can be further generalized as a method to close gaps between design and engineering practices.
2024 · Yasser Khan et al. · University of Southern California · Haptic Wearables · Sleep & Stress Monitoring · CHI

Scientific and Fantastical: Creating Immersive, Culturally Relevant Learning Experiences with Augmented Reality and Large Language Models
Motivating children to learn is a major challenge in education. One way to inspire motivation to learn is through immersion. We combine the immersive potential of augmented reality (AR), narrative, and large language models (LLMs) to bridge fantasy with reality in a mobile application, Moon Story, that teaches elementary schoolers astronomy and environmental science. Our system also builds upon learning theories such as culturally-relevant pedagogy. Using our application, a child embarks on a journey inspired by Chinese mythology, engages in real-world AR activities, and converses with a fictional character powered by an LLM. We conducted a controlled experiment (N=50) with two conditions: one using an LLM and one that was hard-coded. Both conditions resulted in learning gains, high engagement levels, and increased science learning motivation. Participants in the LLM condition also wrote more relevant answers. Finally, participants of both Chinese and non-Chinese heritage found the culturally-based narrative compelling.
2024 · Alan Y. Cheng et al. · Stanford University · AR Navigation & Context Awareness · Human-LLM Collaboration · CHI

Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM
Data analysts have long sought to turn unstructured text data into meaningful concepts. Though common, topic modeling and clustering focus on lower-level keywords and require significant interpretative work. We introduce concept induction, a computational process that instead produces high-level concepts, defined by explicit inclusion criteria, from unstructured text. For a dataset of toxic online comments, where a state-of-the-art BERTopic model outputs “women, power, female,” concept induction produces high-level concepts such as “Criticism of traditional gender roles” and “Dismissal of women's concerns.” We present LLooM, a concept induction algorithm that leverages large language models to iteratively synthesize sampled text and propose human-interpretable concepts of increasing generality. We then instantiate LLooM in a mixed-initiative text analysis tool, enabling analysts to shift their attention from interpreting topics to engaging in theory-driven analysis. Through technical evaluations and four analysis scenarios ranging from literature review to content moderation, we find that LLooM’s concepts improve upon the prior art of topic models in terms of quality and data coverage. In expert case studies, LLooM helped researchers to uncover new insights even from familiar datasets, for example by suggesting a previously unnoticed concept of attacks on out-party stances in a political social media dataset.
2024 · Michelle S. Lam et al. · Stanford University · Human-LLM Collaboration · Technology Ethics & Critical HCI · Computational Methods in HCI · CHI

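The loop the abstract describes — iteratively synthesizing sampled text into more general concepts, each defined by an explicit inclusion criterion — can be sketched in a few lines. This is a hypothetical toy, not the LLooM implementation: `concept_induction`, `propose_concepts`, and `matches` are stand-in names, and the LLM calls are stubbed with hand-written functions so the example is self-contained.

```python
def concept_induction(docs, propose_concepts, matches, rounds=2):
    """Propose concepts from raw text, generalize them over later rounds,
    then score every document against each concept's inclusion criterion."""
    concepts = propose_concepts(docs)          # round 1: from raw documents
    for _ in range(rounds - 1):
        concepts = propose_concepts(concepts)  # later rounds: synthesize further
    # Each concept "induces" the subset of documents meeting its criterion.
    return {c: [d for d in docs if matches(c, d)] for c in concepts}

# Stubs standing in for LLM prompts (the real system prompts an LLM).
docs = ["women belong at home", "her concerns are overblown", "great point"]
propose = lambda texts: ["Criticism of traditional gender roles",
                         "Dismissal of women's concerns"]
matches = lambda concept, doc: ("gender" in concept and "home" in doc) or \
                               ("Dismissal" in concept and "overblown" in doc)

result = concept_induction(docs, propose, matches)
# Maps each high-level concept to the comments satisfying its criterion.
```

In the real system both the proposal and the matching step are LLM calls; the structure of the loop is the point here.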
Visual StoryCoder: A Multimodal Programming Environment for Children's Creation of Stories
Computational thinking (CT) education reaches only a fraction of young children, in part because CT learning tools often require expensive hardware or fluent literacy. Block-based programming environments address these challenges through symbolic graphical interfaces, but users often need instructor support to advance. Alternatively, voice-based tools provide direct instruction on CT concepts but can present memory and navigation challenges to users. In this work, we present Visual StoryCoder, a multimodal tablet application that combines the strengths of each of these approaches to overcome their respective weaknesses. Visual StoryCoder introduces children ages 5–8 to CT through creative storytelling, offers direct instruction via a pedagogical voice agent, and eases use through a block-like graphical interface. In a between-subjects evaluation comparing Visual StoryCoder to a leading block-based programming app for this age group (N=24), we show that Visual StoryCoder is more understandable to independent learners, leads to higher-quality code after app familiarization, and encourages personally meaningful projects.
2023 · Griffin Dietz et al. · Stanford University · K-12 Digital Education Tools · Early Childhood Education Technology · STEM Education & Science Communication · CHI

Designing Immersive, Narrative-Based Interfaces to Guide Outdoor Learning
Outdoor learning experiences, such as field trips, can improve children’s science achievement and engagement, but these experiences are often difficult to deliver without extensive support. Narrative in educational experiences can provide needed structure, while also increasing engagement. We created a narrative-based, mobile application to investigate how to guide young learners in interacting with their local, outdoor environment. In a second variant, we added augmented reality and image classification to explore the value of these features. A study (n=44) found that participants using our system demonstrated learning gains and found the experience engaging. Our findings identified several major themes, including participant excitement for hands-on interactions with nature, curiosity about the characters, and enthusiasm toward typing their thoughts and observations. We offer a set of design implications for supporting narrative-based, outdoor learning with immersive technology.
2023 · Alan Y. Cheng et al. · Stanford University · AR Navigation & Context Awareness · Game UX & Player Behavior · Collaborative Learning & Peer Teaching · CHI

Model Sketching: Centering Concepts in Early-Stage Machine Learning Model Design
Machine learning practitioners often end up tunneling on low-level technical details like model architectures and performance metrics. Could early model development instead focus on high-level questions of which factors a model ought to pay attention to? Inspired by the practice of sketching in design, which distills ideas to their minimal representation, we introduce model sketching: a technical framework for iteratively and rapidly authoring functional approximations of a machine learning model's decision-making logic. Model sketching refocuses practitioner attention on composing high-level, human-understandable concepts that the model is expected to reason over (e.g., profanity, racism, or sarcasm in a content moderation task) using zero-shot concept instantiation. In an evaluation with 17 ML practitioners, model sketching reframed thinking from implementation to higher-level exploration, prompted iteration on a broader range of model designs, and helped identify gaps in the problem formulation—all in a fraction of the time ordinarily required to build a model.
2023 · Michelle S. Lam et al. · Stanford University · AI-Assisted Decision-Making & Automation · Computational Methods in HCI · CHI

End-User Audits: A System Empowering Communities to Lead Large-Scale Investigations of Harmful Algorithmic Behavior
Because algorithm audits are conducted by technical experts, audits are necessarily limited to the hypotheses that experts think to test. End users hold the promise to expand this purview, as they inhabit spaces and witness algorithmic impacts that auditors do not. In pursuit of this goal, we propose end-user audits—system-scale audits led by non-technical users—and present an approach that scaffolds end users in hypothesis generation, evidence identification, and results communication. Today, performing a system-scale audit requires substantial user effort to label thousands of system outputs, so we introduce a collaborative filtering technique that leverages the algorithmic system's own disaggregated training data to project from a small number of end user labels onto the full test set. Our end-user auditing tool, IndieLabel, employs these projected labels so that users can rapidly explore where their opinions diverge from the algorithmic system's outputs. By highlighting topic areas where the system is under-performing for the user and surfacing sets of likely error cases, the tool guides the user in authoring an audit report. In an evaluation of end-user audits on a popular comment toxicity model with 17 non-technical participants, participants both replicated issues that formal audits had previously identified and also raised previously underreported issues such as under-flagging on veiled forms of hate that perpetuate stigma and over-flagging of slurs that have been reclaimed by marginalized communities.
2022 · Michelle S. Lam et al. · Algorithmic Decision-making · CSCW

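The projection step above — amplifying a handful of end-user labels into predictions over the full test set by leaning on the system's disaggregated (per-annotator) training labels — can be illustrated with a minimal neighborhood-style sketch. This is a toy under my own assumptions, not IndieLabel's actual collaborative filtering pipeline: `project_labels` and its data shapes are hypothetical, and similarity is just agreement rate on the items the user labeled.

```python
def project_labels(user_labels, annotator_labels):
    """Weight each prior annotator by how often they agree with the end
    user's few labels, then predict the user's label on every unlabeled
    item as a similarity-weighted average of those annotators' labels."""
    sims = {}
    for ann, labels in annotator_labels.items():
        shared = [item for item in user_labels if item in labels]
        if shared:  # agreement rate on the items both have labeled
            agree = sum(user_labels[i] == labels[i] for i in shared)
            sims[ann] = agree / len(shared)
    all_items = {i for labels in annotator_labels.values() for i in labels}
    predictions = {}
    for item in all_items - user_labels.keys():
        num = den = 0.0
        for ann, weight in sims.items():
            if weight > 0 and item in annotator_labels[ann]:
                num += weight * annotator_labels[ann][item]
                den += weight
        if den:
            predictions[item] = num / den
    return predictions

# The end user labels two comments; annotator "a" agrees with them, "b" does not,
# so "a" dominates the prediction for the unlabeled comment "c3".
user = {"c1": 1, "c2": 0}
training = {"a": {"c1": 1, "c2": 0, "c3": 1},
            "b": {"c1": 0, "c2": 1, "c3": 0}}
projected = project_labels(user, training)
```

The design intuition matches the abstract: a few user labels identify like-minded annotators in the disaggregated training data, and their existing labels stand in for the thousands the user would otherwise have to provide.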
Beyond Being Real: A Sensorimotor Control Perspective on Interactions in Virtual Reality
We can create Virtual Reality (VR) interactions that have no equivalent in the real world by remapping spacetime or altering users' body representation, such as stretching the user’s virtual arm for manipulation of distant objects or scaling up the user’s avatar to enable rapid locomotion. Prior research has leveraged such approaches, what we call beyond-real techniques, to make interactions in VR more practical, efficient, ergonomic, and accessible. We present a survey categorizing prior movement-based VR interaction literature as reality-based, illusory, or beyond-real interactions. We survey relevant conferences (CHI, IEEE VR, VRST, UIST, and DIS) while focusing on selection, manipulation, locomotion, and navigation in VR. For beyond-real interactions, we describe the transformations that have been used by prior works to create novel remappings. We discuss open research questions through the lens of the human sensorimotor control system and highlight challenges that need to be addressed for effective utilization of beyond-real interactions in future VR applications, including plausibility, control, long-term adaptation, and individual differences.
2022 · Parastoo Abtahi et al. · Stanford University · Immersion & Presence Research · CHI

HybridTrak: Adding Full-Body Tracking to VR Using an Off-the-Shelf Webcam
Full-body tracking in virtual reality improves presence, allows interaction via body postures, and facilitates better social expression among users. However, full-body tracking systems today require a complex setup fixed to the environment (e.g., multiple lighthouses/cameras) and a laborious calibration process, which goes against the desire to make VR systems more portable and integrated. We present HybridTrak, which provides accurate, real-time full-body tracking by augmenting inside-out upper-body VR tracking systems with a single external off-the-shelf RGB web camera. HybridTrak converts and transforms users' 2D full-body poses from the webcam to 3D poses leveraging the inside-out upper-body tracking data with a full-neural solution. We showed HybridTrak is more accurate than RGB- or depth-based tracking methods on the MPI-INF-3DHP dataset. We also tested HybridTrak in the popular VRChat app and showed that body postures presented by HybridTrak are more distinguishable and more natural than a solution using an RGBD camera.
2022 · Jackie (Junrui) Yang et al. · Stanford University · Full-Body Interaction & Embodied Input · Immersion & Presence Research · CHI

EnglishBot: An AI-Powered Conversational Interface for Second Language Learning
Today, many students learn to speak a foreign language by listening to and repeating pre-recorded materials, due to the lack of practice opportunities with human partners. Leveraging recent advancements in AI, speech processing, and NLP, we developed EnglishBot, a language learning chatbot that converses with students interactively on college-related topics and provides adaptive feedback. We evaluated EnglishBot against a traditional listen-and-repeat interface with 56 Chinese college students through two six-day user studies under both voluntary and fixed-usage conditions. Results show that students were more engaged with EnglishBot and voluntarily spent 2.1 times more time interacting with it. Students’ fluency also improved more with EnglishBot under the IELTS grading standard. Our results suggest that chatbots are an effective learning tool to engage students and have great potential to enhance foreign learners’ speaking abilities.
2021 · Sherry Ruan et al. · Conversational Chatbots · Intelligent Tutoring Systems & Learning Analytics · STEM Education & Science Communication · IUI

StoryCoder: Teaching Computational Thinking Concepts Through Storytelling in a Voice-Guided App for Children
Computational thinking (CT) education reaches only a fraction of young children, in part because CT learning tools often require expensive hardware or fluent literacy. Informed by needfinding interviews, we developed a voice-guided smartphone application leveraging storytelling as a creative activity by which to teach CT concepts to 5- to 8-year-old children. The app includes two storytelling games where users create and listen to stories as well as four CT games where users then modify those stories to learn about sequences, loops, events, and variables. We improved upon the app design through wizard-of-oz testing (N=28) and iterative design testing (N=22) before conducting an evaluation study (N=22). Children were successfully able to navigate the app, effectively learn about the target computing concepts, and, after using the app, children demonstrated above-chance performance on a near transfer CT concept recognition task.
2021 · Griffin Dietz et al. · Stanford University · Voice User Interface (VUI) Design · Programming Education & Computational Thinking · K-12 Digital Education Tools · CHI

DoThisHere: Using multi-modal interaction to support cross-application tasks on mobile devices
Many computing tasks, such as comparison shopping, two-factor authentication, and checking movie reviews, require using multiple apps together. On large screens, “windows, icons, menus, pointer” (WIMP) graphical user interfaces (GUIs) support easy sharing of content and context between multiple apps. So, it is easy to see the content from one application and write something relevant in another application, such as looking at the map around a place and typing walking instructions into an email. However, although today’s smartphones also use GUIs, they have small screens and limited windowing support, making it hard to switch contexts and exchange data between apps. We introduce DoThisHere, a multimodal interaction technique that streamlines cross-app tasks and reduces the burden these tasks impose on users. Users can use voice to refer to information or app features that are off-screen and touch to specify where the relevant information should be inserted or is displayed. With DoThisHere, users can access information from or carry information to other apps with less context switching. We conducted a survey to find out what cross-app tasks people are performing or wish to perform on their smartphones. Among the 125 tasks that we collected from 75 participants, we found that 59 of these tasks are not well supported currently. DoThisHere is helpful in completing 95% of these unsupported tasks. A user study, where users are shown the list of supported voice commands when performing a representative sample of such tasks, suggests that DoThisHere may reduce expert users’ cognitive load; the Query action, in particular, can help users reduce task completion time.
2020 · Jackie (Junrui) Yang et al. · Voice User Interface (VUI) Design · Knowledge Worker Tools & Workflows · UIST

Designing Ambient Narrative-Based Interfaces to Reflect and Motivate Physical Activity
Numerous technologies now exist for promoting more active lifestyles. However, while quantitative data representations (e.g., charts, graphs, and statistical reports) typify most health tools, growing evidence suggests such feedback can not only fail to motivate behavior but may also harm self-integrity and fuel negative mindsets about exercise. Our research seeks to devise alternative, more qualitative schemes for encoding personal information. In particular, this paper explores the design of data-driven narratives, given the intuitive and persuasive power of stories. We present WhoIsZuki, a smartphone application that visualizes physical activities and goals as components of a multi-chapter quest, where the main character's progress is tied to the user's. We report on our design process involving online surveys, in-lab studies, and in-the-wild deployments, aimed at refining the interface and the narrative and gaining a deep understanding of people's experiences with this type of feedback. From these insights, we contribute recommendations to guide future development of narrative-based applications for motivating healthy behavior.
2020 · Elizabeth L. Murnane et al. · Stanford University · Data Storytelling · Gamification Design · CHI

Soundr: Head Position and Orientation Prediction Using a Microphone Array
Although state-of-the-art smart speakers can hear a user's speech, unlike a human assistant these devices cannot figure out users' verbal references based on their head location and orientation. Soundr presents a novel interaction technique that leverages the built-in microphone array found in most smart speakers to infer the user's spatial location and head orientation using only their voice. With that extra information, Soundr can figure out users' references to objects, people, and locations based on the speakers' gaze, and also provide relative directions. To provide training data for our neural network, we collected 751 minutes of data (50x that of the best prior work) from human speakers leveraging a virtual reality headset to accurately provide head tracking ground truth. Our results achieve an average positional error of 0.31m and an orientation angle accuracy of 34.3° for each voice command. A user study to evaluate user preferences for controlling IoT appliances by talking at them found this new approach to be fast and easy to use.
2020 · Jackie (Junrui) Yang et al. · Stanford University · Eye Tracking & Gaze Interaction · Home Voice Assistant Experience · CHI

Adaptive Photographic Composition Guidance
Photographic composition is often taught as alignment with composition grids—most commonly, the rule of thirds. Professional photographers use more complex grids, like the harmonic armature, to achieve more diverse dynamic compositions. We are interested in understanding whether these complex grids are helpful to amateurs. In a formative study, we found that overlaying the harmonic armature in the camera can help less experienced photographers discover and achieve different compositions, but it can also be overwhelming due to the large number of lines. Photographers actually use subsets of lines from the armature to explain different aspects of composition. However, this occurs mainly offline to analyze existing images. We propose bringing this mental model into the camera—by adaptively highlighting relevant lines to the current scene and point of view. We describe a saliency-based algorithm for selecting these lines and present an evaluation of the system that shows that photographers found the proposed adaptive armatures helpful for capturing more well-composed images.
2020 · Jane L. E et al. · Stanford University · Photography & Image Processing · CHI

QuizBot: A Dialogue-based Adaptive Learning System for Factual Knowledge
Advances in conversational AI have the potential to enable more engaging and effective ways to teach factual knowledge. To investigate this hypothesis, we created QuizBot, a dialogue-based agent that helps students learn factual knowledge in science, safety, and English vocabulary. We evaluated QuizBot with 76 students through two within-subject studies against a flashcard app, the traditional medium for learning factual knowledge. Though both systems used the same algorithm for sequencing materials, QuizBot led to students recognizing (and recalling) over 20% more correct answers than when students used the flashcard app. Practicing with a conversational agent is more time-consuming, but in a second study, of their own volition, students spent 2.6x more time learning with QuizBot than with flashcards and reported preferring it strongly for casual learning. Our results in this second study showed QuizBot yielded improved learning gains over flashcards on recall. These results suggest that educational chatbot systems may have beneficial use, particularly for learning outside of traditional settings.
2019 · Sherry Ruan et al. · Stanford University · Conversational Chatbots · Intelligent Tutoring Systems & Learning Analytics · CHI

Poirot: A Web Inspector for Designers
To better understand the issues designers face as they interact with developers and use developer tools to create websites, we conducted a formative investigation consisting of interviews, a survey, and an analysis of professional design documents. Based on insights gained from these efforts, we developed Poirot, a web inspection tool for designers that enables them to make style edits to websites using a familiar graphical interface. We compared Poirot to Chrome DevTools in a lab study with 16 design professionals. We observed common problems designers experience when using Chrome DevTools and found that when using Poirot, designers were more successful in accomplishing typical design tasks (97% vs. 63%). In addition, we found that Poirot had a significantly lower perceived cognitive load and was overwhelmingly preferred by the designers in our study.
2019 · Kesler Tanner et al. · Stanford University · Prototyping & User Testing · CHI