Ability Heuristics for Conducting Accessibility Inspections
The accessibility of interactive technologies is often evaluated using checklists that are low-level, numerous, and platform-specific. Such checklists are typically used by accessibility experts, leaving everyday designers and developers with little support for assessing their own interfaces. To make accessibility evaluations easier to conduct, we devised a set of nine "ability heuristics" that prompt designers to engage with accessibility throughout the design process. We empirically evaluated these ability heuristics with 37 design students, comparing them to usability heuristics and WCAG. The ability heuristics emphasized the quality of accessibility features compared to the other methods, and surfaced issues that were more broadly dispersed across disability groups. Further, the students found the heuristics were as easy to use as the alternative methods. We argue that the heuristics help to move beyond binary notions of accessibility, pushing designers to consider the quality of features across diverse disabilities and the range of abilities within.
2026 | Claire L. Mitchell et al. | University of Washington | Universal & Inclusive Design; Cognitive Impairment & Neurodiversity (Autism, ADHD, Dyslexia); Participatory Design | CHI
Quantifying the Novelty Bias when Evaluating Interactive Prototypes
Experiments in human-computer interaction (HCI) often evaluate whether a prototype is “better,” but novelty alone can affect users’ judgments and possibly performance. To quantify this effect, we conducted a within-subjects study of 48 participants comparing four pairs of functionally identical prototypes (mice, keyboards, search engines, and AI chatbots). Each pair differed only in cosmetic features and a label marking one as “old” and the other as “new.” Novelty labeling shifted preference: up to 77% favored the version labeled “new.” Subjective ratings for the search engine increased under the “new” label by up to 7.1%. For the AI chatbot, ratings were driven by preference, with the preferred version rated up to 11.6% higher than the unpreferred one. Performance differences were modest and emerged for errors (e.g., 9.7% fewer misses with the “new” mouse, up to 7.2% lower error rates with the “new” keyboard). Technology readiness predicted baseline skill and occasionally moderated performance but did not protect judgments from novelty bias. These results show that novelty labeling reframes interpretation and preference more than performance, raising concerns for HCI evaluations relying on participant judgments.
2026 | Yumeng Ma et al. | University of Washington | User Research Methods (Interviews, Surveys, Observation); Prototyping & User Testing; Research Ethics & Open Science | CHI
A Framework for Adapting In-Car Touchscreen Interfaces to Driver Behaviors, Perception, and Cognition
Although in-car touchscreens expand interaction possibilities, they risk compromising driver safety and vigilance. We propose a data- and expert-informed framework for designing adaptive touchscreens that respond to a driver’s usage profile and cognitive state, maximizing usability while mitigating safety risks. First, in a driving simulator study, we find that cognitive load slows touchscreen button selections by 20% and produces shorter, more frequent off-road glances. We also find that enlarging buttons improves selection speeds by 0.3 seconds but at the cost of requiring more display pages. Next, these findings informed a co-design session with expert in-cabin designers, generating guidelines for adaptive interfaces that balance usability and safety. These guidelines form the basis of our Profile-State Adaptive (PSA) framework, which integrates driver profiles with cognitive states to guide interface adaptations. We then extend the framework to include a quantitative Time-Cost model as well as design patterns for adaptive layouts across usage profiles and cognitive demands.
2026 | Seokhyun Hwang et al. | University of Washington | Automated Driving Interface & Takeover Design; Head-Up Display (HUD) & Advanced Driver Assistance Systems (ADAS); In-Vehicle Haptic, Audio & Multimodal Feedback | CHI
TaskAudit: Detecting Functiona11ity Errors in Mobile Apps via Agentic Task Execution
Accessibility checkers are tools in support of accessible app development, and their use is encouraged by accessibility best practices. However, most current checkers evaluate static or mechanically-generated contexts, failing to capture common accessibility errors impacting mobile app functionality. In this work, we define functiona11ity errors as accessibility barriers that only manifest through interaction (i.e., named according to a blend of “functionality” and “accessibility”). We introduce TaskAudit, which comprises three components: a Task Generator that constructs interactive tasks from app screens, a Task Executor that uses agents with a screen reader proxy to perform these tasks, and an Accessibility Analyzer that detects and reports accessibility errors by examining interaction traces. Our evaluation on real-world apps shows that TaskAudit detects 48 functiona11ity errors from 54 app screens, compared to between 4 and 20 with existing checkers. Our analysis demonstrates common error patterns that TaskAudit can detect in addition to those from prior work, including label-functionality mismatch, cluttered navigation, and inappropriate feedback.
2026 | Mingyuan Zhong et al. | University of Washington | Voice Accessibility; Mobile Accessibility Design; Privacy & Data Ownership in Self-Tracking | CHI
Supporting Mobile Reading While Walking with Automatic and Customized Font Size Adaptations
The pervasive use of mobile devices for information consumption makes reading on-the-go an unavoidable daily occurrence, whereby walking creates a natural situational impairment for reading. In this work, we quantify the impact of walking on reading performance and compare automatic system adaptations with user customizations for mitigating these impacts. We collected user interactions and mobile sensor data of reading while walking in a controlled lab study with 45 participants. We found that automatic font size adjustment by viewing distance mitigated the performance degradation from walking, yielding faster reading speed and increased comfort. Furthermore, exposure to the automatic adaptation functionality influences user customization behavior and preferences for reading while walking. We discuss implications and provide design suggestions for personalizing interfaces when reading on-the-go, including blending system recommendation with user customization, offering multiple points of customization through appropriately-timed prompts, and refining recommendations based on observed preferences.
2025 | Junhan Kong et al. | University of Washington | Voice User Interface (VUI) Design; Context-Aware Computing | CHI
A11yBoard: Making Digital Artboards Accessible to Blind and Low-Vision Users
Digital artboards, which hold objects rather than pixels (e.g., Microsoft PowerPoint and Google Slides), remain largely inaccessible for blind and low-vision (BLV) users. Building on prior findings about the experiences of BLV users with digital artboards, we present a novel tool called A11yBoard, an interactive multimodal system that makes interpreting and authoring digital artboards accessible. A11yBoard combines a web-based drawing canvas paired with a mobile touch screen device such as a tablet. The mobile device displays the same canvas and enables risk-free spatial exploration of the artboard via touch and gesture. Speech recognition, non-speech audio, and keyboard-based commands are also used for input and output. Through a series of pilot studies and formal task-based user studies with BLV participants, we show that A11yBoard provides (1) intuitive spatial reasoning about two-dimensional objects, (2) multimodal access to objects’ properties and relationships, and (3) eyes-free creating and editing of objects to establish their desired properties and positions.
2023 | Zhuohao Zhang et al. | University of Washington | Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille); Visualization Perception & Cognition | CHI
VoxLens: Making Online Data Visualizations Accessible with an Interactive JavaScript Plug-In
JavaScript visualization libraries are widely used to create online data visualizations but provide limited access to their information for screen-reader users. Building on prior findings about the experiences of screen-reader users with online data visualizations, we present VoxLens, an open-source JavaScript plug-in that, with a single line of code, improves the accessibility of online data visualizations for screen-reader users using a multi-modal approach. Specifically, VoxLens enables screen-reader users to obtain a holistic summary of presented information, play sonified versions of the data, and interact with visualizations in a "drill-down" manner using voice-activated commands. Through task-based experiments with 21 screen-reader users, we show that VoxLens improves the accuracy of information extraction and interaction time by 122% and 36%, respectively, over existing conventional interaction with online data visualizations. Our interviews with screen-reader users suggest that VoxLens is a "game-changer" in making online data visualizations accessible to screen-reader users, saving them time and effort.
2022 | Ather Sharif et al. | University of Washington | Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille) | CHI
TypeAnywhere: A QWERTY-Based Text Entry Solution for Ubiquitous Computing
We present a QWERTY-based text entry system, TypeAnywhere, for use in off-desktop computing environments. Using a wearable device that can detect finger taps, users can leverage their touch-typing skills from physical keyboards to perform text entry on any surface. TypeAnywhere decodes typing sequences based only on finger-tap sequences without relying on tap locations. To achieve optimal decoding performance, we trained a neural language model and achieved a 1.6% character error rate (CER) in an offline evaluation, compared to a 5.3% CER from a traditional n-gram language model. Our user study showed that participants achieved an average performance of 70.6 WPM, or 80.4% of their physical keyboard speed, and 1.50% CER after 2.5 hours of practice over five days on a table surface. They also achieved 43.9 WPM and 1.37% CER when typing on their laps. Our results demonstrate the strong potential of QWERTY typing as a ubiquitous text entry solution.
2022 | Mingrui Ray Zhang et al. | University of Washington | Haptic Wearables; Voice User Interface (VUI) Design; Intelligent Voice Assistants (Alexa, Siri, etc.) | CHI
A Large-Scale Longitudinal Analysis of Missing Label Accessibility Failures in Android Apps
We present the first large-scale longitudinal analysis of missing label accessibility failures in Android apps. We developed a crawler and collected monthly snapshots of 312 apps over 16 months. We use this unique dataset in empirical examinations of accessibility not possible in prior datasets. Key large-scale findings include missing label failures in 55.6% of unique image-based elements, longitudinal improvement in ImageButton elements but not in more prevalent ImageView elements, that 8.8% of unique screens are unreachable without navigating at least one missing label failure, that app failure rate does not improve with number of downloads, and that effective labeling is neither limited to nor guaranteed by large software organizations. We then examine longitudinal data in individual apps, presenting illustrative examples of accessibility impacts of systematic improvements, incomplete improvements, interface redesigns, and accessibility regressions. We discuss these findings and potential opportunities for tools and practices to improve label-based accessibility.
2022 | Raymond Fok et al. | University of Washington | Voice Accessibility; Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille); Universal & Inclusive Design | CHI
Understanding Blind Screen-Reader Users' Experiences of Digital Artboards
Two-dimensional canvases are the core components of many digital productivity and creativity tools, with "artboards" containing objects rather than pixels. Unfortunately, the contents of artboards remain largely inaccessible to blind users relying on screen-readers, but the precise problems are not well understood. This study sought to understand how blind screen-reader users interact with artboards. Specifically, we conducted contextual interviews, observations, and task-based usability studies with 15 blind participants to understand their experiences of artboards found in Microsoft PowerPoint, Apple Keynote, and Google Slides. Participants expressed that the inaccessibility of these artboards contributes to significant educational and professional barriers. We found that the key problems faced were: (1) high cognitive loads from a lack of feedback about artboard contents and object state; (2) difficulty determining relationships among artboard objects; and (3) constant uncertainty about whether object manipulations were successful. We offer design remedies that improve feedback for object state, relationships, and manipulations.
2021 | Anastasia Schaadhardt et al. | University of Washington | Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille) | CHI
“I Am Iron Man”: Priming Improves the Learnability and Memorability of User-Elicited Gestures
Priming is used as a way of increasing the diversity of proposals in end-user elicitation studies, but priming has not been investigated thoroughly in this context. We conduct a distributed end-user elicitation study with 167 participants, which had three priming groups: a no-priming control group, a sci-fi priming group, and a creative mindset group. We evaluated the gestures proposed by these groups in a distributed learnability and memorability study with 18 participants. We found that the user-elicited gestures from the sci-fi group were significantly faster to learn, requiring an average of 1.22 viewings to learn compared to 1.60 viewings required to learn the control gestures, and 1.56 viewings to learn the gestures elicited from the creative mindset group. In addition, both primed gesture groups had higher memorability: 80% of the sci-fi-primed gestures and 73% of the creative mindset group gestures were recalled correctly after one week without practice, compared to 43% of the control group gestures.
2021 | Abdullah Ali et al. | University of Washington | Hand Gesture Recognition; User Research Methods (Interviews, Surveys, Observation) | CHI
JustCorrect: Intelligent Post Hoc Text Correction Techniques on Smartphones
Correcting errors in entered text is a common task but usually difficult to perform on mobile devices due to tedious cursor navigation steps. In this paper, we present JustCorrect, an intelligent post hoc text correction technique for smartphones. To make a correction, the user simply types the correct text at the end of their current input, and JustCorrect will automatically detect the error and apply the correction in the form of an insertion or a substitution. In this way, manual navigation steps are bypassed, and the correction can be committed with a single tap. We solved two critical problems to support JustCorrect: (1) Correction Algorithm: we propose an algorithm that infers the user’s correction intention from the last typed word. (2) Input Modalities: our study revealed that both tap and gesture were suitable input modalities for performing JustCorrect. Based on our findings, we integrated JustCorrect into a soft keyboard. Our user studies show that using JustCorrect reduces the text correction time by 12.8% over the stock Android keyboard and by 9.7% over the "Type, then Correct" text correction technique by Zhang et al. (2019). Overall, JustCorrect complements existing post hoc text correction techniques, making error correction more automatic and intelligent.
2020 | Wenzhe Cui et al. | Voice User Interface (VUI) Design | UIST
A Systematic Review of Gesture Elicitation Studies: What Can We Learn from 216 Studies?
Gesture elicitation studies represent a popular and resourceful method in HCI to inform the design of intuitive gesture commands, reflective of end-users’ behavior, for controlling all kinds of interactive devices, applications, and systems. In the last ten years, an impressive body of work has been published on this topic, disseminating useful design knowledge regarding users’ preferences for finger, hand, wrist, arm, head, leg, foot, and whole-body gestures. In this paper, we deliver a systematic literature review of this large body of work by summarizing the characteristics and findings of N=216 gesture elicitation studies subsuming 5,458 participants, 3,625 referents, and 148,340 elicited gestures. We highlight the descriptive, comparative, and generative virtues of our examination to provide practitioners with an effective method to (i) understand how new gesture elicitation studies position in the literature; (ii) compare studies from different authors; and (iii) identify opportunities for new research. We make our large corpus of papers accessible online as a Zotero group library at https://www.zotero.org/groups/2132650/gesture_elicitation_studies.
2020 | Santiago Villarreal et al. | Hand Gesture Recognition; Full-Body Interaction & Embodied Input; Prototyping & User Testing | DIS
Crowdlicit: A System for Conducting Distributed End-User Elicitation and Identification Studies
End-user elicitation studies are a popular design method. Currently, such studies are usually confined to a lab, limiting the number and diversity of participants, and therefore the representativeness of their results. Furthermore, the quality of the results from such studies generally lacks any formal means of evaluation. In this paper, we address some of the limitations of elicitation studies through the creation of the Crowdlicit system along with the introduction of end-user identification studies, which are the reverse of elicitation studies. Crowdlicit is a new web-based system that enables researchers to conduct online and in-lab elicitation and identification studies. We used Crowdlicit to run a crowd-powered elicitation study based on Morris's "Web on the Wall" study (2012) with 78 participants, arriving at a set of symbols that included six new symbols different from Morris's. We evaluated the effectiveness of 49 symbols (43 from Morris and six from Crowdlicit) by conducting a crowd-powered identification study. We show that the Crowdlicit elicitation study resulted in a set of symbols that was significantly more identifiable than Morris's.
2019 | Abdullah X. Ali et al. | University of Washington | Crowdsourcing Task Design & Quality Control; User Research Methods (Interviews, Surveys, Observation) | CHI
Cluster Touch: Improving Touch Accuracy on Smartphones for People with Motor and Situational Impairments
We present Cluster Touch, a combined user-independent and user-specific touch offset model that improves the accuracy of touch input on smartphones for people with motor impairments, and for people experiencing situational impairments while walking. Cluster Touch combines touch examples from multiple users to create a shared user-independent touch model, which is then updated with touch examples provided by an individual user to make it user-specific. Owing to this combination, Cluster Touch allows people to quickly improve the accuracy of their smartphones by providing only 20 touch examples. In a user study with 12 people with motor impairments and 12 people without motor impairments, but who were walking, Cluster Touch improved touch accuracy by 14.65% for the former group and 6.81% for the latter group over the native touch sensor. Furthermore, in an offline analysis of existing mobile interfaces, Cluster Touch improved touch accuracy by 8.21% and 4.84% over the native touch sensor for the two user groups, respectively.
2019 | Martez E. Mott et al. | University of Washington | Motor Impairment Assistive Input Technologies | CHI
Crowdsourcing Similarity Judgments for Agreement Analysis in End-User Elicitation Studies
End-user elicitation studies are a popular design method, but their data require substantial time and effort to analyze. In this paper, we present Crowdsensus, a crowd-powered tool that enables researchers to efficiently analyze the results of elicitation studies using subjective human judgment and automatic clustering algorithms. In addition to our own analysis, we asked six expert researchers with experience running and analyzing elicitation studies to analyze an end-user elicitation dataset of 10 functions for operating a web-browser, each with 43 voice commands elicited from end-users for a total of 430 voice commands. We used Crowdsensus to gather similarity judgments of these same 430 commands from 410 online crowd workers. The crowd outperformed the experts by arriving at the same results for seven of eight functions and resolving a function where the experts failed to agree. Also, using Crowdsensus was about four times faster than using experts.
2018 | Abdullah X. Ali et al. | Crowdsourcing Task Design & Quality Control; User Research Methods (Interviews, Surveys, Observation); Prototyping & User Testing | UIST