DxHF: Providing High-Quality Human Feedback for LLM Alignment with Interactive Decomposition

Human preferences are widely used to align large language models (LLMs) through methods such as reinforcement learning from human feedback (RLHF). However, current user interfaces require annotators to compare text paragraphs, which is cognitively challenging when the texts are long or unfamiliar. This paper studies the decomposition principle as an approach to improving the quality of human feedback for LLM alignment. Instead of directly comparing two long-form text responses, this approach breaks the text down into individual claims. Based on the principle, we build a novel user interface, DxHF. It enhances the comparison process by showing decomposed claims, visually encoding the relevance of claims to the conversation, and linking similar claims. This allows users to skim through key information and identify differences for better and quicker judgment. Our technical evaluation shows evidence that decomposition generally improves feedback accuracy with respect to the ground truth, particularly for users with uncertainty. A crowdsourcing study with 160 participants indicates that using DxHF improves feedback accuracy by an average of 5%, although it increases the average feedback time by 18 seconds. Notably, accuracy is significantly higher in situations where users are less certain. The findings of the study highlight the potential of HCI as an effective means of improving human-AI alignment.

2025 · Danqing Shi et al. · UIST · Tags: Human-LLM Collaboration, Explainable AI (XAI)
Simulating Errors in Touchscreen Typing

Empirical evidence shows that typing on touchscreen devices is prone to errors and that correcting them poses a major detriment to users’ performance. Design of text entry systems that better serve users, across their broad capability range, necessitates understanding the cognitive mechanisms that underpin these errors. However, prior models of typing cover only motor slips. The paper reports on extending the scope of computational modeling of typing to cover the cognitive mechanisms behind the three main types of error: slips (inaccurate execution), lapses (forgetting), and mistakes (incorrect knowledge). Given a phrase, a keyboard, and user parameters, Typoist simulates eye and finger movements while making human-like insertion, omission, substitution, and transposition errors. Its main technical contribution is the formulation of a supervisory control problem wherein the controller allocates cognitive resources to detect and fix errors generated by the various mechanisms. The model generates predictions of typing performance that can inform the design of better text entry systems.

2025 · Danqing Shi et al. · Aalto University · CHI · Tags: Force Feedback & Pseudo-Haptic Weight, Computational Methods in HCI
No Evidence for LLMs Being Useful in Problem Reframing

Problem reframing is a designerly activity wherein alternative perspectives are created to recast what a stated design problem is about. Generating alternative problem frames is challenging because it requires devising novel and useful perspectives that fit the given problem context. Large language models (LLMs) could assist this activity via their generative capability. However, it is not clear whether they can help designers produce high-quality frames. Therefore, we asked if there are benefits to working with LLMs. To this end, we compared three ways of using LLMs (N=280): 1) free-form, 2) direct generation, and 3) a structured approach informed by a theory of reframing. We found that using LLMs does not help improve the quality of problem frames. In fact, it increases the competence gap between experienced and inexperienced designers. Also, inexperienced designers perceived lower agency when working with LLMs. We conclude that there is no benefit to using LLMs in problem reframing and discuss possible factors for this lack of effect.

2025 · Joongi Shin et al. · Aalto University · CHI · Tags: Human-LLM Collaboration, AI-Assisted Creative Writing
Chartist: Task-driven Eye Movement Control for Chart Reading

To design data visualizations that are easy to comprehend, we need to understand how people with different interests read them. Computational models that predict scanpaths on charts could complement empirical studies by offering inexpensive estimates of user performance; however, previous models have been limited to gaze patterns and overlooked the effects of tasks. Here, we contribute Chartist, a computational model that simulates how users move their eyes to extract information from a chart in order to perform analysis tasks, including value retrieval, filtering, and finding extremes. The novel contribution lies in a two-level hierarchical control architecture. At the high level, the model uses LLMs to comprehend the information gained so far and applies this representation to select a goal for the lower-level controllers, which, in turn, move the eyes in accordance with a sampling policy learned via reinforcement learning. The model is capable of predicting human-like task-driven scanpaths across various tasks. It can be applied in fields such as explainable AI, visualization design evaluation, and optimization. While it displays limitations in terms of generalizability and accuracy, it takes modeling in a promising direction, toward understanding human behaviors in interacting with charts.

2025 · Danqing Shi et al. · Aalto University · CHI · Tags: Interactive Data Visualization, Computational Methods in HCI
DesignQuizzer: A Community-Powered Conversational Agent for Learning Visual Design

Online design communities, where members exchange free-form views on others' designs, offer a space for beginners to learn visual design. However, the content of these communities is often unorganized for learners, containing many redundancies and irrelevant comments. In this paper, we propose a computational approach for leveraging online design communities to run a conversational agent that assists informal learning of visual elements (e.g., color and space). Our method extracts critiques, suggestions, and rationales on visual elements from comments. We present DesignQuizzer, which asks questions about visual design in UI examples and provides structured comment summaries. Two user studies demonstrate the engagement and usefulness of DesignQuizzer compared with the baseline (reading reddit.com/r/UI_design). We also showcase how effectively novices can apply what they learn with DesignQuizzer in a design critique task and a visual design task. We discuss how to use our approach with other communities and offer design considerations for community-powered learning support tools.

2024 · Zhenhui Peng et al. · CSCW · Session 3g: Collaborative Technologies: Empathy, Attribution, and Risk
SIM2VR: Towards Automated Biomechanical Testing in VR

Automated biomechanical testing has great potential for the development of VR applications, as initial insights into user behaviour can be gained in silico early in the design process. In particular, it allows prediction of user movements and ergonomic variables, such as fatigue, prior to conducting user studies. However, there is a fundamental disconnect between simulators hosting state-of-the-art biomechanical user models and simulators used to develop and run VR applications. Existing user simulators often struggle to capture the intricacies of real-world VR applications, reducing the ecological validity of user predictions. In this paper, we introduce SIM2VR, a system that aligns user simulation with a given VR application by establishing a continuous closed loop between the two processes. This, for the first time, enables training simulated users directly in the same VR application that real users interact with. We demonstrate that SIM2VR can predict differences in user performance, ergonomics and strategies in a fast-paced, dynamic arcade game. In order to expand the scope of automated biomechanical testing beyond simple visuomotor tasks, advances in cognitive models and reward function design will be needed.

2024 · Florian Fischer et al. · UIST · Tags: Human Pose & Activity Recognition, VR Medical Training & Rehabilitation
EyeFormer: Predicting Personalized Scanpaths with Transformer-Guided Reinforcement Learning

From a visual-perception perspective, modern graphical user interfaces (GUIs) comprise a complex graphics-rich two-dimensional visuospatial arrangement of text, images, and interactive objects such as buttons and menus. While existing models can accurately predict regions and objects that are likely to attract attention "on average", no scanpath model has been capable of predicting scanpaths for an individual. To close this gap, we introduce EyeFormer, which utilizes a Transformer architecture as a policy network to guide a deep reinforcement learning algorithm that predicts gaze locations. Our model offers the unique capability of producing personalized predictions when given a few user scanpath samples. It can predict full scanpath information, including fixation positions and durations, across individuals and various stimulus types. Additionally, we demonstrate applications in GUI layout optimization driven by our model.

2024 · Yue Jiang et al. · UIST · Tags: Eye Tracking & Gaze Interaction, Explainable AI (XAI), Participatory Design
Understanding Human-AI Workflows for Generating Personas

One barrier to deeper adoption of user-research methods is the amount of labor required to create high-quality representations of collected data. Trained user researchers need to analyze datasets and produce informative summaries pertaining to the original data. While Large Language Models (LLMs) could assist in generating summaries, they are known to hallucinate and produce biased responses. In this paper, we study human-AI workflows that delegate subtasks in user research between human experts and LLMs in different ways. Studying persona generation as our case, we found that LLMs are not good at capturing key characteristics of user data on their own. Better results are achieved when we leverage human skill in grouping user data by their key characteristics and exploit LLMs for summarizing pre-grouped data into personas. Personas generated via this collaborative approach can be more representative and empathy-evoking than ones generated by human experts or LLMs alone. We also found that LLMs could mimic generated personas and enable interaction with personas, thereby helping user researchers empathize with them. We conclude that LLMs, by facilitating the analysis of user data, may promote widespread application of qualitative methods in user research.

2024 · Joongi Shin et al. · DIS · Tags: Human-LLM Collaboration, User Research Methods (Interviews, Surveys, Observation)
Heads-Up Multitasker: Simulating Attention Switching On Optical Head-Mounted Displays

Optical Head-Mounted Displays (OHMDs) allow users to read digital content while walking. A better understanding of how users allocate attention between these two tasks is crucial for improving OHMD interfaces. This paper introduces a computational model for simulating users' attention switches between reading and walking. We model users' decision to deploy visual attention as a hierarchical reinforcement learning problem, wherein a supervisory controller optimizes attention allocation while considering both reading activity and walking safety. Our model simulates the control of eye movements and locomotion as an adaptation to the given task priority, design of digital content, and walking speed. The model replicates key multitasking behaviors during OHMD reading while walking, including attention switches, changes in reading and walking speeds, and reading resumptions.

2024 · Yunpeng Bai et al. · National University of Singapore · CHI · Tags: Head-Up Display (HUD) & Advanced Driver Assistance Systems (ADAS), Eye Tracking & Gaze Interaction
CRTypist: Simulating Touchscreen Typing Behavior via Computational Rationality

Touchscreen typing requires coordinating the fingers and visual attention for button-pressing, proofreading, and error correction. Computational models need to account for the associated fast pace, coordination issues, and closed-loop nature of this control problem, which is further complicated by the immense variety of keyboards and users. The paper introduces CRTypist, which generates human-like typing behavior. Its key feature is a reformulation of the supervisory control problem, with the visual attention and motor system being controlled with reference to a working memory representation tracking the text typed thus far. Movement policy is assumed to asymptotically approach optimal performance in line with cognitive and design-related bounds. This flexible model works directly from pixels, without requiring hand-crafted feature engineering for keyboards. It aligns with human data in terms of movements and performance, covers individual differences, and can generalize to diverse keyboard designs. Though limited to skilled typists, the model generates useful estimates of the typing performance achievable under various conditions.

2024 · Danqing Shi et al. · Aalto University · CHI · Tags: Knowledge Worker Tools & Workflows, Computational Methods in HCI
Supporting Task Switching with Reinforcement Learning

Attention management systems aim to mitigate the negative effects of multitasking. However, sophisticated real-time attention management is yet to be developed. We present a novel concept for attention management with reinforcement learning that automatically switches tasks. The system was trained with a user model based on principles of computational rationality. Due to this user model, the system derives a policy that schedules task switches by considering human constraints such as visual limitations and reaction times. We evaluated its capabilities in a challenging dual-task balancing game. Our results confirm our main hypothesis that an attention management system based on reinforcement learning can significantly improve human performance, compared to humans’ self-determined interruption strategy. The system raised the frequency and difficulty of task switches compared to the users while still yielding a lower subjective workload. We conclude by arguing that the concept can be applied to a great variety of multitasking settings.

2024 · Alexander Lingler et al. · University of Applied Sciences Upper Austria · CHI · Tags: Privacy by Design & User Control, Notification & Interruption Management
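The core mechanism described above, a reinforcement-learning agent that learns when to switch attention between competing tasks, can be illustrated with a toy sketch. This is not the paper's system: the dual-task "health" dynamics, the grid size, and the Q-learning hyperparameters below are all invented for illustration. The attended task recovers while the neglected one degrades, and tabular Q-learning discovers a switching schedule that keeps both alive.

```python
import numpy as np

rng = np.random.default_rng(1)

DECAY, RECOVER, LEVELS = 1, 2, 6  # toy health dynamics on a small grid

def step(state, action):
    """Attend to task `action` (0 or 1): it recovers, the other decays."""
    h = list(state)
    h[action] = min(LEVELS - 1, h[action] + RECOVER)
    h[1 - action] = max(0, h[1 - action] - DECAY)
    reward = 1.0 if min(h) > 0 else -1.0  # both tasks must stay alive
    return tuple(h), reward

# Tabular Q-learning over (health of task 0, health of task 1) -> action.
Q = np.zeros((LEVELS, LEVELS, 2))
for _ in range(20000):
    state = (int(rng.integers(1, LEVELS)), int(rng.integers(1, LEVELS)))
    for _ in range(30):  # short episodes
        # Epsilon-greedy action selection.
        a = int(rng.integers(2)) if rng.random() < 0.2 else int(np.argmax(Q[state]))
        nxt, r = step(state, a)
        Q[state][a] += 0.1 * (r + 0.9 * np.max(Q[nxt]) - Q[state][a])
        state = nxt

def policy(state):
    """Greedy learned policy: which task to attend in this state."""
    return int(np.argmax(Q[state]))
```

The learned policy attends to whichever task is closer to failing, which mirrors the scheduling behavior the abstract describes, if not its richer model of visual limitations and reaction times.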
Real-time 3D Target Inference via Biomechanical Simulation

Selecting a target in a 3D environment is often challenging, especially with small/distant targets or when sensor noise is high. To facilitate selection, target-inference methods must be accurate, fast, and account for noise and motor variability. However, traditional data-free approaches fall short in accuracy since they ignore variability. While data-driven solutions achieve higher accuracy, they rely on extensive human datasets, which makes them costly and time-consuming, and they transfer poorly. In this paper, we propose a novel approach that leverages biomechanical simulation to produce synthetic motion data, capturing a variety of movement-related factors, such as limb configurations and motor noise. Then, an inference model is trained with only the simulated data. Our simulation-based approach improves transfer and lowers cost; variety-rich data can be produced in large quantities for different scenarios. We empirically demonstrate that our method matches the accuracy of human-data-driven approaches using data from seven users. When deployed, the method accurately infers intended targets in challenging 3D pointing conditions within 5–10 milliseconds, reducing users' target-selection error by 71% and completion time by 35%.

2024 · Hee-Seung Moon et al. · Aalto University · CHI · Tags: Full-Body Interaction & Embodied Input, Human Pose & Activity Recognition, Computational Methods in HCI
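The simulate-then-train recipe above can be sketched in miniature: generate synthetic endpoints from a simulator instead of collecting human data, fit an inference model to the synthetic data only, and query it at selection time. Everything concrete here, the 2D target layout, the Gaussian motor-noise stand-in for a biomechanical simulator, and the likelihood model, is an assumption for illustration, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(2)

TARGETS = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])  # 3 targets in 2D
MOTOR_SD = 0.15  # assumed endpoint noise of the toy "biomechanics"

# 1) Produce synthetic endpoints per target from the simulator (no human data).
sim_endpoints = {
    i: t + rng.normal(0.0, MOTOR_SD, size=(500, 2)) for i, t in enumerate(TARGETS)
}

# 2) Fit a Gaussian likelihood per target from the simulated data only.
models = {
    i: (pts.mean(axis=0), float(pts.std(axis=0).mean()))
    for i, pts in sim_endpoints.items()
}

def infer_target(endpoint):
    """Maximum-likelihood intended target given one observed endpoint."""
    scores = [
        -np.sum((endpoint - mu) ** 2) / (2.0 * sd**2) for mu, sd in models.values()
    ]
    return int(np.argmax(scores))

observed = TARGETS[2] + np.array([0.05, -0.08])  # a noisy click near target 2
```

A single inference is one likelihood evaluation per target, which is why this style of model can run within the millisecond budgets the abstract reports.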
Palette, Purpose, Prototype: The Three Ps of Color Design and How Designers Navigate Them

This paper contributes to understanding of a fundamental process in design: choosing colors. While much has been written on color theory and about general design processes, understanding of designers’ actual color-design practice and experiences remains patchy. To address this gap, this paper presents qualitative findings from an interview-based study with 12 designers and, on their basis, a conceptual framework of three interlinked color design spaces: purpose, palette, and prototype. Respectively, these represent a meaning the colors should deliver, a proposed set of colors fitting this purpose, and a possible allocation of these colors to a candidate design. Through a detailed report on how designers iteratively navigate these spaces, the findings offer a rich account of color-design practice and point to possible design benefits from computational tools that integrate considerations of all three.

2024 · Lena Hegemann et al. · Aalto University · CHI · Tags: 360° Video & Panoramic Content, Graphic Design & Typography Tools
Graph4GUI: Graph Neural Networks for Representing Graphical User Interfaces

Present-day graphical user interfaces (GUIs) exhibit diverse arrangements of text, graphics, and interactive elements such as buttons and menus, but representations of GUIs have not kept up. They do not encapsulate both semantic and visuo-spatial relationships among elements. To seize machine learning's potential for GUIs more efficiently, Graph4GUI exploits graph neural networks to capture individual elements' properties and their semantic-visuo-spatial constraints in a layout. The learned representation demonstrated its effectiveness in multiple tasks, especially generating designs in a challenging GUI autocompletion task, which involved predicting the positions of remaining unplaced elements in a partially completed GUI. The new model's suggestions showed alignment and visual appeal superior to the baseline method and received higher subjective ratings for preference. Furthermore, we demonstrate the practical benefits and efficiency advantages designers perceive when utilizing our model as an autocompletion plug-in.

2024 · Yue Jiang et al. · Aalto University · CHI · Tags: 360° Video & Panoramic Content, Computational Methods in HCI
Amortized Inference with User Simulations

There have been significant advances in simulation models predicting human behavior across various interactive tasks. One issue remains, however: identifying the parameter values that best describe an individual user. These parameters often express personal cognitive and physiological characteristics, and inferring their exact values has significant effects on individual-level predictions. Still, the high complexity of simulation models usually causes parameter inference to consume prohibitively large amounts of time, as much as days per user. We investigated amortized inference for its potential to reduce inference time dramatically, to mere tens of milliseconds. Its principle is to pre-train a neural proxy model for probabilistic inference, using synthetic data simulated from a range of parameter combinations. From examining the efficiency and prediction performance of amortized inference in three challenging cases that involve real-world data (menu search, point-and-click, and touchscreen typing), the paper demonstrates that an amortized inference approach permits analyzing large-scale datasets by means of simulation models. It also addresses emerging opportunities and challenges in applying amortized inference in HCI.

2023 · Hee-Seung Moon et al. · Yonsei University, Aalto University · CHI · Tags: Chronic Disease Self-Management (Diabetes, Hypertension, etc.), Knowledge Worker Tools & Workflows
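The amortization principle described above, paying the simulation cost once up front so that per-user inference becomes near-instant, can be sketched with a deliberately tiny example. The one-parameter "user model" (a motor-noise parameter behind scattered movement endpoints), the summary statistics, and the linear proxy in place of a neural network are all assumptions made for brevity; the paper's cases are far richer.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_user(noise_sd, n_trials=50):
    """Toy user simulator: movement endpoints scatter around the target
    with motor noise `noise_sd`. Returns summary statistics of behavior."""
    endpoints = rng.normal(loc=0.0, scale=noise_sd, size=n_trials)
    return np.array([np.std(endpoints), np.mean(np.abs(endpoints))])

# 1) Amortization phase: run the (expensive) simulator across many
#    parameter settings once, before any user data arrives.
params = rng.uniform(0.1, 2.0, size=2000)
stats = np.stack([simulate_user(p) for p in params])

# 2) Fit a proxy model (here: linear least squares) from stats -> parameter.
X = np.column_stack([stats, np.ones(len(stats))])
w, *_ = np.linalg.lstsq(X, params, rcond=None)

def infer(observed_stats):
    """Amortized inference: a single dot product per user."""
    return float(np.append(observed_stats, 1.0) @ w)

# Inferring the parameter of a new "user" no longer needs the simulator.
true_noise = 0.8
estimate = infer(simulate_user(true_noise, n_trials=500))
```

The expensive loop over the simulator happens once; afterwards, each user costs one forward pass of the proxy, which is what makes the tens-of-milliseconds figure in the abstract plausible.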
UEyes: Understanding Visual Saliency across User Interface Types

While user interfaces (UIs) display elements such as images and text in a grid-based layout, UI types differ significantly in the number of elements and how they are displayed. For example, webpage designs rely heavily on images and text, whereas desktop UIs tend to feature numerous small images. To examine how such differences affect the way users look at UIs, we collected and analyzed a large eye-tracking-based dataset, UEyes (62 participants and 1,980 UI screenshots), covering four major UI types: webpage, desktop UI, mobile UI, and poster. We analyze differences in biases related to factors such as color, location, and gaze direction. We also compare state-of-the-art predictive models and propose improvements for better capturing typical tendencies across UI types. Both the dataset and the models are publicly available.

2023 · Yue Jiang et al. · Aalto University · CHI · Tags: Eye Tracking & Gaze Interaction, Visualization Perception & Cognition
Modeling Touch-based Menu Selection Performance of Blind Users via Reinforcement Learning

Although menu selection has been extensively studied in HCI, most existing studies have focused on sighted users, leaving blind users' menu selection under-studied. In this paper, we propose a computational model that can simulate blind users’ menu selection performance and strategies, including the way they use techniques like swiping, gliding, and direct touch. We assume that selection behavior emerges as an adaptation to the user's memory of item positions, based on experience and feedback from the screen reader. A key component of our model is a long-term memory model that predicts how a user recalls and forgets item positions based on previous menu selections. We compare simulation results predicted by our model against data obtained in an empirical study with ten blind users. The model correctly simulated the effect of the menu length and menu arrangement on selection time, the action composition, and the menu selection strategy of the users.

2023 · Zhi Li et al. · Stony Brook University · CHI · Tags: Visual Impairment Technologies (Screen Readers, Tactile Graphics, Braille)
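A long-term memory model of the kind the abstract mentions, predicting recall and forgetting of item positions from past selections, is commonly built on ACT-R-style base-level activation. The sketch below is that generic textbook form, not the paper's specific formulation, and the decay, threshold, and noise values are arbitrary assumptions: items selected recently and often have higher activation and hence a higher recall probability.

```python
import math

def activation(lags, decay=0.5):
    """ACT-R-style base-level activation: each past selection of an item
    contributes t^(-decay), where t is the time since that selection (s)."""
    return math.log(sum(t ** -decay for t in lags))

def p_recall(lags, threshold=0.0, noise=0.3):
    """Probability of recalling the item's position, as a logistic
    function of how far activation exceeds the retrieval threshold."""
    return 1.0 / (1.0 + math.exp(-(activation(lags) - threshold) / noise))
```

In a simulation loop, a failed recall would force the simulated user to fall back on slower exploratory strategies (e.g., gliding through the menu), which is how memory strength couples to selection time.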
AUIT – the Adaptive User Interfaces Toolkit for Designing XR Applications

Adaptive user interfaces can improve experiences in Extended Reality (XR) applications by adapting interface elements according to the user's context. Although extensive work explores different adaptation policies, XR creators often struggle with their implementation, which involves laborious manual scripting. The few available tools are underdeveloped for realistic XR settings where it is often necessary to consider conflicting aspects that affect an adaptation. We fill this gap by presenting AUIT, a toolkit that facilitates the design of optimization-based adaptation policies. AUIT allows creators to flexibly combine policies that address common objectives in XR applications, such as element reachability, visibility, and consistency. Instead of using rules or scripts, specifying adaptation policies via adaptation objectives simplifies the design process and enables creative exploration of adaptations. After creators decide which adaptation objectives to use, a multi-objective solver finds appropriate adaptations in real-time. A study showed that AUIT allowed creators of XR applications to quickly and easily create high-quality adaptations.

2022 · João Marcelo Evangelista Belo et al. · UIST · Tags: AR Navigation & Context Awareness, Mixed Reality Workspaces
Chatbots Facilitating Consensus-building in Asynchronous Co-Design

Consensus-building is an essential process for the success of co-design projects. To build consensus, stakeholders need to discuss conflicting needs and viewpoints, converge their ideas toward shared interests, and grow their willingness to commit to group decisions. However, managing group discussions is challenging in large co-design projects with multiple stakeholders. In this paper, we investigate the interaction design of a chatbot that can mediate consensus-building conversationally. By interacting with individual stakeholders, the chatbot collects ideas for satisfying conflicting needs and engages stakeholders to consider others' viewpoints, without having stakeholders directly interact with each other. Results from an empirical study in an educational setting (N = 12) suggest that the approach can increase stakeholders' commitment to group decisions and maintain this effect even for group decisions that conflict with personal interests. We conclude that chatbots can facilitate consensus-building in small-to-medium-sized projects, but more work is needed to scale up to larger projects.

2022 · Joongi Shin et al. · UIST · Tags: Conversational Chatbots
Breathing Life Into Biomechanical User Models

Forward biomechanical simulation in HCI holds great promise as a tool for evaluation, design, and engineering of user interfaces. Although reinforcement learning (RL) has been used to simulate biomechanics in interaction, prior work has relied on unrealistic assumptions about the control problem involved, which limits the plausibility of emerging policies. These assumptions include direct torque actuation as opposed to muscle-based control; direct, privileged access to the external environment, instead of imperfect sensory observations; and lack of interaction with physical input devices. In this paper, we present a new approach for learning muscle-actuated control policies based on perceptual feedback in interaction tasks with physical input devices. This allows modelling of more realistic interaction tasks with cognitively plausible visuomotor control. We show that our simulated user model successfully learns a variety of tasks representing different interaction methods, and that the model exhibits characteristic movement regularities observed in studies of pointing. We provide an open-source implementation which can be extended with further biomechanical models, perception models, and interactive environments.

2022 · Aleksi Ikkala et al. · UIST · Tags: Human Pose & Activity Recognition, Computational Methods in HCI