Efficient Visual Appearance Optimization by Learning from Prior PreferencesAdjusting visual parameters such as brightness and contrast is common in our everyday experiences. Finding the optimal parameter setting is challenging due to the large search space and the lack of an explicit objective function, leaving users to rely solely on their implicit preferences. Prior work has explored Preferential Bayesian Optimization (PBO) to address this challenge, asking users to iteratively select preferred designs from candidate sets. However, PBO often requires many rounds of preference comparisons, making it more suitable for designers than everyday end-users. We propose Meta-PO, a novel method that integrates PBO with meta-learning to improve sample efficiency. Specifically, Meta-PO infers prior users' preferences and stores them as models, which are leveraged to intelligently suggest design candidates for the new users, enabling faster convergence and more personalized results. An experimental evaluation of our method for appearance design tasks on 2D and 3D content showed that participants achieved satisfactory appearance in 5.86 iterations using Meta-PO when they shared similar goals with a population (e.g., tuning for a "warm" look) and in 8 iterations even when generalizing across divergent goals (e.g., from "vintage" and "warm" to "holiday"). Meta-PO makes personalized visual optimization more applicable to end-users through a generalizable, more efficient optimization conditioned on preferences, with the potential to scale interface personalization more broadly.2025ZLZhipeng Li et al.Explainable AI (XAI)AI-Assisted Decision-Making & AutomationUIST
Redefining Affordance via Computational RationalityAffordances, a foundational concept in human-computer interaction and design, have traditionally been explained by direct-perception theories, which assume that individuals perceive action possibilities directly from the environment. However, these theories fall short of explaining how affordances are perceived, learned, refined, or misperceived, and how users choose between multiple affordances in dynamic contexts. This paper introduces a novel affordance theory grounded in Computational Rationality, positing that humans construct internal representations of the world based on bounded sensory inputs. Within these internal models, affordances are inferred through two core mechanisms: feature recognition and hypothetical motion trajectories. Our theory redefines affordance perception as a decision-making process, driven by two components: confidence (the perceived likelihood of successfully executing an action) and predicted utility (the expected value of the outcome). By balancing these factors, individuals make informed decisions about which actions to take. Our theory frames affordance perception as dynamic, continuously learned, and refined through reinforcement and feedback. We validate the theory via thought experiments and demonstrate its applicability across diverse types of affordances (e.g., physical, digital, social). Beyond clarifying and generalizing the understanding of affordances across contexts, our theory serves as a foundation for improving design communication and guiding the development of more adaptive and intuitive systems that evolve with user capabilities.2025YLYi-Chi Liao et al.Explainable AI (XAI)Privacy by Design & User ControlUser Research Methods (Interviews, Surveys, Observation)IUI
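The Personality Dimensions GPT-3 Expresses During Human-Chatbot InteractionsKovacevic et al. analyze the personality traits that GPT-3 exhibits in conversation, revealing the behavioral patterns and personality expression of large language models in social interactions.2024NKNikola Kovacevic et al.Agent Personality & AnthropomorphismUbiComp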
Chatbots With Attitude: Enhancing Chatbot Interactions Through Dynamic Personality InfusionEquipping chatbots with personality has the potential of transforming user interactions from mere transactions to engaging conversations, enhancing user satisfaction and experience. In this work, we introduce dynamic personality infusion, a novel intermediate stage between the chatbot and the user that adjusts the chatbot's response using a dedicated chatbot personality model and GPT-4 without altering the chatbot's semantic capabilities. To test the effectiveness of our method, we first collected human-chatbot conversations from 33 participants while they interacted with three LLM-based chatbots (GPT-3.5, Llama-2 13B, and Mistral 7B). Then, we conducted an online rating survey with 725 participants on the collected conversations. We analyze the impact of the personality infusion on the perceived trustworthiness of the chatbots and the suitability of different personality profiles for real-world chatbot use cases. Our work paves the way for dynamic, personalized chatbots, enhancing user trust and real-world applicability.2024NKNikola Kovacevic et al.Conversational ChatbotsAgent Personality & AnthropomorphismHuman-LLM CollaborationCUI
PressurePick: Muscle Tension Estimation for Guitar Players Using Unobtrusive Pressure SensingWhen learning to play an instrument, it is crucial for the learner's muscles to be in a relaxed state when practicing. Identifying which parts of a song lead to increased muscle tension requires self-awareness during an already cognitively demanding task. In this work, we investigate unobtrusive pressure sensing for estimating muscle tension while practicing songs with the guitar. First, we collected data from twelve guitarists. Our apparatus consisted of three pressure sensors (one on each side of the guitar pick and one on the guitar neck) to determine the sensor that is most suitable for automatically estimating muscle tension. Second, we extracted features from the pressure time series that are indicative of muscle tension. Third, we present the hardware and software design of our PressurePick prototype, which is directly informed by the data collection and subsequent analysis.2023AFAndreas Rene Fender et al.Force Feedback & Pseudo-Haptic WeightBiosensors & Physiological MonitoringUIST
ViGather: Inclusive Virtual Conferencing with a Joint Experience Across Traditional Screen Devices and Mixed Reality HeadsetsTeleconferencing is poised to become one of the most frequent use cases of immersive platforms, since it supports high levels of presence and embodiment in collaborative settings. On desktop and mobile platforms, teleconferencing solutions are already among the most popular apps and accumulate significant usage time, not least due to the pandemic or as a desirable substitute for air travel or commuting. In this paper, we present ViGather, an immersive teleconferencing system that integrates users of all platform types into a joint experience via equal representation and a first-person experience. ViGather renders all participants as embodied avatars in one shared scene to establish co-presence and elicit natural behavior during collocated conversations, including nonverbal communication cues such as eye contact between participants as well as body language such as turning one's body to another person or using hand gestures to emphasize parts of a conversation during the virtual hangout. Since each user embodies an avatar and experiences situated meetings from an egocentric perspective no matter the device they join from, ViGather alleviates potential concerns about self-perception and appearance while mitigating potential "Zoom fatigue", as users' self-views are not shown. For participants in Mixed Reality, our system leverages the rich sensing and reconstruction capabilities of today's headsets. For users of tablets, laptops, or PCs, ViGather reconstructs the user's pose from the device's front-facing camera, estimates eye contact with other participants, and relates these non-verbal cues to immediate avatar animations in the shared scene. Our evaluation compared participants' behavior and impressions while videoconferencing in groups of four inside ViGather with those in Meta Horizon as a baseline for a social VR setting.
Participants who joined on traditional screen devices (e.g., laptops and desktops) using ViGather reported a significantly higher sense of physical, spatial, and self-presence than when using Horizon, while all perceived similar levels of active social presence when using Virtual Reality headsets. Our follow-up study confirmed the importance of representing users on traditional screen devices as reconstructed avatars for perceiving self-presence.2023HQHuajian Qiu et al.Social & Collaborative VRMixed Reality WorkspacesIdentity & Avatars in XRMobileHCI
Reality Rifts: Wonder-ful Interfaces by Disrupting Perceptual CausalityReality Rifts are interfaces between the physical and the virtual reality, where incoherent observations of physical behavior lead users to imagine comprehensive and plausible end-to-end dynamics. Reality Rifts emerge in interactive physical systems that lack one or more components that are central to their operation, yet where the physical end-to-end interaction persists with plausible outcomes. Even in the presence of a Reality Rift, users can still interact with a system, much like they would with the unaltered and complete counterpart, leading them to implicitly infer the existence and imagine the behavior of the lacking components from observable phenomena and outcomes. Therefore, dynamic systems with Reality Rifts trigger doubt, curiosity, and rumination: a sense of wonder that users experience when observing a Reality Rift due to their innate curiosity. In this paper, we explore how interactive systems can elicit and guide the user's imagination by integrating Reality Rifts. We outline the design process for opening a Reality Rift in interactive physical systems, describe the resulting design space, and explore it through six characteristic prototypes. To understand to what extent and with which qualities these prototypes indeed induce a sense of wonder during an interaction, we evaluated Reality Rifts in the form of a field deployment with 50 participants. We discuss participants' behavior and derive factors for the implementation of future wonder-ful experiences.2023LCLung-Pan Cheng et al.National Taiwan UniversityDesign FictionDigital Art Installations & Interactive PerformanceCHI
InfinitePaint: Painting in Virtual Reality with Passive Haptics Using Wet Brushes and a Physical Proxy CanvasDigital painting interfaces require an input fidelity that preserves the artistic expression of the user. Drawing tablets allow for precise and low-latency sensing of pen motions and other parameters like pressure to convert them to fully digitized strokes. A drawback is that those interfaces are rigid. While soft brushes can be simulated in software, the haptic sensation of the rigid pen input device is different compared to using a soft wet brush on paper. We present InfinitePaint, a system that supports digital painting in Virtual Reality on real paper with a real wet brush. We use special paper that turns black wherever it comes into contact with water and turns blank again upon drying. A single camera captures those temporary strokes and digitizes them while applying properties like color or other digital effects. We tested our system with artists and compared the subjective experience with a drawing tablet.2023AFAndreas Rene Fender et al.ETH ZürichHaptic WearablesDigital Art Installations & Interactive PerformanceCHI
HOOV: Hand Out-Of-View Tracking for Proprioceptive Interaction Using Inertial SensingCurrent Virtual Reality systems are designed for interaction under visual control. Using built-in cameras, headsets track the user's hands or hand-held controllers while they are inside the field of view. Current systems thus ignore the user's interaction with off-screen content, i.e., virtual objects that the user could quickly access through proprioception without requiring laborious head motions to bring them into focus. In this paper, we present HOOV, a wrist-worn sensing method that allows VR users to interact with objects outside their field of view. Based on the signals of a single wrist-worn inertial sensor, HOOV continuously estimates the user's hand position in 3D space to complement the headset's tracking as the hands leave the tracking range. Our novel data-driven method predicts hand positions and trajectories from just the continuous estimation of hand orientation, which by itself is stable based solely on inertial observations. Our inertial sensing simultaneously detects finger pinching to register off-screen selection events, confirms them using a haptic actuator inside our wrist device, and thus allows users to select, grab, and drop virtual content. We compared HOOV's performance with a camera-based optical motion capture system in two evaluations. In the first, participants interacted based on tracking information from the motion capture system to assess the accuracy of their proprioceptive input, whereas in the second, they interacted based on HOOV's real-time estimations. We found that HOOV's target-agnostic estimations had a mean tracking error of 7.7 cm, which allowed participants to reliably access virtual objects around their body without first bringing them into focus.
We demonstrate several applications that leverage the larger input space HOOV opens up for quick proprioceptive interaction, and conclude by discussing the potential of our technique.2023PSPaul Streli et al.ETH ZürichIn-Vehicle Haptic, Audio & Multimodal FeedbackFoot & Wrist InteractionCHI
DeltaPen: A Device with Integrated High-Precision Translation and Rotation Sensing on Passive SurfacesWe present DeltaPen, a pen device that operates on passive surfaces without the need for external tracking systems or active sensing surfaces. DeltaPen integrates two adjacent lens-less optical flow sensors at its tip, from which it reconstructs accurate directional motion as well as yaw rotation. DeltaPen also supports tilt interaction using a built-in inertial sensor. A pressure sensor and high-fidelity haptic actuator complement our pen device while retaining a compact form factor that supports mobile use on uninstrumented surfaces. We present a processing pipeline that reliably extracts fine-grained pen translations and rotations from the two optical flow sensors. To assess the accuracy of our translation and angle estimation pipeline, we conducted a technical evaluation in which we compared our approach with ground-truth measurements of participants' pen movements during typical pen interactions. We conclude with several example applications that leverage our device's capabilities. Taken together, we demonstrate novel input dimensions with DeltaPen that have so far only existed in systems that require active sensing surfaces or external tracking.2022GLGuy Lüthi et al.Shape-Changing Interfaces & Soft Robotic MaterialsPrototyping & User TestingUIST
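The DeltaPen abstract states that translation and yaw are reconstructed from two adjacent optical flow sensors but does not give the pipeline. The following is only a minimal rigid-body sketch of how two such readings could be combined, not the authors' method. The function name `pen_motion`, the sensor placement (symmetric about the tip along the pen's x-axis), the millimetre units, and the small-angle approximation are all assumptions made for illustration.

```python
def pen_motion(d_a, d_b, baseline_mm):
    """Estimate tip translation and yaw for one small time step.

    d_a, d_b: (dx, dy) displacements in mm from the two flow sensors,
    assumed (hypothetically) to sit at -baseline/2 and +baseline/2
    along the pen's local x-axis.
    Returns ((tx, ty), yaw_rad).
    """
    # The midpoint of the two sensors moves by the mean displacement.
    tx = (d_a[0] + d_b[0]) / 2.0
    ty = (d_a[1] + d_b[1]) / 2.0
    # A small yaw shifts the two sensors in opposite y-directions, so the
    # differential y-motion divided by the baseline approximates the angle.
    yaw = (d_b[1] - d_a[1]) / baseline_mm
    return (tx, ty), yaw


# Example: 1.0 mm / 0.5 mm translation plus 0.01 rad yaw, 4 mm baseline.
translation, yaw = pen_motion((1.0, 0.48), (1.0, 0.52), baseline_mm=4.0)
```

The approximation only holds for small per-step rotations; a real pipeline would also need calibration and noise filtering, which the abstract does not describe.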
Affective State Prediction from Smartphone Touch and Sensor Data in the WildKnowledge of users' affective states can improve their interaction with smartphones by providing more personalized experiences (e.g., search results and news articles). We present an affective state classification model based on data gathered on smartphones in real-world environments. From touch events during keystrokes and the signals from the inertial sensors, we extracted two-dimensional heat maps as input into a convolutional neural network to predict the affective states of smartphone users. For evaluation, we conducted a data collection in the wild with 82 participants over 10 weeks. Our model accurately predicts three levels (low, medium, high) of valence (AUC up to 0.83), arousal (AUC up to 0.85), and dominance (AUC up to 0.84). We also show that using the inertial sensor data alone, our model achieves a similar performance (AUC up to 0.83), making our approach less privacy-invasive. By personalizing our model to the user, we show that performance increases by an additional 0.07 AUC.2022RWRafael Wampfler et al.ETH ZurichHuman Pose & Activity RecognitionMultilingual & Cross-Cultural Voice InteractionBiosensors & Physiological MonitoringCHI
TapType: Ten-finger text entry on everyday surfaces via Bayesian inferenceDespite the advent of touchscreens, typing on physical keyboards remains most efficient for entering text, because users can leverage all fingers across a full-size keyboard for convenient typing. As users increasingly type on the go, text input on mobile and wearable devices has had to compromise on full-size typing. In this paper, we present TapType, a mobile text entry system for full-size typing on passive surfaces—without an actual keyboard. From the inertial sensors inside a band on either wrist, TapType decodes and relates surface taps to a traditional QWERTY keyboard layout. The key novelty of our method is to predict the most likely character sequences by fusing the finger probabilities from our Bayesian neural network classifier with the characters' prior probabilities from an n-gram language model. In our online evaluation, participants on average typed 19 words per minute with a character error rate of 0.6 % after 30 minutes of training. Expert typists thereby consistently achieved more than 25 WPM at a similar error rate. We demonstrate applications of TapType in mobile use around smartphones and tablets, as a complement to interaction in situated Mixed Reality outside visual control, and as an eyes-free mobile text input method using an audio feedback-only interface.2022PSPaul Streli et al.ETH ZürichHand Gesture RecognitionFoot & Wrist InteractionContext-Aware ComputingCHI
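The fusion step described in the TapType abstract, combining per-tap finger probabilities with language-model priors, can be illustrated with a toy decoder. This is a sketch under stated assumptions and not the authors' decoder: it uses a bigram model and Viterbi search, a four-letter alphabet, and made-up probabilities. `FINGER_OF`, `TAPS`, `BIGRAM`, and `decode` are hypothetical names.

```python
import math

# Hypothetical finger assignment for a toy alphabet; "t" and "r" share a finger.
FINGER_OF = {"a": "pinky", "c": "middle", "t": "index", "r": "index"}

# Made-up classifier output: P(finger | tap) for three consecutive taps.
TAPS = [
    {"middle": 0.80, "index": 0.15, "pinky": 0.05},
    {"pinky": 0.90, "middle": 0.05, "index": 0.05},
    {"index": 0.85, "middle": 0.10, "pinky": 0.05},
]

# Made-up bigram language model P(char | previous char); "^" marks the start.
BIGRAM = {
    ("^", "c"): 0.6, ("^", "t"): 0.2, ("^", "a"): 0.1, ("^", "r"): 0.1,
    ("c", "a"): 0.7, ("a", "t"): 0.6, ("a", "r"): 0.3,
}


def decode(taps, bigram, start="^", floor=1e-6):
    """Viterbi search for the character sequence that maximizes the sum of
    log P(finger(c_t) | tap_t) + log P(c_t | c_{t-1})."""
    chars = sorted(FINGER_OF)
    # best[c] = (log score of the best sequence ending in c, that sequence)
    best = {start: (0.0, "")}
    for probs in taps:
        nxt = {}
        for c in chars:
            emit = math.log(probs.get(FINGER_OF[c], floor))
            score, seq = max(
                (s + math.log(bigram.get((p, c), floor)), q)
                for p, (s, q) in best.items()
            )
            nxt[c] = (score + emit, seq + c)
        best = nxt
    return max(best.values())[1]


print(decode(TAPS, BIGRAM))  # prints "cat"
```

In this toy setup the finger evidence alone cannot separate "cat" from "car", because both final letters map to the index finger; the language-model prior breaks the tie.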
Causality-preserving Asynchronous RealityMixed Reality is gaining interest as a platform for collaboration and focused work to a point where it may supersede current office settings in future workplaces. At the same time, we expect that interaction with physical objects and face-to-face communication will remain crucial for future work environments, which is a particular challenge in fully immersive Virtual Reality. In this work, we reconcile those requirements through a user's individual Asynchronous Reality, which enables seamless physical interaction across time. When a user is unavailable, e.g., focused on a task or in a call, our approach captures co-located or remote physical events in real-time, constructs a causality graph of co-dependent events, and lets immersed users revisit them at a suitable time in a causally accurate way. Enabled by our system AsyncReality, we present a workplace scenario that includes walk-in interruptions during a person's focused work, physical deliveries, and transient spoken messages. We then generalize our approach to a use-case agnostic concept and system architecture. We conclude by discussing the implications of an Asynchronous Reality for future offices.2022AFAndreas Rene Fender et al.ETH ZürichMixed Reality WorkspacesImmersion & Presence ResearchContext-Aware ComputingCHI
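One way to picture the causality graph mentioned in the AsyncReality abstract is as a dependency graph whose topological order gives a causally consistent replay sequence. The sketch below assumes that reading; the event names and the `replay_order` helper are hypothetical and are not taken from the paper's system architecture.

```python
from graphlib import TopologicalSorter

# Hypothetical captured events, each mapped to the events it depends on.
DEPENDS_ON = {
    "colleague_enters": set(),
    "parcel_delivered": set(),
    "colleague_moves_parcel": {"colleague_enters", "parcel_delivered"},
    "colleague_leaves_spoken_message": {"colleague_moves_parcel"},
}


def replay_order(depends_on):
    """Return the events in an order where every cause precedes its effects.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())


order = replay_order(DEPENDS_ON)
```

Independent events, such as the two root events here, may be replayed in either order; only the dependency edges constrain the sequence.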
AlgoSolve: Supporting Subgoal Learning in Algorithmic Problem-Solving with Learnersourced MicrotasksDesigning solution plans before writing code is critical for successful algorithmic problem-solving. Novices, however, often plan on-the-fly during implementation, resulting in unsuccessful problem-solving due to lack of mental organization of the solution. Research shows that subgoal learning helps learners develop more complete solution plans by enhancing their understanding of the high-level solution structure. However, expert-created materials such as subgoal labels are necessary to provide learning benefits from subgoal learning, which are a scarce resource in self-learning due to limited availability and high cost. We propose a learnersourcing workflow that collects high-quality subgoal labels from learners by helping them improve their label quality. We implemented the workflow into AlgoSolve, a prototype interface that supports subgoal learning for algorithmic problems. A between-subjects study with 63 problem-solving novices revealed that AlgoSolve helped learners create higher-quality labels and more complete solution plans, compared to a baseline method known to be effective in subgoal learning.2022KCKabdo Choi et al.KAISTProgramming Education & Computational ThinkingIntelligent Tutoring Systems & Learning AnalyticsCHI
CapContact: Super-resolution Contact Areas from Capacitive TouchscreensTouch input is dominantly detected using mutual-capacitance sensing, which measures the proximity of close-by objects that change the electric field between the sensor lines. The exponential drop-off in intensities with growing distance enables software to detect touch events, but does not reveal true contact areas. In this paper, we introduce CapContact, a novel method to precisely infer the contact area between the user's finger and the surface from a single capacitive image. At 8x super-resolution, our convolutional neural network generates refined touch masks from 16-bit capacitive images as input, which can even discriminate adjacent touches that are not distinguishable with existing methods. We trained and evaluated our method using supervised learning on data from 10 participants who performed touch gestures. Our capture apparatus integrates optical touch sensing to obtain ground-truth contact through high-resolution frustrated total internal reflection. We compare our method with a baseline using bicubic upsampling as well as the ground truth from FTIR images. We separately evaluate our method's performance in discriminating adjacent touches. CapContact successfully separated closely adjacent touch contacts in 494 of 570 cases (87%) compared to the baseline's 43 of 570 cases (8%). Importantly, we demonstrate that our method accurately performs even at half of the sensing resolution at twice the grid-line pitch across the same surface area, challenging the current industry-wide standard of a ~4mm sensing pitch. We conclude this paper with implications for capacitive touch sensing in general and for touch-input accuracy in particular.2021PSPaul Streli et al.ETH ZürichHand Gesture RecognitionCHI
SurfaceFleet: Exploring Distributed Interactions Unbounded from Device, Application, User, and TimeKnowledge work increasingly spans multiple computing surfaces. Yet in status quo user experiences, content as well as tools, behaviors, and workflows are largely bound to the current device, the current application, the current user, and the current moment in time. SurfaceFleet is a system and toolkit that uses resilient distributed programming techniques to explore cross-device interactions that are unbounded in these four dimensions of device, application, user, and time. As a reference implementation, we describe an interface built using SurfaceFleet that employs lightweight, semi-transparent UI elements known as Applets. Applets appear always-on-top of the operating system, application windows, and (conceptually) above the device itself. But all connections and synchronized data are virtualized and made resilient through the cloud. For example, a sharing Applet known as a Portfolio allows a user to drag and drop unbound Interaction Promises into a document. Such promises can then be fulfilled with content asynchronously, at a later time (or multiple times), from another device, and by the same or a different user.2020FBFrederik Brudy et al.Distributed Team CollaborationKnowledge Worker Tools & WorkflowsUIST
Tilt-Responsive Techniques for Digital Drawing BoardsDrawing boards offer a self-stable work surface that is continuously adjustable. On digital displays, such as the Microsoft Surface Studio, these properties open up a class of techniques that sense and respond to tilt adjustments. Each display posture—whether angled high, low, or somewhere in-between—affords some activities, but not others. Because what is appropriate also depends on the application and task, we explore a range of app-specific transitions between reading vs. writing (annotation), public vs. personal, shared person-space vs. task-space, and other nuances of input and feedback, contingent on display angle. Continuous responses provide interactive transitions tailored to each use-case. We show how a variety of knowledge work scenarios can use sensed display adjustments to drive context-appropriate transitions, as well as technical software details of how to best realize these concepts. A preliminary remote user study suggests that techniques must balance effort required to adjust tilt, versus the potential benefits of a sensed transition.2020HRHugo Romat et al.Knowledge Worker Tools & WorkflowsNotification & Interruption ManagementUIST
Omni: Volumetric Sensing and Actuation of Passive Magnetic Tools for Dynamic Haptic FeedbackWe present Omni, a self-contained 3D haptic feedback system that is capable of sensing and actuating an untethered, passive tool containing only a small embedded permanent magnet. Omni enriches AR, VR and desktop applications by providing an active haptic experience using a simple apparatus centered around an electromagnetic base. The spatial haptic capabilities of Omni are enabled by a novel gradient-based method to reconstruct the 3D position of the permanent magnet in midair using the measurements from eight off-the-shelf hall sensors that are integrated into the base. Omni’s 3 DoF spherical electromagnet simultaneously exerts dynamic and precise radial and tangential forces in a volumetric space around the device. Since our system is fully integrated, contains no moving parts and requires no external tracking, it is easy and affordable to fabricate. We describe Omni’s hardware implementation, our 3D reconstruction algorithm, and evaluate the tracking and actuation performance in depth. Finally, we demonstrate its capabilities via a set of interactive usage scenarios.2020TLThomas Langerak et al.Force Feedback & Pseudo-Haptic WeightShape-Changing Interfaces & Soft Robotic MaterialsFull-Body Interaction & Embodied InputUIST
A Rapid Tapping Task on Commodity Smartphones to Assess Motor FatigabilityFatigue is a common debilitating symptom of many autoimmune diseases, including multiple sclerosis. It negatively impacts patients' everyday life and productivity. Despite its prevalence, fatigue is still poorly understood. Its subjective nature makes quantification challenging and it is mainly assessed by questionnaires, which capture the magnitude of fatigue insufficiently. Motor fatigability, the objective decline of performance during a motor task, is an underrated aspect in this regard. Currently, motor fatigability is assessed using a handgrip dynamometer. This approach has been proven valid and accurate but requires special equipment and trained personnel. We propose a technique to objectively quantify motor fatigability using a commodity smartphone. The method comprises a simple exertion task requiring rapid alternating tapping. Our study with 20 multiple sclerosis patients and 35 healthy participants showed a correlation of rho = 0.8 with the baseline handgrip method. This smartphone-based approach is a first step towards ubiquitous, more frequent, and remote monitoring of fatigability and disease progression.2020LBLiliana Barrios et al.ETH ZurichMotor Impairment Assistive Input TechnologiesCHI
GazeConduits: Calibration-Free Cross-Device Collaboration through Gaze and TouchWe present GazeConduits, a calibration-free ad-hoc mobile interaction concept that enables users to collaboratively interact with tablets, other users, and content in a cross-device setting using gaze and touch input. GazeConduits leverages recently introduced smartphone capabilities to detect facial features and estimate users' gaze directions. To join a collaborative setting, users place one or more tablets onto a shared table and position their phone in the center, which then tracks users present as well as their gaze direction to determine the tablets they look at. We present a series of techniques using GazeConduits for collaborative interaction across mobile devices for content selection and manipulation. Our evaluation with 20 simultaneous tablets on a table shows that GazeConduits can reliably identify which tablet or collaborator a user is looking at.2020SVSimon Voelker et al.RWTH Aachen UniversityEye Tracking & Gaze InteractionKnowledge Worker Tools & WorkflowsCHI