PrivWeb: Unobtrusive and Content-aware Privacy Protection For Web Agents
While web agents have gained popularity by automating web interactions, their requirement for interface access introduces privacy risks that are understudied, particularly from users' perspective. Through a formative study (N=15), we found that users frequently misunderstand agent data practices and desire unobtrusive, transparent data management. To achieve this, we developed PrivWeb, a trusted add-on for web agents that uses a localized LLM to anonymize private information on interfaces based on user preferences. It employs tiered delegation to balance automation and intrusiveness, using ambient notifications for low-sensitivity data and enforcing a mandatory pause for high-sensitivity data. The user study (N=14) across travel, information retrieval, shopping, and entertainment tasks showed that PrivWeb enhances perceived privacy protection and trust compared to transparency-only baselines, without increasing cognitive load. Crucially, we identified user delegation strategies: users prefer to manually execute steps involving high-sensitivity data, while granting the agent access to low-sensitivity data.
2026 | Shuning Zhang et al. | Tsinghua University | Privacy by Design & User Control; Privacy Perception & Decision-Making; Human-LLM Collaboration | CHI
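A minimal sketch of the tiered delegation idea described in this abstract, not PrivWeb's actual implementation. The sensitivity labels, the `Action` names, and the `delegate` function are hypothetical; the abstract only states that low-sensitivity data triggers an ambient notification and high-sensitivity data triggers a mandatory pause.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    PROCEED = "proceed"                  # nothing private detected
    NOTIFY_AMBIENT = "notify_ambient"    # anonymize, notify, keep automating
    PAUSE_FOR_USER = "pause_for_user"    # stop the agent until the user decides


@dataclass(frozen=True)
class DetectedField:
    label: str
    sensitivity: str  # hypothetical labels: "none", "low", "high"


def delegate(field: DetectedField) -> Action:
    """Map a detected field to an intervention tier (assumed policy)."""
    if field.sensitivity == "high":
        return Action.PAUSE_FOR_USER
    if field.sensitivity == "low":
        return Action.NOTIFY_AMBIENT
    return Action.PROCEED


if __name__ == "__main__":
    for f in (DetectedField("search query", "none"),
              DetectedField("city", "low"),
              DetectedField("passport number", "high")):
        print(f.label, "->", delegate(f).value)
```

In a real system the sensitivity label would presumably come from the local model and the user's preferences; here it is supplied by hand.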
VisGuardian: A Lightweight Group-based Visual Privacy Control Technique For Smart Glasses in Home Environments
Always-on sensing by AI applications on AR glasses makes traditional permission techniques inefficient for context-dependent private visual data within home environments. The home presents a challenging privacy context due to the large number of sensitive objects and the intimate nature of daily routines. We propose VisGuardian, a fine-grained content-based visual permission technique for AR glasses. VisGuardian features a group-based control mechanism that enables users to efficiently manage permissions for multiple private objects. VisGuardian detects objects using YOLO and adopts a pre-classified schema to group them. By selecting a single object, users can obscure groups of related objects based on criteria including privacy sensitivity, object category, or spatial proximity. A technical evaluation shows VisGuardian achieves mAP50 of 0.6704 with only 14.0 ms latency and a 1.7% increase in battery consumption per hour. Furthermore, a user study (N=24) comparing VisGuardian to slider-based and object-based baselines found it to be significantly faster for setting permissions, and users preferred it for its efficiency, effectiveness, and ease of use.
2026 | Shuning Zhang et al. | Tsinghua University | Smart Home Privacy & Security; Privacy by Design & User Control; AR Navigation & Context Awareness | CHI
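The group-based selection described in this abstract could be sketched as follows. This is an illustrative assumption, not VisGuardian's code: the `DetectedObject` fields, the criterion names, and the `radius` threshold are all hypothetical, and object detection itself (YOLO in the paper) is replaced by hand-written detections.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class DetectedObject:
    name: str
    category: str         # e.g. "document", "screen" (hypothetical schema)
    sensitivity: str      # e.g. "low", "high" (hypothetical schema)
    center: tuple         # (x, y) position in the frame, arbitrary units


def group_for(selected, objects, criterion, radius=1.0):
    """Return every object that should be obscured together with `selected`."""
    if criterion == "category":
        return [o for o in objects if o.category == selected.category]
    if criterion == "sensitivity":
        return [o for o in objects if o.sensitivity == selected.sensitivity]
    if criterion == "proximity":
        return [o for o in objects if dist(o.center, selected.center) <= radius]
    raise ValueError(f"unknown criterion: {criterion}")


if __name__ == "__main__":
    letter = DetectedObject("letter", "document", "high", (0.0, 0.0))
    bill = DetectedObject("bill", "document", "low", (0.5, 0.0))
    laptop = DetectedObject("laptop", "screen", "high", (5.0, 5.0))
    scene = [letter, bill, laptop]
    print([o.name for o in group_for(letter, scene, "category")])
    print([o.name for o in group_for(letter, scene, "sensitivity")])
    print([o.name for o in group_for(letter, scene, "proximity")])
```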
"Privacy across the boundary": Examining Perceived Privacy Risk Across Data Transmission and Sharing Ranges of Smart Home Personal AssistantsAs Smart Home Personal Assistants (SPAs) evolve into social agents, understanding user privacy necessitates interpersonal communication frameworks, such as Privacy Boundary Theory (PBT). To ground our investigation, our three-phase preliminary study (1) identified transmission and sharing ranges as key boundary-related risk factors, (2) categorized relevant SPA functions and data types, and (3) analyzed commercial practices, revealing widespread data sharing and non-transparent safeguards. A subsequent mixed-methods study (N=412 survey, N=40 interviews among the survey participants) assessed users' perceived privacy risks across data types, transmission ranges and sharing ranges. Results demonstrate a significant, non-linear escalation in perceived risk when data crosses two critical boundaries: the `public network' (transmission) and `third parties' (sharing). This boundary effect holds across data types and demographics. Furthermore, risk perception is modulated by data attributes, and contextual privacy calculus. Conversely, anonymization show limited efficacy especially for third-party sharing, a finding attributed to user distrust. These findings empirically ground PBT in SPA context and inform design of boundary-aware privacy protection.2026SZShuning Zhang et al.Tsinghua UniversityPrivacy by Design & User ControlSmart Home Privacy & SecurityCHI
Characterizing Unintended Consequences of GUI Agents For Web Browsing
The integration of LLMs into GUI agents promises to revolutionize web browsing automation, yet the practical user experience remains challenging. This paper systematically characterizes user-reported issues with GUI agents by focusing on three dimensions: phenomena, influences, and user-centric mitigation. We adopted a two-phase method combining social media analysis (N=221 posts) and semi-structured interviews (N=21). Our findings reveal a taxonomy of complaints unique to GUI agents, including deficits in grounding abstract intent into concrete interface affordances, the inability to adapt to dynamic visual states, and the execution of erroneous actions. These lead to influences distinct from text-based hallucinations, ranging from task abandonment to security risks like uncontrolled file system access. In response, users are forced to employ ad-hoc mitigation strategies, including ecological sandboxing and cursor shadowing, to correct GUI agents' behaviors. We contribute: (1) a comprehensive characterization of complaints specific to GUI agent interaction, (2) an analysis of how these phenomena degrade interaction integrity, and (3) design implications for creating consequence-aware agents.
2026 | Shuning Zhang et al. | Tsinghua University | Human-LLM Collaboration; Explainable AI (XAI); Privacy by Design & User Control | CHI
Collab: Fostering Critical Identification of Deepfake Videos on Social Media via Synergistic Annotation
Identifying deepfake videos on social media platforms is challenged by dynamic spatio-temporal artifacts and inadequate user tools. This hinders both critical viewing by users and scalable moderation on platforms. Here, we present Collab, a web plugin enabling users to collaboratively annotate deepfake videos. Collab integrates three key components: (i) an intuitive interface for spatio-temporal labeling where users provide confidence scores and rationales, facilitating detailed input even from non-experts, (ii) a novel confidence-weighted spatio-temporal Intersection-over-Union (IoU) algorithm to aggregate diverse user annotations into an accurate consensus, and (iii) a hierarchical demonstration strategy presenting aggregated results to guide attention toward contentious regions and foster critical evaluation. A seven-day online study (N=90), in which participants annotated suspicious videos while viewing an online experimental platform, compared Collab against two conditions without aggregation or demonstration, respectively. Collab significantly improved identification accuracy and enhanced reflection compared to the non-demonstration condition, while outperforming the non-aggregation condition in novelty and effectiveness.
2026 | Shuning Zhang et al. | Tsinghua University | Deepfake & Synthetic Media Detection; Content Moderation & Platform Governance; Misinformation & Fact-Checking | CHI
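The abstract does not spell out the aggregation algorithm, so the following is only an illustrative sketch of what a confidence-weighted spatio-temporal IoU could look like. It assumes each annotation is an axis-aligned box over a time interval (a cuboid in x, y, t) with a confidence in [0, 1]; the `Annotation` type and the `agreement_scores` scoring rule are hypothetical, not Collab's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    x1: float
    y1: float
    x2: float
    y2: float
    t1: float          # start time (seconds)
    t2: float          # end time (seconds)
    confidence: float  # annotator's self-reported confidence


def _overlap(a_lo, a_hi, b_lo, b_hi):
    """Length of the overlap between two 1-D intervals (0 if disjoint)."""
    return max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))


def volume(a):
    return (a.x2 - a.x1) * (a.y2 - a.y1) * (a.t2 - a.t1)


def st_iou(a, b):
    """IoU of two annotations treated as cuboids in (x, y, t)."""
    inter = (_overlap(a.x1, a.x2, b.x1, b.x2)
             * _overlap(a.y1, a.y2, b.y1, b.y2)
             * _overlap(a.t1, a.t2, b.t1, b.t2))
    union = volume(a) + volume(b) - inter
    return inter / union if union > 0 else 0.0


def agreement_scores(annotations):
    """Score each annotation by its confidence-weighted mean IoU with the others."""
    scores = []
    for i, a in enumerate(annotations):
        others = [b for j, b in enumerate(annotations) if j != i]
        total = sum(b.confidence for b in others)
        if total == 0:
            scores.append(0.0)
            continue
        scores.append(sum(b.confidence * st_iou(a, b) for b in others) / total)
    return scores
```

Under this assumed rule, an annotation that overlaps strongly with confident peers scores high, and an isolated outlier scores near zero, which is one way to surface contentious regions.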
Roomify: Spatially-Grounded Style Transformation for Immersive Virtual Environments
We present Roomify, a spatially-grounded transformation system that generates themed virtual environments anchored to users' physical rooms while maintaining spatial structure and functional semantics. Current VR approaches face a fundamental trade-off: full immersion sacrifices spatial awareness, while passthrough solutions break presence. Roomify addresses this through spatially-grounded transformation, treating physical spaces as "spatial containers" that preserve key functional and geometric properties of furniture while enabling radical stylistic changes. Our pipeline combines in-situ 3D scene understanding, AI-driven spatial reasoning, and style-aware generation to create personalized virtual environments grounded in physical reality. We introduce a cross-reality authoring tool enabling fine-grained user control through MR editing and VR preview workflows. Two user studies validate our approach: one with 18 VR users demonstrates a 63% improvement in presence over passthrough and 26% over fully virtual baselines while maintaining spatial awareness; another with 8 design professionals confirms the system's creative expressiveness (scene quality: 5.95/7; creativity support: 6.08/7) and professional workflow value across diverse environments.
2026 | Xueyang Wang et al. | Tsinghua University | Social & Collaborative VR; Mixed Reality Workspaces; Immersion & Presence Research | CHI
A Scoping Review and Guidelines on Privacy Policy's Visualization from an HCI Perspective
Privacy policies are a cornerstone of informed consent, yet a persistent gap exists between their legal intent and practical efficacy. Despite decades of research proposing various visualizations, user comprehension remains low, and designs rarely see widespread adoption. To understand this landscape and chart a path forward, we synthesized 65 top-tier papers using a framework adapted from user-centered design lifecycles. Our analysis yielded four findings on the field's evolution: (1) a trade-off between information load and decision efficacy, which shows a shift from augmenting disclosures to cognitive load management, (2) a co-evolutionary dynamic of design and automation, revealing that designs such as context-awareness drove automation needs, while LLM breakthroughs enable the semantic interpretation required to realize those designs, (3) a tension between generality and specificity, highlighting the divergence between standardized solutions and the increasing necessity for specialized interaction in IoT and immersive environments, and (4) the balancing of stakeholder opinions, where visualization efficacy is constrained by the interplay of regulatory mandates, developer capabilities, and provider incentives.
2026 | Shuning Zhang et al. | Tsinghua University | Privacy Perception & Decision-Making; Privacy by Design & User Control; Explainable AI (XAI) | CHI
Mind the Gap: Mapping Wearer–Bystander Privacy Tensions and Context-Adaptive Pathways for Camera Glasses
Camera glasses create fundamental privacy tensions between wearers seeking recording functionality and bystanders concerned about unauthorized surveillance. We present a systematic multi-stakeholder evaluation of privacy mechanisms through surveys (N=525) and paired interviews (N=20) in China. Study 1 quantifies expectation-willingness gaps: bystanders consistently demand stronger information transparency and protective measures than wearers will provide, with disparities intensifying in sensitive contexts where 65–90% of bystanders would take defensive action. Study 2 evaluates twelve privacy-enhancing technologies, revealing four fundamental trade-offs that undermine current approaches: visibility versus disruption, empowerment versus burden, protection versus agency, and accountability versus exposure. These gaps reflect structural incompatibilities rather than inadequate goodwill, with context emerging as the primary determinant of privacy acceptability. We propose context-adaptive pathways that dynamically adjust protection strategies: minimal-friction visibility in public spaces, structured negotiation in semi-public environments, and automatic protection in sensitive contexts. Our findings contribute a diagnostic framework for evaluating privacy mechanisms and implications for context-aware design in ubiquitous sensing.
2026 | Xueyang Wang et al. | Tsinghua University | Privacy by Design & User Control; Privacy Perception & Decision-Making; Context-Aware Computing | CHI
PrivCAPTCHA: Interactive CAPTCHA to Facilitate Effective Comprehension of APP Privacy Policy
Traditional app privacy policies are often lengthy and non-interactive, leading users to skip them and remain uninformed. To address this, we proposed PrivCAP, a technique to enhance user comprehension by presenting policies in a concise, interactive format. PrivCAP adopted a CAPTCHA-based design, requiring users to interact with clickable chunks of concise policy content, thus reducing physical and cognitive load. A formative study (N=38) demonstrated that participants valued informed consent alongside concerns over data collection and sharing, marking the first such evaluation among Chinese users. This study further found a preference for concise visualizations and interactable formats. PrivCAP, leveraging few-shot prompting on Large Language Models (LLMs), accurately translates privacy policies into clickable, chunked formats optimized for smartphone screens. In an evaluation (N=28), PrivCAP outperformed traditional policy presentations in improving user understanding, reducing cognitive load, and maintaining efficiency, with participants favoring its engaging design and reporting more informed decision-making.
2025 | Shuning Zhang et al. | Tsinghua University, Institute for Network Sciences and Cyberspace | VR Medical Training & Rehabilitation; Privacy by Design & User Control; Privacy Perception & Decision-Making | CHI
Actual Achieved Gain and Optimal Perceived Gain: Modeling Human Take-over Decisions Towards Automated Vehicles' Suggestions
Driver decision quality in take-overs is critical for effective human-Autonomous Driving System (ADS) collaboration. However, current research lacks detailed analysis of its variations. This paper introduces two metrics, Actual Achieved Gain (AAG) and Optimal Perceived Gain (OPG), to assess decision quality, with OPG representing optimal decisions and AAG reflecting actual outcomes. Both are calculated as weighted averages of perceived gains and losses, influenced by ADS accuracy. Study 1 (N=315) used a 21-point Thurstone scale to measure perceived gains and losses (key components of AAG and OPG) across typical tasks: route selection, overtaking, and collision avoidance. Studies 2 (N=54) and 3 (N=54) modeled decision quality under varying ADS accuracy and decision time. Results show that with sufficient time (>3.5 s), AAG converges towards OPG, indicating rational decision-making, while limited time leads to intuitive and deterministic choices. Study 3 also linked AAG-OPG deviations to irrational behaviors. An intervention study (N=8) and a pilot (N=4) employing voice alarms and multi-modal alarms based on these deviations demonstrated AAG's potential to improve decision quality.
2025 | Shuning Zhang et al. | Tsinghua University, Institute for Network Sciences and Cyberspace | Automated Driving Interface & Takeover Design; Head-Up Display (HUD) & Advanced Driver Assistance Systems (ADAS); AI-Assisted Decision-Making & Automation | CHI
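The abstract says only that AAG and OPG are weighted averages of perceived gains and losses, weighted by ADS accuracy; the exact formulas are not given here. The sketch below is one plausible reading under stated assumptions, not the paper's definition: a decision either follows the suggestion or takes over, a correct outcome yields `gain`, an incorrect one costs `loss`, and all function names are hypothetical.

```python
def expected_value(follow: bool, accuracy: float, gain: float, loss: float) -> float:
    """Accuracy-weighted average of gain and loss for one decision (assumed form)."""
    # Following pays off when the ADS is right; taking over pays off when it is wrong.
    p_good = accuracy if follow else 1.0 - accuracy
    return p_good * gain - (1.0 - p_good) * loss


def optimal_perceived_gain(accuracy: float, gain: float, loss: float) -> float:
    """Value of the best available decision."""
    return max(expected_value(True, accuracy, gain, loss),
               expected_value(False, accuracy, gain, loss))


def actual_achieved_gain(decisions, accuracy: float, gain: float, loss: float) -> float:
    """Mean value of the decisions a driver actually made (True = followed)."""
    if not decisions:
        raise ValueError("at least one decision is required")
    return sum(expected_value(d, accuracy, gain, loss) for d in decisions) / len(decisions)


if __name__ == "__main__":
    opg = optimal_perceived_gain(accuracy=0.8, gain=10.0, loss=5.0)
    aag = actual_achieved_gain([True, False], accuracy=0.8, gain=10.0, loss=5.0)
    print(f"OPG={opg:.2f}  AAG={aag:.2f}  deviation={opg - aag:.2f}")
```

Under this reading, AAG can never exceed OPG, and the gap between them is the kind of deviation the abstract associates with irrational behavior.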
Raise Your Eyebrows Higher: Facilitating Emotional Communication in Social Virtual Reality Through Region-Specific Facial Expression Exaggeration
While exaggerated facial expressions in cartoon avatars can enhance emotional communication in social virtual reality (VR), they risk triggering the uncanny valley effect. Our research reveals that this effect varies significantly across different emotions. In Study 1 (N=30), participants evaluated scaled facial expressions during simulated VR conversations. We found that expression exaggeration had opposing effects: it decreased facial realism for joy, surprise, and disgust due to overly dramatic mouth movements, while enhancing realism for fear, sadness, and anger, emotions that rely on upper facial expressions typically constrained by HMD pressure. Based on these findings, we developed a region-specific facial expression exaggeration strategy that enhances under-expressed upper facial features while maintaining natural lower facial movements. Study 2 (N=20) validated this approach, demonstrating enhanced emotional intensity and contagion for negative emotions while mitigating the uncanny valley effect. Our research provides practical guidelines for optimizing avatar-mediated emotional communication in social VR environments.
2025 | Xueyang Wang et al. | Tsinghua University, Institute for Network Sciences and Cyberspace | Social & Collaborative VR; Immersion & Presence Research; Identity & Avatars in XR | CHI
From 2D to 3D: Facilitating Single-Finger Mid-Air Typing on QWERTY Keyboards with Probabilistic Touch Modeling
Mid-air text entry on virtual keyboards suffers from the lack of tactile feedback, which brings challenges to both tap detection and input prediction. In this paper, we explored the feasibility of single-finger typing on virtual QWERTY keyboards in mid-air. We first conducted a study to examine users' 3D typing behavior on different sizes of virtual keyboards. Results showed that the participants perceived the vertical projection of the lowest point on the keyboard during a tap as the target location, and that inferring taps based on the intersection between the finger and the keyboard was not applicable. To address this challenge, we derived a novel input prediction algorithm that incorporated the uncertainty in tap detection as a probability and performed probabilistic decoding that could tolerate false detection. We analyzed the performance of the algorithm through a full-factorial simulation. Results showed that the SVM-based probabilistic touch detection together with a 2D elastic probabilistic decoding algorithm (elasticity = 2) could achieve the optimal top-5 accuracy of 94.2%. In the evaluation user study, the participants reached a single-finger typing speed of 26.1 WPM with a 3.2% uncorrected word-level error rate, which was significantly better than both tap-based and gesture-based baseline techniques. Also, the proposed technique received the highest preference score from the users, proving its usability in real text entry tasks. https://dl.acm.org/doi/10.1145/3580829
2023 | Xin Yi et al. | Mid-Air Haptics (Ultrasonic); Hand Gesture Recognition; Voice User Interface (VUI) Design | UbiComp
Squeez'In: Private Authentication on Smartphones based on Squeezing Gestures
In this paper, we proposed Squeez'In, a technique on smartphones that enabled private authentication by holding and squeezing the phone with a unique pattern. We first explored the design space of practical squeezing gestures for authentication by analyzing the participants' self-designed gestures and squeezing behavior. Results showed that varying-length gestures with two levels of touch pressure and duration were the most natural and unambiguous. We then implemented Squeez'In on an off-the-shelf capacitive sensing smartphone, and employed an SVM-GBDT model for recognizing gestures and user-specific behavioral patterns, achieving 99.3% accuracy and 0.93 F1-score when tested on 21 users. A subsequent 14-day study validated the memorability and long-term stability of Squeez'In. During usability evaluation, compared with gesture and PIN code, Squeez'In achieved significantly faster authentication speed and higher user preference in terms of privacy and security.
2023 | Xin Yi et al. | Tsinghua University | Force Feedback & Pseudo-Haptic Weight; Passwords & Authentication | CHI