AI and My Values: User Perceptions of LLMs’ Ability to Extract, Embody, and Explain Human Values from Casual Conversations
Honorable Mention
Authors
Bhada Yun
ETH Zürich
Renn Su
Stanford University
ETH Zurich
Publication Info
- Topic area: Human-AI interaction and value alignment in conversational AI systems.
- Keywords: Value alignment, conversational AI, LLMs, human values, user perception, explainability, embodiment, privacy, ethics, self-reflection.
Background and Problem
- Problem / challenge: Existing AI systems lack robust mechanisms to align with individual human values, and their ability to infer, embody, and explain these values remains underexplored. This gap raises risks of misrepresentation, privacy violations, and ethical concerns.
- Significance: Understanding and aligning AI systems with human values is critical as these systems increasingly mediate personal, professional, and societal interactions. Misaligned systems could harm trust, privacy, and user welfare.
- Motivation and related work: Previous research has focused on shared moral principles or static datasets for value alignment but has not addressed individual-level, dynamic value modeling. This paper builds on frameworks like Schwartz’s Theory of Values and explores how conversational AI systems can infer and represent personal values through sustained interaction.
Solution
- Proposed approach: Introduction of the Value-Alignment Perception Toolkit (VAPT), a methodology to evaluate AI systems’ ability to extract, embody, and explain human values based on conversational data.
- Novelty:
  - Development of VAPT as a reusable, probe-based methodology for studying perceived value alignment.
  - Empirical study of user perceptions of AI’s value alignment capabilities using a month-long chatbot interaction.
  - Insights into the risks of “weaponized empathy” and design implications for value-aligned conversational agents (VACAs).
  - Introduction of a three-stage evaluation framework: extraction (topic-context graphs), embodiment (persona responses), and explanation (value chart comparisons).
- Procedure and key techniques:
  - Data collection: Participants engaged in casual conversations with a chatbot (“Day”) over a month, generating rich interaction data.
  - Baseline establishment: Schwartz’s 57-item PVQ-RR survey was used to establish participants’ self-reported values.
  - Evaluation stages:
    - Stage 1: Topic-context graph exploration to visualize extracted values.
    - Stage 2: Persona embodiment experiment to assess AI’s ability to simulate user responses.
    - Stage 3: Value chart comparison to evaluate alignment between AI-inferred and self-reported values.
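As a rough illustration of the baseline step, a questionnaire such as the PVQ-RR is typically turned into one score per value by averaging the items that belong to that value. The sketch below assumes that simple averaging scheme; this summary does not describe the authors' actual scoring procedure. The `ITEM_KEY` mapping, the value names, and the `score_values` function are hypothetical and do not reproduce the real PVQ-RR item key.

```python
from statistics import mean

# Hypothetical item-to-value key for illustration only.
# This is NOT the real PVQ-RR key, and it covers only three values.
ITEM_KEY = {
    "self_direction": [1, 2, 3],
    "tradition": [4, 5, 6],
    "power": [7, 8, 9],
}


def score_values(responses, item_key=ITEM_KEY):
    """Average the item ratings that belong to each value.

    responses maps an item number to the participant's rating.
    Returns a dict mapping each value name to its mean rating.
    """
    scores = {}
    for value, items in item_key.items():
        missing = [i for i in items if i not in responses]
        if missing:
            raise KeyError(f"missing responses for items {missing}")
        scores[value] = mean(responses[i] for i in items)
    return scores


# Toy ratings from one made-up participant.
responses = {1: 6, 2: 5, 3: 4, 4: 2, 5: 2, 6: 2, 7: 1, 8: 3, 9: 2}
baseline = score_values(responses)
```

A per-value profile of this kind is what a Stage 3 value chart would place next to the AI-inferred scores.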
Results
- Concrete findings:
  - AI-inferred values moderately aligned with self-reported values (63.6% within ±1 Likert point).
  - Participants rated chat-based personas as more aligned (77%) than survey-based or random baselines, particularly for personalized questions.
  - 65% of participants believed AI could understand human values, but only 35% believed AI could have them.
- Advantage over baselines:
  - Chat-based personas captured individual voice and specific lived experiences better than survey-based or random personas.
  - AI explanations helped participants audit and sometimes revise their self-perceptions.
- Experiments / evaluation:
  - Mixed-methods study with 20 participants from diverse cultural and professional backgrounds.
  - Participants engaged in 8+ chat sessions and a 2-hour semi-structured interview.
  - Metrics included alignment scores, Likert ratings, and qualitative feedback on AI-generated artifacts.
- Limitations and future work:
  - Small, young, and tech-savvy participant sample limits generalizability.
  - Over-representation of autonomy values and under-representation of tradition/power values in AI outputs.
  - Future work should explore older, less tech-savvy populations and refine models for cultural and linguistic nuances.
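The "within ±1 Likert point" figure reported above can be read as an agreement rate: the share of values for which the AI-inferred score lands no more than one point from the self-reported score. The sketch below shows one plausible way to compute such a rate for a single participant. The summary does not say whether the authors pooled across participants, values, or both, so the function name `within_tolerance_rate`, its parameters, and the toy scores are all assumptions.

```python
def within_tolerance_rate(self_reported, ai_inferred, tolerance=1.0):
    """Fraction of shared values whose two scores differ by at most `tolerance`.

    Both arguments map a value name to a Likert-style score.
    Only values present in both dicts are compared.
    """
    shared = sorted(set(self_reported) & set(ai_inferred))
    if not shared:
        raise ValueError("no values in common")
    hits = sum(
        1 for v in shared
        if abs(self_reported[v] - ai_inferred[v]) <= tolerance
    )
    return hits / len(shared)


# Made-up scores for illustration; not data from the study.
self_reported = {"self_direction": 5, "benevolence": 4, "tradition": 2, "power": 1}
ai_inferred = {"self_direction": 6, "benevolence": 4, "tradition": 4, "power": 2}

rate = within_tolerance_rate(self_reported, ai_inferred)
print(f"{rate:.1%} of values within ±1 point")
```

In this toy case three of the four values fall within one point, so the rate is 75%. A tolerance-based rate of this kind treats every value equally, so it would not by itself reveal a systematic skew such as the over-representation of autonomy values noted in the limitations.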
Summary
This study introduced VAPT, a toolkit for evaluating AI systems’ ability to extract, embody, and explain human values through conversational data. Using a month-long chatbot interaction, the study found moderate alignment between AI-inferred and self-reported values, with chat-based personas outperforming survey-based baselines in capturing individual voice and nuance. Participants appreciated the AI’s ability to surface latent patterns but raised concerns about privacy and the risks of “weaponized empathy.” The findings highlight the need for value-aligned conversational agents that prioritize user consent, self-reflection, and transparency to mitigate automation bias and preserve human agency.
https://hci.top/en/papers/chi/223539/2026