agentAR: Creating Augmented Reality Applications with Tool-Augmented LLM-based Autonomous Agents

Creating Augmented Reality (AR) applications requires expertise in both design and implementation, posing significant barriers to entry for non-expert users. While existing methods reduce some of this burden, they often fall short in flexibility or usability for complex or varied use cases. To address this, we introduce agentAR, an AR authoring system that leverages a tool-augmented large language model (LLM)-based autonomous agent to support end-to-end, in-situ AR application creation from natural language input. Built on an application structure and tool library derived from state-of-the-art AR research, the agent autonomously creates AR applications from natural language dialogue. We demonstrate the effectiveness of agentAR through a case study of six AR applications and a user study with twelve participants, showing that it significantly reduces user effort while supporting the creation of diverse and functional AR experiences.

2025 · Chenfei Zhu et al. · Tags: AR Navigation & Context Awareness; Generative AI (Text, Image, Music, Video); Human-LLM Collaboration · UIST
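To make the tool-augmented agent pattern concrete, here is a minimal runnable sketch of an LLM-driven tool-dispatch loop. The tool names (create_anchor, place_model), the JSON reply format, and the canned query_llm stub are hypothetical stand-ins for illustration, not agentAR's actual tool library or prompts.

    # Minimal sketch of a tool-augmented LLM agent loop (hypothetical tools,
    # not agentAR's actual design).
    import json

    TOOLS = {
        "create_anchor": lambda args: f"anchor created at {args['position']}",
        "place_model":   lambda args: f"{args['model']} placed on {args['anchor']}",
    }

    _SCRIPT = [  # canned LLM replies so the sketch runs without a model
        {"tool": "create_anchor", "args": {"position": "tabletop"}},
        {"done": True, "summary": "AR scene authored."},
    ]

    def query_llm(messages):
        # A real agent would call an LLM here; we replay a fixed script,
        # advancing one step per tool result already in the transcript.
        return json.dumps(_SCRIPT[sum(m["role"] == "tool" for m in messages)])

    def run_agent(user_request, max_steps=8):
        messages = [{"role": "user", "content": user_request}]
        for _ in range(max_steps):
            reply = json.loads(query_llm(messages))
            if reply.get("done"):
                return reply["summary"]
            result = TOOLS[reply["tool"]](reply["args"])   # execute chosen tool
            messages.append({"role": "tool", "content": result})  # feed back
        return "step budget exhausted"

    print(run_agent("Put a virtual lamp on my desk"))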
GesPrompt: Leveraging Co-Speech Gestures to Augment LLM-Based Interaction in Virtual Reality

Large Language Model (LLM)-based copilots have shown great potential in Extended Reality (XR) applications. However, users face challenges when describing 3D environments to copilots, given the difficulty of conveying spatial-temporal information through text or speech alone. To address this, we introduce GesPrompt, a multimodal XR interface that combines co-speech gestures with speech, allowing end-users to communicate more naturally and accurately with LLM-based copilots in XR environments. GesPrompt extracts spatial-temporal references from co-speech gestures, reducing the need for precise textual prompts and minimizing cognitive load for end-users. Our contributions include (1) a workflow for integrating gesture and speech input in the XR environment, (2) a prototype VR system that implements the workflow, and (3) a user study demonstrating its effectiveness in improving user communication in VR environments.

2025 · Xiyun Hu et al. · Tags: Hand Gesture Recognition; Mixed Reality Workspaces; Human-LLM Collaboration · DIS
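As a rough illustration of how gesture-derived spatial references could be fused with speech before prompting an LLM, the sketch below grounds deictic words to pointing targets by nearest timestamp; this alignment heuristic is an assumption made for illustration, not GesPrompt's published workflow.

    # Sketch: ground deictic words ("this", "there") in a speech transcript to
    # 3D positions from co-speech pointing gestures by nearest timestamp.
    DEICTICS = {"this", "that", "here", "there"}

    def ground_transcript(words, gestures):
        """words: [(t, word)]; gestures: [(t, (x, y, z))] pointing targets."""
        grounded = []
        for t, word in words:
            if word.lower() in DEICTICS and gestures:
                # pick the gesture closest in time to the spoken deictic
                _, pos = min(gestures, key=lambda g: abs(g[0] - t))
                grounded.append(f"{word}[{pos[0]:.2f},{pos[1]:.2f},{pos[2]:.2f}]")
            else:
                grounded.append(word)
        return " ".join(grounded)  # spatially grounded prompt text for the LLM

    words = [(0.0, "move"), (0.2, "this"), (0.5, "over"), (0.7, "there")]
    gestures = [(0.25, (1.0, 0.9, 0.3)), (0.72, (2.4, 0.9, 1.1))]
    print(ground_transcript(words, gestures))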
DesignFromX: Empowering Consumer-Driven Design Space Exploration through Feature Composition of Referenced Products

Industrial products are designed to satisfy the needs of consumers. The rise of generative artificial intelligence (GenAI) enables consumers to easily modify a product by prompting a generative model, opening up opportunities to involve consumers in exploring the product design space. However, consumers often struggle to articulate their preferred product features due to their unfamiliarity with terminology and their limited understanding of how product features are structured. We present DesignFromX, a system that empowers consumer-driven design space exploration by helping consumers design a product based on their preferences. Leveraging a GenAI-based framework, the system allows users to easily identify design features from product images and compose those features to generate conceptual images and 3D models of a new product. A user study with 24 participants demonstrates that DesignFromX lowers the barriers and frustration of consumer-driven design space exploration by enhancing both engagement and enjoyment for the participants.

2025 · Runlin Duan et al. · Tags: Generative AI (Text, Image, Music, Video); Creative Collaboration & Feedback Systems; Customizable & Personalized Objects · DIS
CARING-AI: Towards Authoring Context-aware Augmented Reality INstruction through Generative Artificial Intelligence

Context-aware AR instruction enables adaptive and in-situ learning experiences. However, hardware limitations and expertise requirements constrain the creation of such instructions. With recent developments in generative artificial intelligence (GenAI), current research tries to tackle these constraints by deploying AI-generated content (AIGC) in AR applications. However, our preliminary study with six AR practitioners revealed that current AIGC lacks the contextual information needed to adapt to varying application scenarios and is therefore of limited use for authoring. To leverage the generative power of GenAI for authoring AR instructions while capturing context, we developed CARING-AI, an AR system for authoring context-aware, humanoid-avatar-based instructions with GenAI. By navigating the environment, users naturally provide contextual information used to generate humanoid-avatar animations as AR instructions that blend into the context spatially and temporally. We showcase three application scenarios of CARING-AI: Asynchronous Instructions, Remote Instructions, and Ad Hoc Instructions, drawn from a design space of AIGC in AR instructions. With two user studies (N=12), we assessed the system's usability and demonstrated the ease and effectiveness of authoring with GenAI.

2025 · Jingyu Shi et al. (Purdue University, Elmore Family School of Electrical and Computer Engineering) · Tags: AR Navigation & Context Awareness; Generative AI (Text, Image, Music, Video); Human-LLM Collaboration · CHI
"Kya family planning after marriage hoti hai?": Integrating Cultural Sensitivity in an LLM Chatbot for Reproductive HealthAccess to sexual and reproductive health information remains a challenge in many communities globally, due to cultural taboos and limited availability of healthcare providers. Public health organizations are increasingly turning to Large Language Models (LLMs) to improve access to timely and personalized information. However, recent HCI scholarship indicates that significant challenges remain in incorporating context awareness and mitigating bias in LLMs. In this paper, we study the development of a culturally-appropriate LLM-based chatbot for reproductive health with underserved women in urban India. Through user interactions, focus groups, and interviews with multiple stakeholders, we examine the chatbot’s response to sensitive and highly contextual queries on reproductive health. Our findings reveal strengths and limitations of the system in capturing local context, and complexities around what constitutes ``culture''. Finally, we discuss how local context might be better integrated, and present a framework to inform the design of culturally-sensitive chatbots for community health.2025RDRoshini Deva et al.Emory University, Biomedical InformaticsHuman-LLM CollaborationReproductive & Women's HealthCHI
Transparent Barriers: Natural Language Access Control Policies for XR-Enhanced Everyday Objects

Extended Reality (XR) headsets, which overlay digital content onto the physical world, are gradually finding their way into daily life. This integration raises significant concerns about privacy and access control, especially in shared spaces where XR applications interact with everyday objects. These issues remain underexamined in the absence of widespread XR applications, and studies in shared spaces are needed to make smooth progress. This study evaluated a prototype system that facilitates natural language policy creation for flexible, context-aware access control of personal objects. We assessed its usability, focusing on the balance between precision and user effort in creating access control policies. Qualitative interviews and task-based interactions provided insights into users' preferences and behaviors, informing future design directions. Findings revealed diverse user needs for controlling access to personal items in various situations, emphasizing the need for flexible, user-friendly access control in XR-enhanced shared spaces that respects boundaries and considers social contexts.

2025 · Kentaro Taninaka et al. (Keio University, Graduate School of Media and Governance) · Tags: AR Navigation & Context Awareness; Privacy by Design & User Control · CHI
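As an illustration of what a natural language policy might compile down to, here is a minimal sketch of a structured access rule and its check; the schema (obj, allowed_roles, context) is a hypothetical assumption, not the prototype's actual policy format.

    # Sketch of one way a policy like "only my family can see my medication
    # shelf at home" might be represented after parsing (schema is invented).
    from dataclasses import dataclass

    @dataclass
    class AccessRule:
        obj: str             # the protected everyday object
        allowed_roles: set   # who may see/augment it
        context: str         # situational scope, e.g. "home"

    def is_allowed(rule, viewer_role, location):
        # grant access only when both role and context match the rule
        return viewer_role in rule.allowed_roles and location == rule.context

    rule = AccessRule(obj="medication shelf",
                      allowed_roles={"family"}, context="home")
    print(is_allowed(rule, "family", "home"))   # True
    print(is_allowed(rule, "guest", "home"))    # False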
avaTTAR: Table Tennis Stroke Training with On-body and Detached Visualization in Augmented Reality

Table tennis stroke training is a critical aspect of player development. We designed a new augmented reality (AR) system, avaTTAR, for table tennis stroke training. The system provides both "on-body" (first-person view) and "detached" (third-person view) visual cues, enabling users to visualize target strokes and correct their attempts effectively with this dual-perspective setup. By combining pose estimation algorithms and IMU sensors, avaTTAR captures and reconstructs users' 3D body pose and paddle orientation during practice, allowing real-time comparison with expert strokes. Through a user study, we affirm avaTTAR's capacity to improve player experience and training results.

2024 · Dizhi Ma et al. · Tags: Full-Body Interaction & Embodied Input; AR Navigation & Context Awareness; VR Medical Training & Rehabilitation · UIST
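A minimal sketch of per-joint comparison between a user's reconstructed pose and an expert stroke, of the kind that could drive corrective AR cues; the joint names and error threshold are illustrative assumptions, not avaTTAR's actual representation.

    # Sketch: flag joints whose position deviates from the expert stroke by
    # more than a threshold, to highlight them in an AR overlay.
    import numpy as np

    def pose_errors(user_pose, expert_pose, threshold=0.08):
        """Poses: {joint_name: (x, y, z)}. Returns joints exceeding threshold."""
        flagged = {}
        for joint, target in expert_pose.items():
            err = float(np.linalg.norm(np.asarray(user_pose[joint]) -
                                       np.asarray(target)))
            if err > threshold:        # this joint needs a corrective cue
                flagged[joint] = err
        return flagged

    expert = {"wrist": (0.42, 1.10, 0.30), "elbow": (0.35, 1.02, 0.18)}
    user   = {"wrist": (0.55, 1.05, 0.31), "elbow": (0.36, 1.01, 0.19)}
    print(pose_errors(user, expert))   # {'wrist': ...}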
ChatDirector: Enhancing Video Conferencing with Space-Aware Scene Rendering and Speech-Driven Layout Transition

Remote video conferencing systems (RVCS) are widely adopted in personal and professional communication. However, they often lack the co-presence experience of in-person meetings, largely due to the absence of intuitive visual cues and clear spatial relationships among remote participants, which can lead to speech interruptions and loss of attention. This paper presents ChatDirector, a novel RVCS that overcomes these limitations by incorporating space-aware visual presence and speech-aware attention transition assistance. ChatDirector employs a real-time pipeline that converts participants' RGB video streams into 3D portrait avatars and renders them in a virtual 3D scene. We also contribute a decision tree algorithm that directs avatar layouts and behaviors based on participants' speech states. We report results from a user study (N=16) evaluating ChatDirector. The satisfactory algorithm performance and complimentary subjective user feedback indicate that ChatDirector significantly enhances communication efficacy and user engagement.

2024 · Xun Qian et al. (Purdue University) · Tags: Social & Collaborative VR; Mixed Reality Workspaces · CHI
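For intuition about speech-driven layout direction, here is a hand-written decision rule in the spirit of the paper's decision tree; the speech states and layout names are invented for illustration, not the published algorithm.

    # Sketch: pick an avatar layout from participants' speech states.
    def choose_layout(active_speakers, addressed_to_me):
        if not active_speakers:
            return "neutral_side_by_side"   # nobody talking: default layout
        if addressed_to_me:
            return "speaker_faces_viewer"   # turn the speaker toward me
        if len(active_speakers) == 1:
            return "attend_to_speaker"      # rotate avatars toward speaker
        return "pairwise_conversation"      # multiple speakers face each other

    print(choose_layout(active_speakers=["Ann"], addressed_to_me=True))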
ClassMeta: Designing Interactive Virtual Classmate to Promote VR Classroom Participation

Peer influence plays a crucial role in promoting classroom participation, where behaviors from active students can contribute to a collective classroom learning experience. However, the presence of such active students depends on several conditions and is not consistently available across all circumstances. Recently, Large Language Models (LLMs) such as GPT have demonstrated the ability to simulate diverse human behaviors convincingly, owing to their capacity to generate contextually coherent responses based on their role settings. Inspired by this advancement, we designed ClassMeta, a GPT-4-powered agent that helps promote classroom participation by playing the role of an active student. These agents, embodied as 3D avatars in virtual reality, interact with actual instructors and students through both spoken language and body gestures. We conducted a comparative study to investigate the potential of ClassMeta for improving the overall learning experience of the class.

2024 · Ziyi Liu et al. (Purdue University) · Tags: Social & Collaborative VR; Human-LLM Collaboration; Intelligent Tutoring Systems & Learning Analytics · CHI
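A minimal sketch of role-prompting GPT-4 to act as an active student, assuming the official openai Python package; the system prompt wording here is invented, not ClassMeta's actual role setting.

    # Sketch: ask GPT-4 for an in-character classmate utterance.
    from openai import OpenAI  # assumes the official openai package

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def classmate_reply(lecture_context, last_utterance):
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content":
                 "You are an enthusiastic student in a VR classroom. "
                 "Ask short, relevant questions and volunteer answers."},
                {"role": "user", "content":
                 f"Lecture so far: {lecture_context}\n"
                 f"Instructor just said: {last_utterance}"},
            ],
        )
        return resp.choices[0].message.content  # spoken via TTS by the avatar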
Ubi-TOUCH: Ubiquitous Tangible Object Utilization through Consistent Hand-Object Interaction in Augmented Reality

Utilizing everyday objects as tangible proxies in Augmented Reality (AR) provides users with haptic feedback while interacting with virtual objects. Yet existing methods focus on the attributes of the objects themselves, constraining the possible proxies and yielding inconsistency in user experience. We therefore propose Ubi-TOUCH, an AR system that helps users find a wider range of tangible proxies for AR applications based on the hand-object interaction (HOI) they desire. Given a target interaction with a virtual object, the system scans the user's vicinity and recommends object proxies affording similar interactions. Upon user selection, the system simultaneously tracks and maps the user's physical HOI to the virtual HOI, adaptively optimizing the object's 6-DoF pose and the hand gesture to keep the two interactions consistent. We showcase promising use cases of Ubi-TOUCH, such as remote tutorials, AR gaming, and smart home control. Finally, we evaluate the performance and usability of Ubi-TOUCH with a user study.

2023 · Rahul Jain et al. · Tags: Full-Body Interaction & Embodied Input; AR Navigation & Context Awareness; Mixed Reality Workspaces · UIST
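To illustrate interaction-based proxy recommendation, here is a toy scoring function that ranks candidate objects by similarity of two hand-object interaction features; the features and weights are assumptions for illustration, not Ubi-TOUCH's actual optimization.

    # Sketch: rank nearby physical objects as proxies for a virtual object by
    # how closely grasp width and surface curvature match the target HOI.
    def proxy_score(candidate, target, weights=(0.7, 0.3)):
        """candidate/target: dicts with 'grasp_width' (m) and 'curvature'."""
        dw = abs(candidate["grasp_width"] - target["grasp_width"])
        dc = abs(candidate["curvature"] - target["curvature"])
        return weights[0] * dw + weights[1] * dc   # lower is a better proxy

    target = {"grasp_width": 0.04, "curvature": 0.9}   # virtual flashlight
    objects = {"marker": {"grasp_width": 0.015, "curvature": 0.95},
               "bottle": {"grasp_width": 0.045, "curvature": 0.85}}
    print(min(objects, key=lambda k: proxy_score(objects[k], target)))  # bottle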
LearnIoTVR: An End-to-end Virtual Reality Environment Providing Authentic Learning Experiences for Internet of Things

The rapid growth of Internet-of-Things (IoT) applications has generated interest from many industries and a need for graduates with relevant knowledge. An IoT system comprises spatially distributed interactions between humans and various interconnected IoT components. These interactions are contextualized within their ambient environment, which makes it hard for educators to recreate authentic tasks for hands-on IoT learning. We propose LearnIoTVR, an end-to-end virtual reality (VR) learning environment that helps students acquire IoT knowledge through immersive design, programming, and exploration of real-world environments empowered by IoT (e.g., a smart house). Students start the learning process by installing virtual IoT components we created in different locations inside the VR environment, so that learning is situated in the same context where the IoT is applied. With our custom-designed 3D block-based language, students can program IoT behaviors directly within VR and get immediate feedback on their programming outcomes. In the user study, we evaluated learning outcomes among students using LearnIoTVR with a pre- and post-test to understand to what extent engagement with LearnIoTVR leads to gains in programming skills and IoT competencies. We also examined which aspects of LearnIoTVR support usability and the learning of programming skills compared to a traditional desktop-based learning environment. The results from these studies were promising, and we gathered insightful user feedback that suggests further expansions of this system.

2023 · Zhengzhe Zhu et al. (Purdue University) · Tags: AR Navigation & Context Awareness; Programming Education & Computational Thinking; K-12 Digital Education Tools · CHI
InstruMentAR: Auto-Generation of Augmented Reality Tutorials for Operating Digital Instruments Through Recording Embodied Demonstration

Augmented Reality tutorials, which provide necessary context by directly superimposing visual guidance on the physical referent, represent an effective way of scaffolding complex instrument operations. However, current AR tutorial authoring processes are not seamless, as they require users to continuously alternate between operating instruments and interacting with virtual elements. We present InstruMentAR, a system that automatically generates AR tutorials by recording user demonstrations. We design a multimodal approach that fuses gestural information and hand-worn pressure sensor data to detect and register the user's step-by-step manipulations on the control panel. With this information, the system autonomously generates virtual cues, at the respective locations and with designated scales, for each step. Voice recognition and background capture are employed to automate the creation of text and images as AR content. For novice users receiving the authored AR tutorials, we provide immediate feedback through haptic modules. We compared InstruMentAR with traditional systems in a user study.

2023 · Ziyi Liu et al. (Purdue University) · Tags: In-Vehicle Haptic, Audio & Multimodal Feedback; Hand Gesture Recognition; AR Navigation & Context Awareness · CHI
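A minimal sketch of segmenting a demonstration into steps by thresholding the hand-worn pressure signal; the threshold and minimum duration are illustrative assumptions, and the actual system additionally fuses gestural information.

    # Sketch: find (start, end) frame ranges where pressure indicates a
    # manipulation step, ignoring brief accidental touches.
    def detect_steps(pressure, threshold=0.5, min_len=3):
        """pressure: per-frame sensor readings. Returns (start, end) frames."""
        steps, start = [], None
        for i, p in enumerate(pressure):
            if p >= threshold and start is None:
                start = i                      # manipulation begins
            elif p < threshold and start is not None:
                if i - start >= min_len:       # keep only sustained contact
                    steps.append((start, i))
                start = None
        if start is not None:
            steps.append((start, len(pressure)))
        return steps

    signal = [0.1, 0.2, 0.8, 0.9, 0.9, 0.7, 0.1, 0.1, 0.6, 0.8, 0.9, 0.2]
    print(detect_steps(signal))  # [(2, 6), (8, 11)]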
Ubi Edge: Authoring Edge-Based Opportunistic Tangible User Interfaces in Augmented Reality

Edges are among the most ubiquitous geometric features of physical objects. They provide accurate haptic feedback and easy-to-track features for camera systems, making them an ideal basis for Tangible User Interfaces (TUIs) in Augmented Reality (AR). We introduce Ubi Edge, an AR authoring tool that allows end-users to turn edges of everyday objects into TUI inputs controlling varied digital functions. We develop an integrated AR device and a vision-based detection pipeline that tracks 3D edges and detects touch interactions between fingers and edges. Leveraging the spatial awareness of AR, users can select an edge simply by sliding a finger along it and then make the edge interactive by connecting it to various digital functions. We demonstrate four use cases, including multi-function controllers, smart homes, games, and TUI-based tutorials. We also evaluated our system's usability through a two-session user study, with positive qualitative and quantitative results.

2023 · Fengming He et al. (Purdue University) · Tags: Shape-Changing Interfaces & Soft Robotic Materials; AR Navigation & Context Awareness · CHI
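For intuition about slide-to-select, here is a sketch that tests whether a tracked fingertip stays within tolerance of a 3D edge segment across frames; the thresholds are assumptions, not the paper's detection pipeline.

    # Sketch: treat a fingertip track as selecting an edge when most frames
    # lie within a small distance of the edge segment.
    import numpy as np

    def dist_to_edge(p, a, b):
        """Distance from point p to segment ab (all 3-vectors)."""
        p, a, b = map(np.asarray, (p, a, b))
        t = np.clip(np.dot(p - a, b - a) / np.dot(b - a, b - a), 0.0, 1.0)
        return float(np.linalg.norm(p - (a + t * (b - a))))

    def is_sliding(fingertips, edge, tol=0.01, min_frames=5):
        near = [dist_to_edge(p, *edge) < tol for p in fingertips]
        return sum(near) >= min_frames   # mostly on-edge -> treat as select

    edge = ((0, 0, 0), (0.3, 0, 0))                     # 30 cm table edge
    track = [(0.05 * i, 0.004, 0.0) for i in range(6)]  # finger slides along
    print(is_sliding(track, edge))  # True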
ARnnotate: An Augmented Reality Interface for Collecting Custom Dataset of 3D Hand-Object Interaction Pose Estimation

Vision-based 3D pose estimation has substantial potential in hand-object interaction applications and requires user-specified datasets to achieve robust performance. We propose ARnnotate, an Augmented Reality (AR) interface enabling end-users to create custom data using a hand-tracking-capable AR device. Unlike other dataset collection strategies, ARnnotate first guides a user to manipulate a virtual bounding box and records its poses and the user's hand joint positions as the labels. Leveraging the spatial awareness of AR, the user manipulates the corresponding physical object while following the in-situ AR animation of the bounding box and hand model, and ARnnotate captures the user's first-person view as the images of the dataset. A 12-participant user study demonstrated the system's usability, the spatial accuracy of the labels, the satisfactory performance of deep neural networks trained on data collected with ARnnotate, and positive subjective feedback from users.

2022 · Xun Qian et al. · Tags: Hand Gesture Recognition; Eye Tracking & Gaze Interaction; Human Pose & Activity Recognition · UIST
MechARspace: An Authoring System Enabling Bidirectional Binding of AR with Toys in Real-time

Augmented Reality (AR), which blends physical and virtual worlds, presents the possibility of enhancing traditional toy design. By leveraging bidirectional virtual-physical interactions between humans and the designed artifact, AR enhancement can make traditional toys more playful and interactive. However, designers are constrained by the complexity and technical difficulty of current AR content creation processes. We propose MechARspace, an immersive authoring system that supports users in creating toy-AR interactions through direct manipulation and visual programming. Based on an elicitation study, we propose a bidirectional interaction model that maps both ways: from toy inputs to reactions of the AR content, and from the AR content to toy reactions. This model guides the design of our system, which includes a plug-and-play hardware toolkit and an in-situ authoring interface. We present multiple use cases enabled by MechARspace to validate this interaction model. Finally, we evaluate our system with a two-session user study where users first recreated a set of predefined toy-AR interactions and then implemented their own AR-enhanced toy designs.

2022 · Zhengzhe Zhu et al. · Tags: Mixed Reality Workspaces · UIST
ScalAR: Authoring Semantically Adaptive Augmented Reality Experiences in Virtual Reality

Augmented Reality (AR) experiences tightly associate virtual content with environmental entities. However, the dissimilarity of different environments limits adaptive AR content behaviors under large-scale deployment. We propose ScalAR, an integrated workflow enabling designers to author semantically adaptive AR experiences in Virtual Reality (VR). First, potential AR consumers collect local scenes with a semantic understanding technique. ScalAR then synthesizes numerous similar scenes. In VR, a designer authors the AR content's semantic associations and validates the design while immersed in the provided scenes. We adopt a decision-tree-based algorithm to fit the designer's demonstrations as a semantic adaptation model that deploys the authored AR experience in a physical scene. We further showcase two application scenarios authored with ScalAR and conduct a two-session user study, where the quantitative results demonstrate the accuracy of the AR content rendering and the qualitative results show the usability of ScalAR.

2022 · Xun Qian et al. (Purdue University) · Tags: AR Navigation & Context Awareness; Mixed Reality Workspaces · CHI
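Since the paper fits designer demonstrations with a decision-tree-based algorithm, here is a toy version using scikit-learn; the two scene features and placement labels are invented stand-ins for ScalAR's semantic representation, not its actual feature set.

    # Sketch: learn a placement rule from a few authored demonstrations so it
    # generalizes to new scenes.
    from sklearn.tree import DecisionTreeClassifier

    # Each demonstration: (distance to nearest table in m, room area in m^2);
    # label: where the designer placed the virtual screen in that scene.
    X = [[0.2, 12.0], [0.3, 14.0], [2.5, 30.0], [2.8, 28.0]]
    y = ["on_table", "on_table", "on_wall", "on_wall"]

    model = DecisionTreeClassifier(max_depth=2).fit(X, y)
    print(model.predict([[0.4, 18.0]]))   # -> ['on_table'] in a new scene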
Towards Modeling of Virtual Reality Welding Simulators to promote Accessible and Scalable Training

The US manufacturing industry is currently facing a welding workforce shortage, largely due to the inadequacy of widespread welding training. To address this challenge, we present a Virtual Reality (VR)-based training system aimed at transforming state-of-the-art welding simulations and in-person instruction into a widely accessible and engaging platform. We applied backward design principles to design a low-cost welding simulator in the form of modularized units, in active consultation with welding training experts. Using a minimum viable prototype, we conducted a user study with 24 novices to test the system's usability. Our findings show (1) greater effectiveness of the system in transferring skills to real-world environments compared to accessible video-based alternatives, and (2) that visuo-haptic guidance during virtual welding enhances performance and provides a realistic learning experience. Using this solution, we expect inexperienced users to achieve competencies faster and be better prepared to enter actual work environments.

2022 · Ananya Ipsita et al. (Purdue University) · Tags: VR Medical Training & Rehabilitation; Surgical Assistance & Medical Training · CHI
GesturAR: An Authoring System for Creating Freehand Interactive Augmented Reality Applications

Freehand gestures are an essential input modality for modern Augmented Reality (AR) user experiences. However, developing AR applications with customized hand interactions remains a challenge for end-users. We therefore propose GesturAR, an end-to-end authoring tool that supports users in creating in-situ freehand AR applications through embodied demonstration and visual programming. During authoring, users can intuitively demonstrate customized gesture inputs while referring to the spatial and temporal context. Based on a taxonomy of gestures in AR, we propose a hand interaction model that maps gesture inputs to reactions of the AR content. Users can thus author comprehensive freehand applications using trigger-action visual programming and instantly experience the results in AR. We further demonstrate multiple application scenarios enabled by GesturAR, such as interactive virtual objects, robots, and avatars; room-level interactive AR spaces; and embodied AR presentations. Finally, we evaluate the performance and usability of GesturAR through a user study.

2021 · Tianyi Wang et al. · Tags: Hand Gesture Recognition; AR Navigation & Context Awareness · UIST
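A minimal sketch of the kind of trigger-action table that gesture-to-reaction visual programming might produce; the gesture labels and reactions are illustrative, not GesturAR's internal model.

    # Sketch: fire AR content reactions when a recognized gesture matches an
    # authored trigger-action rule.
    from dataclasses import dataclass, field

    @dataclass
    class Rule:
        gesture: str   # recognized freehand gesture label
        target: str    # AR content the gesture applies to
        action: str    # reaction to run on the target

    @dataclass
    class App:
        rules: list = field(default_factory=list)

        def on_gesture(self, gesture, target):
            for r in self.rules:   # fire every matching trigger-action rule
                if r.gesture == gesture and r.target == target:
                    print(f"{r.target}: {r.action}")

    app = App([Rule("pinch", "virtual_dog", "sit"),
               Rule("wave", "avatar", "wave_back")])
    app.on_gesture("pinch", "virtual_dog")   # virtual_dog: sit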
ProcessAR: An Augmented Reality-Based Tool to Create In-Situ Procedural 2D/3D AR Instructions

Augmented reality (AR) is an efficient form of delivering spatial information and has great potential for training workers. However, AR is still not widely used for such scenarios due to the technical skills and expertise required to create interactive AR content. We developed ProcessAR, an AR-based system for developing 2D/3D content that captures subject matter experts' (SMEs') environment-object interactions in situ. The design space for ProcessAR was identified through formative interviews with AR programming experts and SMEs, alongside a comparative design study with SMEs and novice users. To enable smooth workflows, ProcessAR locates and identifies tools and objects in the workspace through computer vision when the author looks at them. We explored additional features such as embedding 2D videos with detected objects and user-adaptive triggers. In a final user evaluation comparing ProcessAR to a baseline AR authoring environment, users preferred ProcessAR according to our qualitative questionnaire.

2021 · Subramanian Chidambaram et al. · Tags: AR Navigation & Context Awareness; Context-Aware Computing; Prototyping & User Testing · DIS
RobotAR: An Augmented Reality Compatible Teleconsulting Robotics Toolkit for Augmented Makerspaces Experiences

Distance learning faces a critical moment in balancing high-quality education for remote students with engaging, hands-on learning. This is particularly relevant for project-based classrooms and makerspaces, which typically require extensive troubleshooting and example demonstrations from instructors. We present RobotAR, a teleconsulting robotics toolkit for creating Augmented Reality (AR) makerspaces. We present the hardware and software for an AR-compatible robot, which behaves as the student's voice assistant and can be embodied by the instructor for teleconsultation. As a desktop-based teleconsulting agent, the instructor controls the robot's joints and position to better focus on areas of interest inside the workspace. Similarly, the instructor has access to the student's virtual environment and can create AR content to aid the student in problem-solving. We also performed a user study comparing current techniques for hands-on distance learning with an implementation of our toolkit.

2021 · Ana M Villanueva et al. (Purdue University) · Tags: Mixed Reality Workspaces; Remote Work Tools & Experience; Warehouse & Industrial Robots · CHI