ChatDirector: Enhancing Video Conferencing with Space-Aware Scene Rendering and Speech-Driven Layout Transition
Remote video conferencing systems (RVCS) are widely adopted in personal and professional communication. However, they often lack the co-presence experience of in-person meetings, largely due to the absence of intuitive visual cues and clear spatial relationships among remote participants, which can lead to speech interruptions and loss of attention. This paper presents ChatDirector, a novel RVCS that overcomes these limitations by incorporating space-aware visual presence and speech-aware attention transition assistance. ChatDirector employs a real-time pipeline that converts participants' RGB video streams into 3D portrait avatars and renders them in a virtual 3D scene. We also contribute a decision tree algorithm that directs the avatar layouts and behaviors based on participants' speech states. We report results from a user study (N=16) evaluating ChatDirector. The satisfactory algorithm performance and positive subjective user feedback indicate that ChatDirector significantly enhances communication efficacy and user engagement.
2024 · Xun Qian et al. · Purdue University · Social & Collaborative VR · Mixed Reality Workspaces · CHI

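As a rough illustration of the speech-driven layout direction described above, the sketch below maps a few hypothetical speech states to layout modes with simple rules; the state fields and layout names are assumptions for this example, not the paper's actual taxonomy or decision tree.

```python
# Hypothetical sketch of a speech-state-driven layout director (not ChatDirector's
# actual algorithm); speech-state fields and layout names are illustrative only.
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class SpeechState:
    speaking: Set[str]            # participants currently speaking
    addressee: Optional[str]      # participant being addressed, if detected


def choose_layout(state: SpeechState, local_id: str) -> str:
    """Pick a scene layout for the local viewer from the current speech state."""
    remote_speakers = state.speaking - {local_id}
    if not state.speaking:
        return "overview"                 # silence: neutral group view
    if not remote_speakers:
        return "face-audience"            # only the local user is speaking
    if len(remote_speakers) == 1 and state.addressee == local_id:
        return "face-speaker"             # one remote speaker addressing me
    if len(remote_speakers) >= 2:
        return "side-by-side-speakers"    # multiple speakers shown together
    return "overview"


# Example: two remote participants talking to each other.
print(choose_layout(SpeechState(speaking={"alice", "bob"}, addressee=None), "me"))
```
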
Ubi-TOUCH: Ubiquitous Tangible Object Utilization through Consistent Hand-Object Interaction in Augmented Reality
Utilizing everyday objects as tangible proxies for Augmented Reality (AR) provides users with haptic feedback while interacting with virtual objects. Yet, existing methods focus on the attributes of the objects, constraining the possible proxies and yielding inconsistency in user experience. Therefore, we propose Ubi-TOUCH, an AR system that assists users in seeking a wider range of tangible proxies for AR applications based on the hand-object interaction (HOI) they desire. Given the target interaction with a virtual object, the system scans the user's vicinity and recommends object proxies with similar interactions. Upon user selection, the system simultaneously tracks and maps the user's physical HOI to the virtual HOI, adaptively optimizing the object's 6-DoF pose and the hand gesture to keep the interactions consistent. We showcase promising use cases of Ubi-TOUCH, such as remote tutorials, AR gaming, and smart home control. Finally, we evaluate the performance and usability of Ubi-TOUCH with a user study.
2023 · Rahul Jain et al. · Full-Body Interaction & Embodied Input · AR Navigation & Context Awareness · Mixed Reality Workspaces · UIST

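To make the proxy-recommendation idea concrete, here is a minimal sketch that ranks nearby objects by the distance between hand-object interaction descriptors; the three-number feature vector and the example objects are assumptions made for illustration, not Ubi-TOUCH's actual features or optimizer.

```python
# Illustrative proxy ranking by hand-object interaction (HOI) similarity.
# The descriptor (grasp width, contact points, rotational DoF) is an assumed,
# simplified feature set, not the system's real representation.
import numpy as np


def hoi_descriptor(grasp_width_cm: float, contact_points: int, rotational_dof: int):
    return np.array([grasp_width_cm, contact_points, rotational_dof], dtype=float)


def rank_proxies(target, candidates):
    """Sort candidate object names by descriptor distance to the target HOI."""
    distances = {name: float(np.linalg.norm(feat - target)) for name, feat in candidates.items()}
    return sorted(distances, key=distances.get)


virtual_doorknob = hoi_descriptor(grasp_width_cm=5.0, contact_points=5, rotational_dof=1)
nearby_objects = {
    "mug": hoi_descriptor(8.0, 4, 1),
    "jar_lid": hoi_descriptor(6.0, 5, 1),
    "tv_remote": hoi_descriptor(4.5, 3, 0),
}
print(rank_proxies(virtual_doorknob, nearby_objects))  # jar_lid ranks closest here
```
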
LearnIoTVR: An End-to-end Virtual Reality Environment Providing Authentic Learning Experiences for Internet of Things
The rapid growth of Internet-of-Things (IoT) applications has generated interest from many industries and a need for graduates with relevant knowledge. An IoT system comprises spatially distributed interactions between humans and various interconnected IoT components. These interactions are contextualized within their ambient environment, which makes it difficult for educators to recreate authentic tasks for hands-on IoT learning. We propose LearnIoTVR, an end-to-end virtual reality (VR) learning environment that helps students acquire IoT knowledge through immersive design, programming, and exploration of real-world environments empowered by IoT (e.g., a smart house). Students start the learning process by installing the virtual IoT components we created in different locations inside the VR environment, so that learning is situated in the same context where the IoT is applied. With our custom-designed 3D block-based language, students can program IoT behaviors directly within VR and get immediate feedback on their programming outcome. In a user study, we evaluated learning outcomes with a pre- and post-test to understand to what extent engagement in LearnIoTVR leads to gains in programming skills and IoT competencies. Additionally, we examined which aspects of LearnIoTVR support usability and the learning of programming skills compared to a traditional desktop-based learning environment. The results from these studies were promising, and we gathered insightful user feedback that inspires further expansion of the system.
2023 · Zhengzhe Zhu et al. · Purdue University · AR Navigation & Context Awareness · Programming Education & Computational Thinking · K-12 Digital Education Tools · CHI

InstruMentAR: Auto-Generation of Augmented Reality Tutorials for Operating Digital Instruments Through Recording Embodied Demonstration
Augmented Reality tutorials, which provide necessary context by directly superimposing visual guidance on the physical referent, are an effective way of scaffolding complex instrument operations. However, current AR tutorial authoring processes are not seamless, as they require users to continuously alternate between operating instruments and interacting with virtual elements. We present InstruMentAR, a system that automatically generates AR tutorials by recording user demonstrations. We design a multimodal approach that fuses gestural information and hand-worn pressure sensor data to detect and register the user's step-by-step manipulations on the control panel. With this information, the system autonomously generates virtual cues with designated scales at the respective locations for each step. Voice recognition and background capture are employed to automate the creation of text and images as AR content. For novice users receiving the authored AR tutorials, we provide immediate feedback through haptic modules. We compared InstruMentAR with traditional systems in a user study.
2023 · Ziyi Liu et al. · Purdue University · In-Vehicle Haptic, Audio & Multimodal Feedback · Hand Gesture Recognition · AR Navigation & Context Awareness · CHI

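A small, hedged sketch of the multimodal fusion idea: register a manipulation step only when the tracked fingertip is close to a known control and the worn pressure sensor reports a press. The distance threshold, pressure scale, and function name are assumptions for this example.

```python
# Assumed fusion rule for step detection: nearby fingertip + pressed sensor.
# Thresholds and coordinate conventions are illustrative, not InstruMentAR's.
import math


def step_detected(fingertip_xyz, control_xyz, pressure,
                  near_cm: float = 2.0, press_threshold: float = 0.6) -> bool:
    """Return True when a press on the given control should be registered."""
    dist_cm = math.dist(fingertip_xyz, control_xyz) * 100.0  # meters -> cm
    return dist_cm <= near_cm and pressure >= press_threshold


# Fingertip 1 cm away from the power button with a firm press -> step registered.
print(step_detected((0.10, 0.20, 0.30), (0.10, 0.21, 0.30), pressure=0.8))  # True
```
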
XAIR: A Framework of Explainable AI in Augmented Reality
Explainable AI (XAI) has established itself as an important component of AI-driven interactive systems. With Augmented Reality (AR) becoming more integrated into daily life, the role of XAI becomes essential in AR as well, because end-users will frequently interact with intelligent services. However, it is unclear how to design effective XAI experiences for AR. We propose XAIR, a design framework that addresses when, what, and how to provide explanations of AI output in AR. The framework is based on a multi-disciplinary literature review of XAI and HCI research, a large-scale survey probing 500+ end-users' preferences for AR-based explanations, and three workshops with 12 experts collecting their insights about XAI design in AR. XAIR's utility and effectiveness were verified via a study with 10 designers and another study with 12 end-users. XAIR provides guidelines for designers, inspiring them to identify new design opportunities and achieve effective XAI designs in AR.
2023 · Xuhai Xu et al. · Reality Labs Research, University of Washington · AR Navigation & Context Awareness · Explainable AI (XAI) · CHI

Ubi Edge: Authoring Edge-Based Opportunistic Tangible User Interfaces in Augmented Reality
Edges are one of the most ubiquitous geometric features of physical objects. They provide accurate haptic feedback and easy-to-track features for camera systems, making them an ideal basis for Tangible User Interfaces (TUIs) in Augmented Reality (AR). We introduce Ubi Edge, an AR authoring tool that allows end-users to customize edges on everyday objects as TUI inputs that control varied digital functions. We develop an integrated AR device and a vision-based detection pipeline that tracks 3D edges and detects touch interaction between fingers and edges. Leveraging the spatial awareness of AR, users can simply select an edge by sliding a finger along it and then make the edge interactive by connecting it to various digital functions. We demonstrate four use cases, including multi-function controllers, smart homes, games, and TUI-based tutorials. We also evaluated our system's usability through a two-session user study, in which both the qualitative and quantitative results were positive.
2023 · Fengming He et al. · Purdue University · Shape-Changing Interfaces & Soft Robotic Materials · AR Navigation & Context Awareness · CHI

ARnnotate: An Augmented Reality Interface for Collecting Custom Dataset of 3D Hand-Object Interaction Pose Estimation
Vision-based 3D pose estimation has substantial potential in hand-object interaction applications and requires user-specified datasets to achieve robust performance. We propose ARnnotate, an Augmented Reality (AR) interface that enables end-users to create custom data using a hand-tracking-capable AR device. Unlike other dataset collection strategies, ARnnotate first guides a user to manipulate a virtual bounding box and records its poses and the user's hand joint positions as the labels. Leveraging the spatial awareness of AR, the user then manipulates the corresponding physical object while following the in-situ AR animation of the bounding box and hand model, and ARnnotate captures the user's first-person view as the images of the dataset. A 12-participant user study demonstrated the system's usability in terms of the spatial accuracy of the labels, the satisfactory performance of deep neural networks trained with data collected by ARnnotate, and the users' subjective feedback.
2022 · Xun Qian et al. · Hand Gesture Recognition · Eye Tracking & Gaze Interaction · Human Pose & Activity Recognition · UIST

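For illustration, the record below shows the kind of per-frame sample such a collection workflow could emit: a first-person image reference plus the bounding-box pose and hand-joint positions as labels. The JSON schema, field names, and values are assumptions, not ARnnotate's actual dataset format.

```python
# Hypothetical per-frame annotation record; schema and values are illustrative.
import json

sample = {
    "image": "frames/000123.png",               # first-person RGB frame
    "object_box": {
        "translation_m": [0.12, -0.03, 0.45],   # box center in the camera frame
        "rotation_xyzw": [0.0, 0.0, 0.0, 1.0],  # orientation quaternion
        "size_m": [0.08, 0.08, 0.12],           # box dimensions
    },
    "hand_joints_m": [[0.10, -0.02, 0.43]] * 21,  # 21 joints, placeholder values
}
print(json.dumps(sample)[:80] + " ...")
```
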
ScalAR: Authoring Semantically Adaptive Augmented Reality Experiences in Virtual Reality
Augmented Reality (AR) experiences tightly associate virtual contents with environmental entities. However, the dissimilarity of different environments limits adaptive AR content behaviors under large-scale deployment. We propose ScalAR, an integrated workflow enabling designers to author semantically adaptive AR experiences in Virtual Reality (VR). First, potential AR consumers collect local scenes with a semantic understanding technique. ScalAR then synthesizes numerous similar scenes. In VR, a designer authors the AR contents' semantic associations and validates the design while immersed in the provided scenes. We adopt a decision-tree-based algorithm to fit the designer's demonstrations as a semantic adaptation model that deploys the authored AR experience in a physical scene. We further showcase two application scenarios authored with ScalAR and conduct a two-session user study, where the quantitative results demonstrate the accuracy of the AR content rendering and the qualitative results show the usability of ScalAR.
2022 · Xun Qian et al. · Purdue University · AR Navigation & Context Awareness · Mixed Reality Workspaces · CHI

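As a toy version of the demonstration-fitting step, the sketch below trains a scikit-learn decision tree on a few made-up scene-semantics features and placement labels; the features, labels, and library choice are assumptions for illustration, not ScalAR's actual model.

```python
# Toy decision-tree fit of designer demonstrations; features and labels are invented.
from sklearn.tree import DecisionTreeClassifier

# Each demonstration: [surface_is_table, surface_area_m2, distance_to_wall_m]
X = [
    [1, 0.8, 1.2],
    [1, 0.3, 0.4],
    [0, 2.0, 0.1],
    [0, 1.5, 0.2],
]
# Where the designer placed the authored AR content in that scene.
y = ["on_table", "on_table", "on_wall", "on_wall"]

model = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(model.predict([[1, 0.5, 0.9]]))  # -> ['on_table'] for a new table-like surface
```
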
GesturAR: An Authoring System for Creating Freehand Interactive Augmented Reality Applications
Freehand gestures are an essential input modality for modern Augmented Reality (AR) user experiences. However, developing AR applications with customized hand interactions remains a challenge for end-users. Therefore, we propose GesturAR, an end-to-end authoring tool that enables users to create in-situ freehand AR applications through embodied demonstration and visual programming. During authoring, users intuitively demonstrate customized gesture inputs while referring to the spatial and temporal context. Based on a taxonomy of gestures in AR, we propose a hand interaction model that maps gesture inputs to the reactions of the AR contents. Thus, users can author comprehensive freehand applications using trigger-action visual programming and instantly experience the results in AR. Further, we demonstrate multiple application scenarios enabled by GesturAR, such as interactive virtual objects, robots, and avatars, room-level interactive AR spaces, and embodied AR presentations. Finally, we evaluate the performance and usability of GesturAR through a user study.
2021 · Tianyi Wang et al. · Hand Gesture Recognition · AR Navigation & Context Awareness · UIST

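To sketch the trigger-action idea, the snippet below registers gesture-to-reaction mappings and dispatches detected gestures to them; the decorator API, gesture names, and reactions are illustrative assumptions, not GesturAR's authoring interface.

```python
# Assumed trigger-action registry mapping (gesture, target) pairs to reactions.
reactions = {}


def when(gesture: str, target: str):
    """Decorator that registers a reaction for a gesture on an AR object."""
    def register(fn):
        reactions[(gesture, target)] = fn
        return fn
    return register


@when("pinch", "virtual_dog")
def sit(target):
    print(f"{target} plays the 'sit' animation")


@when("wave", "virtual_dog")
def come(target):
    print(f"{target} walks toward the user")


def on_gesture(gesture: str, target: str):
    """Dispatch a recognized gesture to its authored reaction, if one exists."""
    handler = reactions.get((gesture, target))
    if handler:
        handler(target)


on_gesture("pinch", "virtual_dog")  # -> virtual_dog plays the 'sit' animation
```
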
ProcessAR: An Augmented Reality-Based Tool to Create In-Situ Procedural 2D/3D AR Instructions
Augmented reality (AR) is an efficient form of delivering spatial information and has great potential for training workers. However, AR is still not widely used in such scenarios because of the technical skills and expertise required to create interactive AR content. We developed ProcessAR, an AR-based system for developing 2D/3D content that captures subject matter experts' (SMEs') environment-object interactions in situ. The design space for ProcessAR was identified through formative interviews with AR programming experts and SMEs, alongside a comparative design study with SMEs and novice users. To enable smooth workflows, ProcessAR locates and identifies different tools/objects through computer vision within the workspace when the author looks at them. We explored additional features such as embedding 2D videos with detected objects and user-adaptive triggers. A final user evaluation comparing ProcessAR with a baseline AR authoring environment showed that, according to our qualitative questionnaire, users preferred ProcessAR.
2021 · Subramanian Chidambaram et al. · AR Navigation & Context Awareness · Context-Aware Computing · Prototyping & User Testing · DIS

AdapTutAR: An Adaptive Tutoring System for Machine Tasks using Augmented Reality
Modern manufacturing processes are in a state of flux as they adapt to increasing demand for flexible and self-configuring production. This poses challenges for training workers to rapidly master new machine operations and processes, i.e., machine tasks. Conventional in-person training is effective but requires the time and effort of experts for each worker trained, and therefore does not scale. Recorded tutorials, such as video-based or augmented reality (AR) tutorials, permit more efficient scaling. However, unlike in-person tutoring, existing recorded tutorials lack the ability to adapt to workers' diverse experiences and learning behaviors. We present AdapTutAR, an adaptive task-tutoring system that enables experts to record machine task tutorials via embodied demonstration and trains learners with AR tutoring content adapted to each user's characteristics. The adaptation is achieved by continually monitoring learners' tutorial-following status and adjusting the tutoring content on-the-fly and in-situ. The results of our user study demonstrate that the adaptive system is more effective and preferable than the non-adaptive one.
2021 · Gaoping Huang et al. · Purdue University · AR Navigation & Context Awareness · Intelligent Tutoring Systems & Learning Analytics · CHI

CAPturAR: An Augmented Reality Tool for Authoring Human-Involved Context-Aware Applications
Recognition of human behavior plays an important role in context-aware applications. However, it is still a challenge for end-users to build personalized applications that accurately recognize their own activities. Therefore, we present CAPturAR, an in-situ programming tool that enables users to rapidly author context-aware applications by referring to their previous activities. We customize an AR head-mounted device with multiple camera systems that allow for non-intrusive capture of the user's daily activities. During authoring, we reconstruct the captured data in AR with an animated avatar and use virtual icons to represent the surrounding environment. With our visual programming interface, users create human-centered rules for the applications and experience them instantly in AR. We further demonstrate four use cases enabled by CAPturAR. We also verify the effectiveness of the AR-HMD and the authoring workflow with a system evaluation using our prototype, and we conduct a remote user study in an AR simulator to evaluate usability.
2020 · Tianyi Wang et al. · Human Pose & Activity Recognition · AR Navigation & Context Awareness · Mixed Reality Workspaces · UIST

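A hedged sketch of one human-centered rule of this kind: check a log of recognized activities and trigger a reminder when an expected activity has not occurred recently. The activity names, rule, and helper functions are assumptions for this example only.

```python
# Illustrative context rule over recognized activities; names are placeholders.
from datetime import datetime, timedelta

activity_log = []  # list of (timestamp, activity_name) from the recognizer


def record(activity: str, when: datetime = None):
    activity_log.append((when or datetime.now(), activity))


def remind_if_missing(activity: str, within_hours: int = 24, now: datetime = None):
    """Print a reminder if the activity was not recognized within the window."""
    now = now or datetime.now()
    recent = {a for t, a in activity_log if now - t < timedelta(hours=within_hours)}
    if activity not in recent:
        print(f"Reminder: '{activity}' has not been detected in the last {within_hours} h.")


record("making coffee")
remind_if_missing("watering plants")  # -> prints the reminder
```
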
An Exploratory Study of Augmented Reality Presence for Tutoring Machine Tasks
2020 · Yuanzhi Cao et al. · Purdue University · Mixed Reality Workspaces · Teleoperation & Telepresence · CHI

Vipo: Spatial-Visual Programming with Functions for Robot-IoT Workflows
Mobile robots and IoT (Internet of Things) devices can increase productivity, but only if they can be programmed by workers who understand the domain. This is especially true in manufacturing. Visual programming in the spatial context of the operating environment can enable mental models at a familiar level of abstraction. However, spatial-visual programming is still in its infancy; existing systems lack IoT integration and fundamental constructs, such as functions, that are essential for code reuse, encapsulation, and recursive algorithms. We present Vipo, a spatial-visual programming system for robot-IoT workflows. Vipo was designed with input from managers at six factories using mobile robots. Our user study (n=22) evaluated the efficiency, correctness, and comprehensibility of spatial-visual programming with functions.
2020 · Gaoping Huang et al. · Purdue University · Ubiquitous Computing · Human-Robot Collaboration (HRC) · CHI

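The sketch below illustrates why functions matter in such workflows: one reusable delivery routine, parameterized by locations and an IoT device, is applied to two different routes. The robot/conveyor API and the stub class are hypothetical, written only to make the example self-contained.

```python
# Hypothetical robot-IoT workflow with a reusable function; the device API is
# a stand-in so the sketch runs without hardware.
def deliver(robot, pickup, dropoff, conveyor):
    """Reusable routine: fetch a part at `pickup` and deliver it to `dropoff`."""
    robot.move_to(pickup)
    conveyor.stop()          # IoT call: pause the conveyor while loading
    robot.load()
    conveyor.start()
    robot.move_to(dropoff)
    robot.unload()


class FakeDevice:
    """Prints each call instead of driving real hardware."""
    def __getattr__(self, name):
        return lambda *args: print(f"{name}{args}")


robot, conveyor = FakeDevice(), FakeDevice()
# The same function is reused for two routes on the factory floor map.
deliver(robot, pickup=(3, 5), dropoff=(10, 2), conveyor=conveyor)
deliver(robot, pickup=(7, 1), dropoff=(4, 9), conveyor=conveyor)
```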