Story-Driven: Exploring the Impact of Providing Real-time Context Information on Automated Storytelling
Jan Henry Belz et al. UIST 2024.
Stories have long captivated the human imagination with narratives that enrich our lives. Traditional storytelling methods are often static and not designed to adapt to the listener's environment, which is full of dynamic changes. For instance, people often listen to stories in the form of podcasts or audiobooks while traveling in a car, yet conventional in-car storytelling systems do not embrace the adaptive potential of this space. Generative AI makes it possible to create content that is not just personalized but also responsive to the changing parameters of the environment. We introduce a novel system for interactive, real-time story narration that leverages environment and user context, together with estimated arrival times, to continuously adjust the generated story. Through two comprehensive real-world studies with a total of 30 participants in a vehicle, we assess the user experience, level of immersion, and perception of the environment provided by the prototype. Participants' feedback shows a significant improvement over traditional storytelling and highlights the importance of context information for generative storytelling systems.
Tags: AR Navigation & Context Awareness; Generative AI (Text, Image, Music, Video); Interactive Narrative & Immersive Storytelling
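
The abstract does not specify the generation pipeline, but the described behavior (environment context plus estimated arrival time steering the narration) can be pictured with a minimal sketch; all names and the words-per-minute pacing heuristic below are assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a context-conditioned narration loop (not the
# authors' implementation): each cycle folds fresh vehicle context and the
# remaining ride time into the prompt for the next story segment.
from dataclasses import dataclass

@dataclass
class RideContext:
    location: str        # e.g., a reverse-geocoded landmark
    weather: str         # e.g., "light rain"
    eta_minutes: float   # estimated remaining ride time

def build_prompt(story_so_far: str, ctx: RideContext) -> str:
    # Assume roughly 150 spoken words per minute; pace the story to the ETA
    # so the plot can conclude as the ride ends.
    words_budget = int(ctx.eta_minutes * 150)
    return (
        f"Continue this story: {story_so_far}\n"
        f"Weave in the rider's surroundings: near {ctx.location}, "
        f"weather: {ctx.weather}.\n"
        f"Plan the remaining plot to conclude within ~{words_budget} words."
    )
```
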
Technical Design Space Analysis for Unobtrusive Driver Emotion Assessment Using Multi-Domain Context
David Bethge et al. UbiComp 2023. https://dl.acm.org/doi/10.1145/3569466
Driver emotions play a vital role in driving safety and performance. Consequently, regulating driver emotions through empathic interfaces has been investigated thoroughly. However, the prerequisite, driver emotion sensing, remains a challenging endeavor: body-worn physiological sensors are intrusive, while facial and speech recognition only capture overt emotions. In a user study (N=27), we investigate how emotions can be unobtrusively predicted by analyzing a rich set of contextual features captured by a smartphone, including road and traffic conditions, visual scene analysis, audio, weather information, and car speed. We derive a technical design space to inform practitioners and researchers about the most indicative sensing modalities, the corresponding impact on users' privacy, and the computational cost of processing this data. Our analysis shows that contextual emotion recognition is significantly more robust than facial recognition, yielding an overall improvement of 7% under leave-one-participant-out cross-validation.
Tags: Automated Driving Interface & Takeover Design; Privacy by Design & User Control; Context-Aware Computing
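
As a minimal sketch of the evaluation protocol the abstract names, leave-one-participant-out cross-validation can be expressed with scikit-learn's LeaveOneGroupOut; the feature matrix, labels, and random-forest classifier below are placeholder assumptions, not the paper's data or model.

```python
# Minimal sketch of leave-one-participant-out (LOPO) evaluation for a
# contextual emotion classifier. The concrete features (traffic, scene,
# audio, weather, speed) are stand-ins represented by dummy data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(270, 12))            # 12 contextual features (dummy)
y = rng.integers(0, 4, size=270)          # 4 emotion classes (dummy labels)
groups = np.repeat(np.arange(27), 10)     # 27 participants, 10 samples each

# Each fold holds out all samples of one participant, so the score reflects
# generalization to unseen drivers rather than unseen moments.
scores = cross_val_score(
    RandomForestClassifier(n_estimators=200, random_state=0),
    X, y, groups=groups, cv=LeaveOneGroupOut(),
)
print(f"LOPO accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```
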
HandyCast: Phone-based Bimanual Input for Virtual Reality in Mobile and Space-Constrained Settings via Pose-and-Touch Transfer
Mohamed Kari et al. (ETH Zürich, Porsche AG). CHI 2023.
Despite the potential of Virtual Reality as a general-purpose computing platform, current systems are tailored to stationary settings that support expansive interaction in mid-air. In mobile scenarios, however, the physical space surrounding the user may be prohibitively small for spatial VR interaction with classic controllers. In this paper, we present HandyCast, a smartphone-based input technique that enables full-range 3D input with two virtual hands in VR while requiring little physical space, allowing users to operate large virtual environments in mobile settings. HandyCast defines a pose-and-touch transfer function that fuses the phone's position and orientation with touch input to derive two individual 3D hand positions. Holding their phone like a gamepad, users can thus move and turn it to independently control their virtual hands. Touch input with the thumbs fine-tunes the respective virtual hand position and controls object selection. We evaluated HandyCast in three studies, comparing its performance with that of Go-Go, a classic bimanual controller technique. In our open-space study, participants required significantly less physical motion using HandyCast with no decrease in completion time or body ownership. In our space-constrained study, participants achieved significantly faster completion times, smaller interaction volumes, and shorter path lengths with HandyCast than with Go-Go. In our technical evaluation, HandyCast's fully standalone inside-out 6D tracking again incurred no decrease in completion time compared to an outside-in tracking baseline.
Tags: Full-Body Interaction & Embodied Input; Mixed Reality Workspaces
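
A pose-and-touch transfer function of the kind the abstract describes might look like the following sketch; the gains, axes, and offsets are made-up tuning parameters, not HandyCast's published function.

```python
# Illustrative sketch (not HandyCast's actual transfer function): fuse the
# phone's 6-DoF pose with per-thumb touch offsets to derive two virtual
# hand positions.
import numpy as np

def hand_positions(phone_pos, phone_rot, left_touch, right_touch,
                   reach_gain=2.5, touch_gain=0.4):
    """phone_pos: (3,) world position; phone_rot: (3,3) rotation matrix;
    left_touch/right_touch: (2,) thumb offsets on the screen in [-1, 1]."""
    # Cast both hands forward along the phone's view direction, scaled so
    # that small physical motion covers a large virtual range.
    forward = phone_rot @ np.array([0.0, 0.0, -1.0])
    right_axis = phone_rot @ np.array([1.0, 0.0, 0.0])
    up_axis = phone_rot @ np.array([0.0, 1.0, 0.0])
    base = phone_pos + reach_gain * forward

    def refine(touch, side):
        # Thumb input fine-tunes each hand within the phone's screen plane.
        return (base + side * 0.3 * right_axis
                + touch_gain * (touch[0] * right_axis + touch[1] * up_axis))

    return refine(left_touch, -1.0), refine(right_touch, +1.0)
```
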
HMInference: Inferring Multimodal HMI Interactions in Automotive Screens
Jannik Wolf et al. AutoUI 2021.
Driving requires high cognitive capabilities, as drivers need to be able to focus on the primary driving task. Each interaction with the User Interface (UI) system, however, presents a potential distraction. Designing UIs based on insights from field-collected user interaction logs, as well as real-time estimation of the most probable interaction modality, can contribute to engineering focus-supporting UIs. The question arises, however, of how user interactions can be predicted in in-the-wild driving scenarios. In this paper, we present HMInference, an automotive machine-learning framework that exploits user interaction log data. HMInference analyzes the interaction sequences of users based on UI domains (e.g., navigation, media, settings) and driving context (e.g., vehicle trajectory) to predict different interaction modalities (e.g., touch, speech). In 10-fold cross-validation, HMInference achieves a mean accuracy of 73.2% (SD: 0.02). Our work advances areas where user interaction prediction for in-car scenarios is required, e.g., to enable adaptive system designs.
Tags: Head-Up Display (HUD) & Advanced Driver Assistance Systems (ADAS); AI-Assisted Decision-Making & Automation
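
The modality-prediction setup can be illustrated with a hypothetical 10-fold cross-validation over encoded UI-domain and driving-context features; the features, labels, and logistic-regression model below are stand-ins, not HMInference's actual pipeline.

```python
# Hypothetical sketch of modality prediction from interaction logs: encode
# the current UI domain and a driving-context signal, then 10-fold
# cross-validate a touch-vs-speech classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
domain = rng.integers(0, 3, size=500)     # 0=navigation, 1=media, 2=settings
speed = rng.uniform(0, 130, size=500)     # vehicle speed in km/h (dummy)
X = np.column_stack([domain == 0, domain == 1, domain == 2, speed])
y = rng.integers(0, 2, size=500)          # 0=touch, 1=speech (dummy labels)

clf = make_pipeline(StandardScaler(), LogisticRegression())
scores = cross_val_score(clf, X, y, cv=10)
print(f"10-fold accuracy: {scores.mean():.3f} (SD: {scores.std():.3f})")
```
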
VEmotion: Using Driving Context for Indirect Emotion Prediction in Real-Time
David Bethge et al. UIST 2021.
Detecting emotions while driving remains a challenge in Human-Computer Interaction. Current methods estimate the driver's experienced emotions using physiological sensing (e.g., skin conductance, electroencephalography), speech, or facial expressions. However, these require drivers to wear sensing devices, perform explicit voice interaction, or exhibit robust facial expressiveness. We present VEmotion (Virtual Emotion Sensor), a novel method to predict driver emotions in an unobtrusive way using contextual smartphone data. VEmotion analyzes information including traffic dynamics, environmental factors, in-vehicle context, and road characteristics to implicitly classify driver emotions. We demonstrate the applicability in a real-world driving study (N=12) evaluating the emotion prediction performance. Our results show that VEmotion outperforms facial expressions by 29% in a person-dependent classification and by 8.5% in a person-independent classification. We discuss how VEmotion enables empathic car interfaces to sense the driver's emotions and provide in-situ interface adaptations on the go.
Tags: Head-Up Display (HUD) & Advanced Driver Assistance Systems (ADAS); Motion Sickness & Passenger Experience
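
One pooled-data reading of the two reported protocols, person-dependent versus person-independent classification, is sketched below with scikit-learn; the data and random-forest model are dummies, and VEmotion's actual features and split procedure may differ.

```python
# Sketch contrasting the two evaluation protocols named in the abstract:
# person-dependent (each driver may appear in train and test folds) versus
# person-independent (each test fold holds out whole drivers).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, StratifiedKFold, cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(120, 8))             # 8 contextual features (dummy)
y = rng.integers(0, 3, size=120)          # 3 emotion classes (dummy labels)
drivers = np.repeat(np.arange(12), 10)    # N=12 drivers, 10 samples each

clf = RandomForestClassifier(random_state=0)
dep = cross_val_score(
    clf, X, y, cv=StratifiedKFold(5, shuffle=True, random_state=0))
indep = cross_val_score(clf, X, y, groups=drivers, cv=GroupKFold(5))
print(f"person-dependent:   {dep.mean():.2f}")
print(f"person-independent: {indep.mean():.2f}")
```
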
SoundsRide: Affordance-Synchronized Music Mixing for In-Car Audio Augmented Reality
Mohamed Kari et al. UIST 2021.
Music is a central instrument in video gaming to attune a player's attention to the current atmosphere and increase their immersion in the game. We transfer the idea of scene-adaptive music to car drives and propose SoundsRide, an in-car audio augmented reality system that mixes music in real-time synchronized with sound affordances along the ride. After exploring the design space of affordance-synchronized music, we design SoundsRide to temporally and spatially align high-contrast events on the route, e.g., highway entrances or tunnel exits, with high-contrast events in music, e.g., song transitions or beat drops, for any recorded and annotated GPS trajectory by a three-step procedure. In real-time, SoundsRide 1) estimates temporal distances to events on the route, 2) fuses these novel estimates with previous estimates in a cost-aware music-mixing plan, and 3) if necessary, re-computes an updated mix to be propagated to the audio output. To minimize user-noticeable updates to the mix, SoundsRide fuses new distance information with a filtering procedure that chooses the best updating strategy given the last music-mixing plan, the novel distance estimations, and the system parameterization. We technically evaluate SoundsRide and conduct a user evaluation with 8 participants to gain insights into how users perceive SoundsRide in terms of mixing, affordances, and synchronicity. We find that SoundsRide can create captivating music experiences and positively as well as negatively influence subjectively perceived driving safety, depending on the mix and user.
Tags: In-Vehicle Haptic, Audio & Multimodal Feedback; Game UX & Player Behavior
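
The three-step loop from the abstract (estimate temporal distances, fuse them with previous estimates, re-compute the mix only when necessary) can be caricatured in a few lines; the exponential smoothing and tolerance threshold below are simplified stand-ins for the cost-aware filtering procedure, not the published algorithm.

```python
# Illustrative sketch of SoundsRide's three-step update loop as described
# in the abstract; smoothing and tolerance are made-up parameters.
def update_mix(prev_eta, measured_eta, plan, smoothing=0.5, tolerance=1.5):
    """prev_eta/measured_eta: seconds until the next route event (e.g., a
    tunnel exit); plan: current music-mixing plan keyed by event time."""
    # 1) Fuse the fresh distance estimate with the previous one so that
    #    GPS/ETA jitter does not constantly perturb the mix.
    fused_eta = smoothing * prev_eta + (1.0 - smoothing) * measured_eta
    # 2) Compare the fused estimate against the current plan.
    # 3) Re-compute the mix only when the drift is large enough that the
    #    musical event (e.g., a beat drop) would miss the route event.
    if abs(plan["event_eta"] - fused_eta) > tolerance:
        plan = {"event_eta": fused_eta}   # re-anchor the transition
    return fused_eta, plan
```
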
SmartObjects: Sixth Workshop on Interacting with Smart Objects
Florian Müller et al. (TU Darmstadt). CHI 2018.
The emergence of smart objects has the potential to radically change the way we interact with technology. Through embedded means for input and output, such objects allow for more natural and immediate interaction. The SmartObjects workshop will focus on how such embedded intelligence in objects situated in the user's physical environment can be used to provide more efficient and enjoyable interactions. We discuss the design from both a technology and a user-experience perspective.
Tags: Context-Aware Computing; Ubiquitous Computing