Articulation Work and Tinkering for Fairness in Machine Learning
The field of fair AI aims to counter biased algorithms through computational modelling. However, it faces increasing criticism for perpetuating the use of overly technical and reductionist methods. As a result, novel approaches appear in the field to address more socially-oriented and interdisciplinary (SOI) perspectives on fair AI. In this paper, we take this dynamic as the starting point to study the tension between computer science (CS) and SOI research. By drawing on STS and CSCW theory, we position fair AI research as a matter of 'organizational alignment': what makes research 'doable' is the successful alignment of three levels of work organization (the social world, the laboratory and the experiment). Based on qualitative interviews with CS researchers, we analyze the tasks, resources, and actors required for doable research in the case of fair AI. We find that CS researchers engage with SOI to some extent, but organizational conditions, articulation work, and ambiguities of the social world constrain the doability of SOI research. Based on our findings, we identify and discuss problems for aligning CS and SOI as fair AI continues to evolve.
2024 | Miriam Fahimi et al. | Session 1g: Contextualizing Fairness in AI | CSCW
Shock Me The Way: Directional Electrotactile Feedback under the Smartwatch as a Navigation Aid for Cyclists
Cycling navigation is a complex and stressful task as the cyclist needs to focus simultaneously on the navigation, the road, and other road users. We propose directional electrotactile feedback at the wrist to reduce the auditory and visual load during navigation-aided cycling. We designed a custom electrotactile grid with 9 electrodes that is clipped under a smartwatch. In a preliminary study we identified suitable calibration settings and gained first insights about a suitable electrode layout. In a subsequent laboratory study we showed that a direction can be encoded with a mean error of 19.28° (σ = 42.77°) by combining 2 adjacent electrodes. Additionally, by interpolating with 3 electrodes a direction can be conveyed with a similar mean error of 22.54° (σ = 43.57°). We evaluated our concept of directional electrotactile feedback for cyclists in an outdoor study, in which 98.8% of all junctions were taken correctly by eight study participants. Only one participant deviated substantially from the optimal path, but was successfully navigated back to the original route by our system.
2024 | Tim Duente et al. | Vibrotactile Feedback & Skin Stimulation | Foot & Wrist Interaction | MobileHCI
WorkFit: Designing Proactive Voice Assistance for the Health and Well-Being of Knowledge Workers
Prior research has designed and evaluated Voice Assistance (VA) for different settings such as the home, school, and public spaces. Office environments have been relatively understudied, leaving a gap in understanding the essential factors for designing a VA specifically for work settings. In this study, we developed the 'WorkFit' VA specific for the office environment, focusing on the health and well-being of knowledge workers. WorkFit was designed to monitor knowledge workers for sedentary behavior, inconsistent hydration, and stress, and to deliver proactive voice interventions followed by a health recommendation to mitigate those issues. We evaluated WorkFit in a field study with 15 knowledge workers for 5 working days. In the study, we determined challenges and opportunities for voice interactions in work settings. We identified contextual factors for identifying inopportune moments for voice interactions in an office setting. We found that 92% of knowledge workers accepted WorkFit's hydration interventions while 79% of them engaged in walking breaks. Moreover, breathing exercises recommended by WorkFit significantly stabilized the heart rate of knowledge workers during stress. Based on our findings, we propose five design recommendations for the development of VA customized to office settings.
2024 | Shashank Ahire et al. | Voice User Interface (VUI) Design | Mental Health Apps & Online Support Communities | Workplace Wellbeing & Work Stress | CUI
A Comparative Long-Term Study of Fallback Authentication Schemes
Fallback authentication, the process of re-establishing access to an account when the primary authenticator is unavailable, holds critical significance. Approaches range from secondary channels like email and SMS to personal knowledge questions (PKQs) and social authentication. A key difference to primary authentication is that the duration between enrollment and authentication can be much longer, typically months or years. However, few systems have been studied over extended timeframes, making it difficult to know how well these systems truly help users recover their accounts. We also lack meaningful comparisons of schemes as most prior work examined two mechanisms at most. We report the results of a long-term user study of the usability of fallback authentication over 18 months to provide a fair comparison of the four most commonly used fallback authentication methods. We show that users prefer email and SMS-based methods, while mechanisms based on PKQs and trustees lag regarding successful resets and convenience.
2024 | Leona Lassak et al. | Ruhr University Bochum | Passwords & Authentication | Privacy Perception & Decision-Making | CHI
Analyzing Security and Privacy Advice During the 2022 Russian Invasion of Ukraine on Twitter
The Russian Invasion of Ukraine in 2022 resulted in a rapidly changing cyber threat environment globally and incentivized the sharing of security and privacy advice on social media. Previous research found a strong impact of online security advice on end-user behavior. Twitter is an important platform for sharing information in crises. We examined 306 tweets with security and privacy advice related to the Ukrainian war, and created a taxonomy of 224 unique pieces of advice in seven categories, targeted at individuals or organizations in Ukraine and elsewhere. While our findings include untargeted and generic advice known from previous research, we identify novel advice specific to the invasion, offers for individual consultation, and misinformation on security and privacy advice as a new threat. Our findings highlight the strengths and shortcomings of the security and privacy advice given online during the invasion and establish areas for improvements and future research.
2024 | Juliane Schmüser et al. | CISPA | Privacy by Design & User Control | Privacy Perception & Decision-Making | Online Harassment & Counter-Tools | CHI
Understanding Users' Interaction with Login Notifications
Login notifications intend to inform users about sign-ins and help them protect their accounts from unauthorized access. Notifications are usually sent if a login deviates from previous ones, potentially indicating malicious activity. They contain information like the location, date, time, and device used to sign in. Users are challenged to verify whether they recognize the login (because it was them or someone they know) or to protect their account from unwanted access. In a user study, we explore users' comprehension, reactions, and expectations of login notifications. We utilize two treatments to measure users' behavior in response to notifications sent for a login they initiated or based on a malicious actor relying on statistical sign-in information. We find that users identify legitimate logins but need more support to halt malicious sign-ins. We discuss the identified problems and give recommendations for service providers to ensure usable and secure logins for everyone.
2024 | Philipp Markert et al. | Ruhr University Bochum | Privacy by Design & User Control | Passwords & Authentication | CHI
Can You Ear Me? A Comparison of Different Private and Public Notification Channels for the Earlobe
The earlobe is a well-known location for wearing jewelry, but might also be promising for electronic output, such as presenting notifications. This work elaborates the pros and cons of different notification channels for the earlobe. Notifications on the earlobe can be private (only noticeable by the wearer) as well as public (noticeable in the immediate vicinity in a given social situation). A user study with 18 participants showed that the reaction times for the private channels (Poke, Vibration, Private Sound, Electrotactile) were on average less than 1 s with an error rate (missed notifications) of less than 1%. Thermal Warm and Cold took significantly longer and Cold was least reliable (26% error rate). The participants preferred Electrotactile and Vibration. Among the public channels the recognition time did not differ significantly between Sound (738 ms) and LED (828 ms), but Display took much longer (3175 ms). At 22% the error rate of Display was highest. The participants generally felt comfortable wearing notification devices on their earlobe. The results show that the earlobe indeed is a suitable location for wearable technology, if properly miniaturized, which is possible for Electrotactile and LED. We present application scenarios and discuss design considerations. A small field study in a fitness center demonstrates the suitability of the earlobe notification concept in a sports context.
https://doi.org/10.1145/3610925 | 2023 | Dennis Stanke et al. | In-Vehicle Haptic, Audio & Multimodal Feedback | Vibrotactile Feedback & Skin Stimulation | Haptic Wearables | UbiComp
Privacy-Enhancing Technology and Everyday Augmented Reality: Understanding Bystanders’ Varying Needs for Awareness and Consent
Fundamental to Augmented Reality (AR) headsets is their capacity to visually and aurally sense the world around them, necessary to drive the positional tracking that makes rendering 3D spatial content possible. This requisite sensing also opens the door for more advanced AR-driven activities, such as augmented perception, volumetric capture and biometric identification - activities with the potential to expose bystanders to significant privacy risks. Existing Privacy-Enhancing Technologies (PETs) often safeguard against these risks at a low level, e.g., instituting camera access controls. However, we argue that such PETs are incompatible with the need for always-on sensing given AR headsets' intended everyday use. Through an online survey (N=102), we examine bystanders' awareness of, and concerns regarding, potentially privacy infringing AR activities; the extent to which bystanders' consent should be sought; and the level of granularity of information necessary to provide awareness of AR activities to bystanders. Our findings suggest that PETs should take into account the AR activity type, and relationship to bystanders, selectively facilitating awareness and consent. In this way, we can ensure bystanders feel their privacy is respected by everyday AR headsets, and avoid unnecessary rejection of these powerful devices by society.
https://doi.org/10.1145/3569501 | 2023 | Joseph O’Hagan et al. | Privacy by Design & User Control | Smart Home Privacy & Security | UbiComp
52 Weeks Later: Attitudes Towards COVID-19 Apps for Different Purposes Over Time
The COVID-19 pandemic has prompted countries around the world to introduce smartphone apps to support disease control efforts. Their purposes range from digital contact tracing to quarantine enforcement to vaccination passports, and their effectiveness often depends on widespread adoption. While previous work has identified factors that promote or hinder adoption, it has typically examined data collected at a single point in time or focused exclusively on digital contact tracing apps. In this work, we conduct the first representative study that examines changes in people’s attitudes towards COVID-19-related smartphone apps for five different purposes over the first 1.5 years of the pandemic. In three survey rounds conducted between Summer 2020 and Summer 2021 in the United States and Germany, with approximately 1,000 participants per round and country, we investigate people’s willingness to use such apps, their perceived utility, and people’s attitudes towards them in different stages of the pandemic. Our results indicate that privacy is a consistent concern for participants, even in a public health crisis, and the collection of identity-related data significantly decreases acceptance of COVID-19 apps. Trust in authorities is essential to increase confidence in government-backed apps and foster citizens’ willingness to contribute to crisis management. There is a need for continuous communication with app users to emphasize the benefits of health crisis apps both for individuals and society, thus counteracting decreasing willingness to use them and perceived usefulness as the pandemic evolves.
2023 | Marvin Kowalewski et al. | COVID-19 + CSCW | CSCW
A World Full of Privacy and Security (Mis)conceptions? Findings of a Representative Survey in 12 Countries
Misconceptions about digital security and privacy topics in the general public frequently lead to insecure behavior. However, little is known about the prevalence and extent of such misconceptions in a global context. In this work, we present the results of the first large-scale survey of a global population on misconceptions: We conducted an online survey with n = 12,351 participants in 12 countries on four continents. By investigating influencing factors of misconceptions around eight common security and privacy topics (including E2EE, Wi-Fi, VPN, and malware), we find the country of residence to be the strongest estimate for holding misconceptions. We also identify differences between non-Western and Western countries, demonstrating the need for region-specific research on user security knowledge, perceptions, and behavior. While we did not observe many outright misconceptions, we did identify a lack of understanding and uncertainty about several fundamental privacy and security topics.
2023 | Franziska Herbert et al. | Ruhr University Bochum | Privacy by Design & User Control | Privacy Perception & Decision-Making | Cybersecurity Training & Awareness | CHI
Home Is Where the Smart Is: Development and Validation of the Cybersecurity Self-Efficacy in Smart Homes (CySESH) Scale
The ubiquity of devices connected to the internet raises concerns about the security and privacy of smart homes. The effectiveness of interventions to support secure user behaviors is limited by a lack of validated instruments to measure relevant psychological constructs, such as self-efficacy - the belief that one is able to perform certain behaviors. We developed and validated the Cybersecurity Self-Efficacy in Smart Homes (CySESH) scale, a 12-item unidimensional measure of domain-specific self-efficacy beliefs, across five studies (N=1247). Three pilot studies generated and refined an item pool. We report evidence from one initial and one major, preregistered validation study for (1) excellent reliability (α = 0.90), (2) convergent validity with self-efficacy in information security (r_SEIS = 0.64, p<.001), and (3) discriminant validity with outcome expectations (r_OE = 0.26, p<.001), self-esteem (r_RSE = 0.17, p<.001), and optimism (r_LOT-R = 0.18, p<.001). We discuss CySESH's potential to advance future HCI research on cybersecurity, practitioner user assessments, and implications for consumer protection policy.
2023 | Nele Borgert et al. | Ruhr University Bochum | Privacy by Design & User Control | Smart Home Privacy & Security | CHI
Improving Worker Engagement Through Conversational Microtask Crowdsourcing
The rise in popularity of conversational agents has enabled humans to interact with machines more naturally. Recent work has shown that crowd workers in microtask marketplaces can complete a variety of human intelligence tasks (HITs) using conversational interfaces with similar output quality compared to the traditional Web interfaces. In this paper, we investigate the effectiveness of using conversational interfaces to improve worker engagement in microtask crowdsourcing. We designed a text-based conversational agent that assists workers in task execution, and tested the performance of workers when interacting with agents having different conversational styles. We conducted a rigorous experimental study on Amazon Mechanical Turk with 800 unique workers, to explore whether the output quality, worker engagement and the perceived cognitive load of workers can be affected by the conversational agent and its conversational styles. Our results show that conversational interfaces can be effective in engaging workers, and a suitable conversational style has potential to improve worker engagement.
2020 | Sihang Qiu et al. | Delft University of Technology | Conversational Chatbots | Crowdsourcing Task Design & Quality Control | CHI
Listen to Developers! A Participatory Design Study on Security Warnings for Cryptographic APIs
The positive effect of security information communicated to developers through API warnings has been established. However, current prototypical designs are based on security warnings for end-users. To improve security feedback for developers, we conducted a participatory design study with 25 professional software developers in focus groups. We identify which security information is considered helpful in avoiding insecure cryptographic API use during development. Concerning console messages, participants suggested five core elements, namely message classification, title message, code location, link to detailed external resources, and color. Design guidelines for end-user warnings are only partially suitable in this context. Participants emphasized the importance of tailoring the detail and content of security information to the context. Console warnings call for concise communication; further information needs to be linked externally. Therefore, security feedback should transcend tools and should be adjustable by software developers across development tools, considering the work context and developer needs.
2020 | Peter Leo Gorski et al. | TH Köln/University of Applied Sciences | Dark Patterns Recognition | CHI
Vibrotactile Funneling Illusion and Localization Performance on the Head
The vibrotactile funneling illusion is the sensation of a single (non-existing) stimulus somewhere in-between the actual stimulus locations. Its occurrence depends upon body location, distance between the actuators, signal synchronization, and intensity. Related work has shown that the funneling illusion may occur on the forehead. We were able to reproduce these findings and explored five further regions to get a more complete picture of the occurrence of the funneling illusion on the head. The results of our study (24 participants) show that the actuator distance, for which the funneling illusion occurs, strongly depends upon the head region. Moreover, we evaluated the centralizing bias (smaller perceived than actual actuator distances) for different head regions, which also showed widely varying characteristics. We computed a detailed heat map of vibrotactile localization accuracies on the head. The results inform the design of future tactile head-mounted displays that aim to support the funneling illusion.
2020 | Oliver Beren Kaul et al. | Leibniz University Hannover | Vibrotactile Feedback & Skin Stimulation | CHI
Understanding and Mitigating Worker Biases in the Crowdsourced Collection of Subjective Judgments
Crowdsourced data acquired from tasks that comprise a subjective component (e.g. opinion detection, sentiment analysis) is potentially affected by the inherent bias of crowd workers who contribute to the tasks. This can lead to biased and noisy ground-truth data, propagating the undesirable bias and noise when used in turn to train machine learning models or evaluate systems. In this work, we aim to understand the influence of workers' own opinions on their performance in the subjective task of bias detection. We analyze the influence of workers' opinions on their annotations corresponding to different topics. Our findings reveal that workers with strong opinions tend to produce biased annotations. We show that such bias can be mitigated to improve the overall quality of the data collected. Experienced crowd workers also fail to distance themselves from their own opinions to provide unbiased annotations.
2019 | Christoph Hube et al. | Leibniz Universität Hannover | Crowdsourcing Task Design & Quality Control | Algorithmic Fairness & Bias | CHI
Dissonance Between Human and Machine Understanding
Complex machine learning models are deployed in several critical domains including healthcare and autonomous vehicles nowadays, albeit as functional blackboxes. Consequently, there has been a recent surge in interpreting decisions of such complex models in order to explain their actions to humans. Models which correspond to human interpretation of a task are more desirable in certain contexts and can help attribute liability, build trust, expose biases and in turn build better models. It is therefore crucial to understand how and which models conform to human understanding of tasks. In this paper we present a large-scale crowdsourcing study that reveals and quantifies the dissonance between human and machine understanding, through the lens of an image classification task. In particular, we seek to answer the following questions: Which (well performing) complex ML models are closer to humans in their use of features to make accurate predictions? How does task difficulty affect the feature selection capability of machines in comparison to humans? Are humans consistently better at selecting features that make image recognition more accurate? Our findings have important implications on human-machine collaboration, considering that a long term goal in the field of artificial intelligence is to make machines capable of learning and reasoning like humans.
2019 | Haimo Zhang et al. | Human–machine configurations | CSCW
Using Worker Self-Assessments for Competence-based Pre-Selection in Crowdsourcing Microtasks
Paid crowdsourcing platforms have evolved into remarkable marketplaces where requesters can tap into human intelligence to serve a multitude of purposes, and the workforce can benefit through monetary returns for investing their efforts. In this work, we focus on individual crowd worker competencies. By drawing from self-assessment theories in psychology, we show that crowd workers often lack awareness about their true level of competence. Due to this, although workers intend to maintain a high reputation, they tend to participate in tasks that are beyond their competence. We reveal the diversity of individual worker competencies, and make a case for competence-based pre-selection in crowdsourcing marketplaces. We show the implications of flawed self-assessments on real-world microtasks, and propose a novel worker pre-selection method that considers accuracy of worker self-assessments. We evaluated our method in a sentiment analysis task and observed an improvement in the accuracy by over 15%, when compared to traditional performance-based worker pre-selection. Similarly, our proposed method resulted in an improvement in accuracy of nearly 6% in an image validation task. Our results show that requesters in crowdsourcing platforms can benefit by considering worker self-assessments in addition to their performance for pre-selection.
2018 | Ujwal Gadiraju et al. | Leibniz Universität Hannover | Crowdsourcing Task Design & Quality Control | CHI