The Promises and Perils of using LLMs for Effective Public Services
Governments are the primary providers of essential public services and are responsible for delivering them effectively. In high-stakes decision-making domains such as child welfare (CW), agencies must protect children without unnecessarily prolonging a family’s engagement with the system. With growing optimism around AI, governments are pushing for its integration, but concerns regarding feasibility and harms remain. Through collaborations with a large Canadian CW agency, we examined how LocalLLM and BERTopic models can track CW case progress. We demonstrate how the tools can potentially assist workers in opportunistically addressing gaps in their work by signaling case progress/deviations. And yet, we also show how they fail to detect case trajectories that require discretionary judgments grounded in social work training, areas where practitioners would actually want support to pre-emptively address substantive case concerns. We also provide a roadmap of future participatory directions to co-design language tools for/with the public sector.
2026 | Erina Seh-Young Moon et al. | University of Toronto | Human-LLM Collaboration; AI-Assisted Decision-Making & Automation; Participatory Design | CHI
The Datafication of Care in Public Homelessness Services
Homelessness systems in North America adopt coordinated data-driven approaches to efficiently match support services to clients based on their assessed needs and available resources. AI tools are increasingly being implemented to allocate resources, reduce costs, and predict risks in this space. In this study, we conducted an ethnographic case study of the data practices of the City of Toronto’s homelessness system across different critical points. We show how the City’s data practices offer standardized processes for client care, but frontline workers also engage in heuristic decision-making to navigate uncertainties, client resistance to sharing information, and resource constraints. From these findings, we show how the temporality of client data constrains the validity of predictive AI models. Additionally, we highlight how the City adopts an iterative and holistic client assessment approach, which contrasts with commonly used risk assessment tools in homelessness, providing future directions to design holistic decision-making tools for homelessness.
2025 | Erina Seh-Young Moon et al. | University of Toronto | Empowerment of Marginalized Groups; User Research Methods (Interviews, Surveys, Observation) | CHI
Towards a Non-Ideal Methodological Framework for Responsible ML
Though ML practitioners increasingly employ various Responsible ML (RML) strategies, their methodological approach in practice is still unclear. In particular, the constraints, assumptions, and choices of practitioners with technical duties (such as developers, engineers, and data scientists) are often implicit, subtle, and under-scrutinized in HCI and related fields. We interviewed 22 technically oriented ML practitioners across seven domains to understand the characteristics of their methodological approaches to RML through the lens of ideal and non-ideal theorizing of fairness. We find that practitioners’ methodological approaches fall along a spectrum of idealization. While they structured their approaches through ideal theorizing, such as by abstracting the RML workflow from the inquiry into the applicability of ML, they did not systematically document or pay deliberate attention to their non-ideal approaches, such as diagnosing imperfect conditions. We end our paper with a discussion of a new methodological approach, inspired by elements of non-ideal theory, to structure technical practitioners’ RML process and facilitate collaboration with other stakeholders.
2024 | Ramaravind Kommiya Mothilal et al. | University of Toronto | AI Ethics, Fairness & Accountability; Algorithmic Fairness & Bias | CHI
"This is not a data problem": Algorithms and Power in Public Higher Education in CanadaAlgorithmic decision-making is increasingly being adopted across public higher education. The expansion of data-driven practices by post-secondary institutions has occurred in parallel with the adoption of New Public Management approaches by neoliberal administrations. In this study, we conduct a qualitative analysis of an in-depth ethnographic case study of data and algorithms in use at a public college in Ontario, Canada. We identify the data, algorithms, and outcomes in use at the college. We assess how the college's processes and relationships support those outcomes and the different stakeholders' perceptions of the college's data-driven systems. In addition, we find that the growing reliance on algorithmic decisions leads to increased student surveillance, exacerbation of existing inequities, and the automation of the faculty-student relationship. Finally, we identify a cycle of increased institutional power perpetuated by algorithmic decision-making, and driven by a push towards financial sustainability.2024KMKelly McConvey et al.University of TorontoAI Ethics, Fairness & AccountabilityAlgorithmic Transparency & AuditabilityResearch Ethics & Open ScienceCHI
Are We Asking the Right Questions?: Designing for Community Stakeholders’ Interactions with AI in Policing
Research into recidivism risk prediction in the criminal justice system has garnered significant attention from HCI, critical algorithm studies, and the emerging field of human-AI decision-making. This study focuses on algorithmic crime mapping, a prevalent yet underexplored form of algorithmic decision support (ADS) in this context. We conducted experiments and follow-up interviews with 60 participants, including community members, technical experts, and law enforcement agents (LEAs), to explore how lived experiences, technical knowledge, and domain expertise shape interactions with the ADS, impacting human-AI decision-making. Surprisingly, we found that domain experts (LEAs) often exhibited anchoring bias, readily accepting and engaging with the first crime map presented to them. Conversely, community members and technical experts were more inclined to engage with the tool, adjust controls, and generate different maps. Our findings highlight that all three stakeholder groups were able to provide critical feedback regarding AI design and use: community members questioned the core motivation of the tool, technical experts drew attention to the elastic nature of data science practice, and LEAs suggested redesign pathways such that the tool could complement their domain expertise.
2024 | Md Romael Haque et al. | Marquette University | AI-Assisted Decision-Making & Automation; AI Ethics, Fairness & Accountability; Content Moderation & Platform Governance | CHI
A Human-Centered Review of Algorithms in Homelessness Research
Homelessness is a humanitarian challenge affecting an estimated 1.6 billion people worldwide. In the face of rising homeless populations in developed nations and a strain on social services, government agencies are increasingly adopting data-driven models to determine one’s risk of experiencing homelessness and to assign scarce resources to those in need. We conducted a systematic literature review of 57 papers to understand the evolution of these decision-making algorithms. We investigated trends in computational methods, predictor variables, and target outcomes used to develop the models using a human-centered lens and found that only 9 papers (15.7%) investigated model fairness and bias. We uncovered tensions between explainability and ecological validity wherein predictive risk models (53.4%) unduly focused on reductive explainability while resource allocation models (25.9%) were dependent on unrealistic assumptions and simulated data that are not useful in practice. Further, we discuss research challenges and opportunities for developing human-centered algorithms in this area.
2024 | Erina Seh-Young Moon et al. | University of Toronto | AI Ethics, Fairness & Accountability; Algorithmic Fairness & Bias | CHI
Charting the COVID Long Haul Experience - A Longitudinal Exploration of Symptoms, Activity, and Clinical Adherence
COVID Long Haul (CLH) is an emerging chronic illness with varied patient experiences. Our understanding of CLH is often limited to data from electronic health records (EHRs), such as diagnoses or problem lists, which do not capture the volatility and severity of symptoms or their impact. To better understand the unique presentation of CLH, we conducted a 3-month-long cohort study with 14 CLH patients, collecting objective (EHR, daily Fitbit logs) and subjective (weekly surveys, interviews) data. Our findings reveal a complex presentation of symptoms, associated uncertainty, and the ensuing impact CLH has on patients' personal and professional lives. We identify patient needs, practices, and challenges around adhering to clinical recommendations, engaging with health data, and establishing "new normals" post COVID. We reflect on the potential found at the intersection of these various data streams and the persuasive heuristics possible when designing for this new population and their specific needs.
2024 | Jessica Pater et al. | Parkview Health | Chronic Disease Self-Management (Diabetes, Hypertension, etc.) | CHI
The "Colonial Impulse" of Natural Language Processing: An Audit of Bengali Sentiment Analysis Tools and Their Identity-based Biases
While colonization has sociohistorically impacted people's identities across various dimensions, those colonial values and biases continue to be perpetuated by sociotechnical systems. One category of sociotechnical systems, sentiment analysis tools, can also perpetuate colonial values and bias, yet less attention has been paid to how such tools may be complicit in perpetuating coloniality, although they are often used to guide various practices (e.g., content moderation). In this paper, we explore potential bias in sentiment analysis tools in the context of Bengali communities who have experienced and continue to experience the impacts of colonialism. Drawing on identity categories most impacted by colonialism amongst local Bengali communities, we focused our analytic attention on gender, religion, and nationality. We conducted an algorithmic audit of all sentiment analysis tools for Bengali available on the Python Package Index (PyPI) and GitHub. Despite similar semantic content and structure across inputs, our analyses showed that, in addition to inconsistencies in output from different tools, Bengali sentiment analysis tools exhibit bias between different identity categories and respond differently to different ways of identity expression. Connecting our findings with colonially shaped sociocultural structures of Bengali communities, we discuss the implications of downstream bias of sentiment analysis tools.
2024 | Dipto Das et al. | University of Colorado Boulder | AI Ethics, Fairness & Accountability; Algorithmic Fairness & Bias | CHI
Social Media is not a Health Proxy: Differences Between Social Media and Electronic Health Record Reports of Post-COVID Symptoms
The COVID-19 pandemic transformed many aspects of health and daily life. A subset of people who were infected with the virus have ongoing chronic health issues that range in type of symptom and severity. In this study, we conducted a qualitative assessment of self-reported post-COVID symptoms from patients’ electronic health records (EHR, n=564) and a randomized collection of Reddit and Twitter posts (n=500 for each). We show the inconsistencies in what types of symptoms are shared between platforms, assess the severity of the symptoms, and show how social media characterizations of post-COVID do not tell a complete story of this phenomenon. This research contributes to CSCW health literature by connecting digital traces of post-COVID with EHR data, critiquing the use of social media as a health proxy, and pointing to its potential to add context to the analysis of traditional health data extracted from the EHR.
2023 | Jessica Pater et al. | Health Information | CSCW
Rethinking "Risk" in Algorithmic Systems Through A Computational Narrative Analysis of Casenotes in Child Welfare
Risk assessment algorithms are being adopted by public sector agencies to make high-stakes decisions about human lives. Algorithms model “risk” based on individual client characteristics to identify clients most in need. However, this understanding of risk is primarily based on easily quantifiable risk factors that present an incomplete and biased perspective of clients. We conducted a computational narrative analysis of child-welfare casenotes and draw attention to deeper systemic risk factors that are hard to quantify but directly impact families and street-level decision-making. We found that beyond individual risk factors, the system itself poses a significant amount of risk where parents are over-surveilled by caseworkers and lack agency in decision-making processes. We also problematize the notion of risk as a static construct by highlighting the temporality and mediating effects of different risk, protective, systemic, and procedural factors. Finally, we draw caution against using casenotes in NLP-based systems by unpacking their limitations and biases embedded within them.
2023 | Devansh Saxena et al. | Marquette University | AI Ethics, Fairness & Accountability; Algorithmic Fairness & Bias; Empowerment of Marginalized Groups | CHI
A Human-Centered Review of Algorithms for Decision-Making in Higher Education
The use of algorithms for decision-making in higher education is steadily growing, promising cost-savings to institutions and personalized service for students but also raising ethical challenges around surveillance, fairness, and interpretation of data. To address the lack of systematic understanding of how these algorithms are currently designed, we reviewed an extensive corpus of papers proposing algorithms for decision-making in higher education. We categorized them based on input data, computational method, and target outcome, and then investigated the interrelations of these factors with the application of human-centered lenses: theoretical, participatory, or speculative design. We found that the models are trending towards deep learning, and increased use of student personal data and protected attributes, with the target scope expanding towards automated decisions. However, despite the associated decrease in interpretability and explainability, current development predominantly fails to incorporate human-centered lenses. We discuss the challenges with these trends and advocate for a human-centered approach.
2023 | Kelly McConvey et al. | University of Toronto | AI-Assisted Decision-Making & Automation; AI Ethics, Fairness & Accountability; Algorithmic Transparency & Auditability | CHI
Uncovering Adverse Childhood Experiences (ACEs) from Clinical Narratives within the Electronic Health Record
Adverse Childhood Experiences (ACEs) are potentially traumatic events that occur in childhood (e.g., sexual abuse and maternal violence). Clinical research highlights the significant impact ACEs have on youth’s mental health, similar to other youth-related issues like traditional bullying and cyberbullying. However, research focused on the intersection of these two is limited. We report the results from a qualitative study that used electronic health record (EHR) data and clinical narratives from Parkview Behavioral Health hospital (n=719) to better understand the presentation of ACEs in patients who indicated cyber/bullying contributed to their inpatient hospital admission. Our deductive thematic analyses of the clinical narratives/notes and diagnoses highlight the connection of ACEs with cyber/bullying and other clinical diagnoses like depression, anxiety, PTSD, and ADD/ADHD. Additionally, our results point to potential impacts of the gender spectrum and other non-ACE indicators like adoption and the need for Department of Child Services (DCS). The outcome of this study provides distinct computational and clinical design guidelines for better collaborative decision making in healthcare, including the need for ACEs screening as standard-of-care within acute mental health settings. CAUTION: This paper includes graphic content about adverse childhood traumas and events.
2022 | Fayika Farhat Nova et al. | Health Technologies | CSCW
Unpacking Invisible Work Practices, Constraints, and Latent Power Relationships in Child Welfare through Casenote Analysis
Caseworkers are trained to write detailed narratives about families in Child-Welfare (CW), which inform collaborative high-stakes decision-making. Unlike other administrative data, these narratives offer a more credible source of information with respect to workers’ interactions with families and underscore the role of systemic factors in decision-making. SIGCHI researchers have emphasized the need to understand human discretion at the street-level to be able to design human-centered algorithms for the public sector. In this study, we conducted computational text analysis of casenotes at a child-welfare agency in the midwestern United States and highlight patterns of invisible street-level discretionary work and latent power structures that have direct implications for algorithm design. Casenotes offer a unique lens for policymakers and CW leadership towards understanding the experiences of on-the-ground caseworkers. As a result of this study, we highlight how street-level discretionary work needs to be supported by sociotechnical systems developed through worker-centered design. This study offers the first computational inspection of casenotes and introduces them to the SIGCHI community as a critical data source for studying sociotechnical systems.
2022 | Devansh Saxena et al. | Marquette University | Empowerment of Marginalized Groups; Participatory Design; User Research Methods (Interviews, Surveys, Observation) | CHI