Story
Position: Beyond Sensitive Attributes, ML Fairness Should Quantify Structural Injustice via Social Determinants
Key takeaway
Researchers propose that AI fairness should address systemic social inequalities, not just discrimination based on sensitive attributes. This reframing could lead to more effective approaches to mitigating unfairness in real-world AI systems.
Quick Explainer
The core idea is that machine learning fairness should move beyond considering only sensitive attributes like race and gender, and instead explicitly model and quantify structural injustice via social determinants. Social determinants are aspects of the surrounding context, such as social practices and environments, that influence individuals' attributes and outcomes. Modeling them makes it possible to capture unfairness that persists even when individual-level discrimination has been addressed. The key is developing fairness metrics, causal models, and data frameworks that treat social determinants as explicit intervention targets rather than relying on sensitive attributes alone. This offers a more comprehensive view of structural injustice than prevailing technical approaches.
Deep Dive
Problem and Context
This deep dive examines a position paper arguing that machine learning (ML) fairness should move beyond focusing solely on sensitive attributes like race and gender and instead explicitly model and quantify structural injustice via social determinants. The key points are:
- Algorithmic fairness research has largely framed unfairness as discrimination along sensitive attributes, but this approach limits visibility into unfairness as structural injustice instantiated through social determinants.
- Social determinants are variables representing aspects of the context (e.g., social practices, structures, environments) that directly influence individuals' attributes and outcomes, but are not intrinsic to any specific individual.
- Auditing unfairness as structural injustice via social determinants is essential, because such injustice can persist even after individual-level discrimination has been addressed (the sketch after this list illustrates this persistence).
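The following is a minimal sketch of that persistence (our construction with illustrative parameters, not taken from the paper): a toy generative model in which a social determinant, regional resource level, shapes individual qualifications, so an attribute-blind decision rule still produces large disparities across regions and, through the region-attribute correlation, across groups.

```python
# Toy model: a social determinant (regional resource level) drives outcomes,
# so disparities persist even when the decision rule never sees the
# sensitive attribute. All parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Social determinant: resource level of an individual's region (0 = low, 1 = high).
region = rng.binomial(1, 0.5, n)
# Sensitive attribute, correlated with region but never used by the rule below.
group = rng.binomial(1, np.where(region == 1, 0.7, 0.3))
# Individual qualification reflects regional resources, not group membership.
skill = rng.normal(loc=region, scale=1.0)

# Attribute-blind decision rule: no individual-level discrimination.
admit = skill > 0.5

for r in (0, 1):
    print(f"region={r}: admit rate = {admit[region == r].mean():.2f}")
for g in (0, 1):
    print(f"group={g}:  admit rate = {admit[group == g].mean():.2f}")
# Admit rates still differ sharply across regions, and the region-group
# correlation carries the gap over to the sensitive attribute as well.
```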
Methodology
The paper makes the following key arguments:
- Conceptual Gap: It identifies the gap between cross-disciplinary treatments of social determinants for structural injustice, and their limited engagement in ML fairness research.
- Paradigm Limits: It shows that prevailing sensitive-attribute-centered technical paradigms are not well suited to quantifying structural injustice via social determinants.
- Unintended Injustice: It theoretically demonstrates that mitigation strategies focused on sensitive attributes can introduce additional structural injustice (a toy simulation of this mechanism follows the list).
- Empirical Evidence: It provides empirical evidence for the necessity and value of social-determinant-based analysis and intervention.
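A toy simulation of the unintended-injustice mechanism (our illustration with made-up parameters, not the paper's formal demonstration): a quota balanced on the sensitive attribute alone, applied to scores that reflect regional resources, leaves applicants from disadvantaged regions largely shut out, with non-URM applicants there faring worst because they compete in the larger, more advantaged pool.

```python
# Quota balanced on the sensitive attribute, blind to the social determinant.
# All parameters are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n, seats = 100_000, 10_000

region = rng.binomial(1, 0.5, n)                         # 0 = disadvantaged, 1 = advantaged
urm = rng.binomial(1, np.where(region == 1, 0.2, 0.5))   # URM share higher in disadvantaged regions
score = rng.normal(loc=region, scale=1.0)                # scores reflect regional resources

# Fill half the seats from each sensitive-attribute group, top scores first.
admit = np.zeros(n, dtype=bool)
for g in (0, 1):
    idx = np.flatnonzero(urm == g)
    admit[idx[np.argsort(score[idx])[::-1][: seats // 2]]] = True

for g in (0, 1):
    for r in (0, 1):
        m = (urm == g) & (region == r)
        print(f"urm={g} region={r}: admit rate = {admit[m].mean():.3f}")
# Within each quota group, admits come mostly from advantaged regions;
# non-URM applicants from disadvantaged regions fare worst of all.
```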
Data and Experimental Setup
The paper leverages several datasets:
- U.S. Census data at the Public Use Microdata Area (PUMA) level: Used to analyze demographics, socioeconomic status, and occupational structures across geographic regions.
- De-identified patient records from an integrated healthcare system: Used to study differences in breast cancer screening uptake across regions with varying socioeconomic conditions.
- Semi-synthetic simulations: Used to model the impact of policy interventions targeting social determinants on breast cancer early-detection outcomes (a minimal sketch of this setup follows the list).
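A minimal sketch of what such a simulation can look like (hypothetical parameters; the paper's epidemiological model is more involved): the intervention target is a social determinant, screening access in disadvantaged regions, and the measured quantity is the early-detection rate among onset cases.

```python
# Semi-synthetic style simulation: intervene on screening access in
# disadvantaged regions and compare early-detection rates.
# All parameters are illustrative, not the paper's.
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

region = rng.binomial(1, 0.5, n)               # 0 = disadvantaged, 1 = advantaged
onset = rng.binomial(1, 0.01, n).astype(bool)  # simplified uniform cancer onset

def early_detection_rate(access_low, access_high):
    """Early detection requires onset plus a completed screen; screening
    probability is set by regional access, the intervention target."""
    p_screen = np.where(region == 1, access_high, access_low)
    screened = rng.binomial(1, p_screen).astype(bool)
    return (onset & screened).sum() / onset.sum()

baseline = early_detection_rate(access_low=0.45, access_high=0.75)
policy = early_detection_rate(access_low=0.65, access_high=0.75)  # raise access where it is low
print(f"early detection among onset cases: {baseline:.3f} -> {policy:.3f}")
```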
Results
The key findings include:
- Existing practices and benchmarks in ML fairness often omit variables related to social determinants, limiting the ability to capture structural injustice.
- Sensitive attributes alone cannot fully capture the shifting influences from contextual environments on individuals' opportunities and outcomes.
- Sensitive-attribute-centered mitigation strategies such as quota-based admissions can introduce new forms of structural injustice, particularly harming non-underrepresented-minority (non-URM) applicants from disadvantaged regions.
- Empirical analyses demonstrate significant heterogeneity in outcomes even among individuals sharing the same sensitive attributes, highlighting the importance of social determinants (a stratified-audit sketch follows this list).
- Semi-synthetic simulations show that policy interventions targeting social determinants (e.g., improving screening access in disadvantaged regions) can yield non-trivial gains in early cancer detection.
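The heterogeneity finding suggests a simple stratified audit. The sketch below uses placeholder column names and toy data, not the paper's schema: outcome rates are computed per (sensitive attribute, social determinant) cell, and the within-group spread across cells is the signal an attribute-only audit would average away.

```python
# Stratified audit: outcome rates per (sensitive attribute, social
# determinant) cell. Column names and data are placeholders.
import pandas as pd

df = pd.DataFrame({
    "group":      ["a", "a", "a", "a", "b", "b", "b", "b"],
    "region_ses": ["low", "low", "high", "high"] * 2,
    "screened":   [0, 1, 1, 1, 0, 0, 1, 1],
})

rates = df.groupby(["group", "region_ses"])["screened"].mean().unstack()
print(rates)
# Within-group spread across social-determinant strata; large values flag
# heterogeneity hidden by an audit that conditions on "group" alone.
print(rates.max(axis=1) - rates.min(axis=1))
```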
Interpretation
The paper argues that addressing structural injustice in ML fairness requires a coordinated effort on three fronts:
- Data Governance: Establishing privacy-aware frameworks to leverage social determinant data while protecting sensitive identifiers.
- Metric Design: Developing dynamic, context-adaptable fairness metrics that track disparities across evolving structural conditions.
- Underlying Processes: Constructing causal models that treat social determinants as explicit intervention targets, beyond just sensitive attributes (a minimal sketch of such an intervention follows the list).
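On the third front, here is a minimal structural-causal-model sketch (our illustration, assuming a simple graph in which social determinant S and sensitive attribute A both feed an individual feature X that drives outcome Y): the social determinant is an explicit intervention target, and its effect is estimated by simulating do(S = s).

```python
# Social determinant S as an explicit intervention target in a toy SCM.
# Graph and parameters are illustrative assumptions, not the paper's model.
import numpy as np

rng = np.random.default_rng(3)

def sample_outcomes(n, do_s=None):
    """Sample outcomes from the SCM; do_s overrides S's natural mechanism."""
    s = rng.binomial(1, 0.5, n) if do_s is None else np.full(n, do_s)
    a = rng.binomial(1, 0.5, n)                # sensitive attribute
    x = s + 0.2 * a + rng.normal(0, 1, n)      # feature shaped by both
    return (x + rng.normal(0, 0.5, n)) > 0.8   # outcome

n = 500_000
effect = sample_outcomes(n, do_s=1).mean() - sample_outcomes(n, do_s=0).mean()
print(f"E[Y | do(S=1)] - E[Y | do(S=0)] = {effect:.3f}")
```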
Limitations and Uncertainties
The paper acknowledges several limitations:
- The theoretical analyses rely on simplifying assumptions about the data-generating process, which may not fully capture the complexities of real-world settings.
- The semi-synthetic simulations abstract from the full epidemiological processes underlying breast cancer onset and screening, which may involve additional factors beyond social determinants.
- The empirical analyses are limited to the specific datasets and domains examined, and may not generalize to other contexts.
Future Directions
The paper concludes by outlining several key directions for future work:
- Developing causal effect estimands that jointly model sensitive attributes and social determinants (one possible form is sketched after this list).
- Exploring hybrid inference methods that combine causal reasoning and relational learning to better capture the interactions between individuals and their surrounding contexts.
- Expanding the empirical evaluation of social-determinant-based fairness approaches across a wider range of domains and datasets.
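For the first direction, one plausible shape such an estimand could take (our notation, not the paper's) conditions the effect of intervening on a social determinant S on membership in sensitive-attribute group A = a, so that both variables enter the quantity jointly:

```latex
% Effect of shifting the social determinant S from s to s',
% evaluated within sensitive-attribute group A = a (illustrative notation).
\tau(a; s, s') = \mathbb{E}\left[ Y \mid \mathrm{do}(S = s'),\, A = a \right]
               - \mathbb{E}\left[ Y \mid \mathrm{do}(S = s),\, A = a \right]
```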
