
Responsible AI — Fairness Metrics, Bias Types & Mitigation

Quantifying and reducing discrimination in AI systems — beyond gut feel to mathematical guarantees.


Definition

Responsible AI operationalises ethical principles into measurable requirements and engineering practices. Fairness metrics mathematically define what 'fair' means for a specific context — demographic parity, equalised odds, individual fairness. Each captures different notions of fairness and they are mathematically incompatible (fairness impossibility theorem). Data bias analysis identifies where discrimination enters the pipeline. Mitigation techniques apply pre-processing, in-processing, or post-processing corrections. Responsible AI frameworks (Google, Microsoft, IBM, Anthropic) translate these into engineering guidelines.

Fairness metrics — what does fair mean mathematically?
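The three metric families named in the definition can be stated formally. One standard formulation, with Ŷ the model's prediction, Y the true label, and A the sensitive attribute (the individual-fairness Lipschitz form follows Dwork et al. 2012):

```latex
% Demographic parity: predictions independent of the sensitive attribute
P(\hat{Y} = 1 \mid A = a) = P(\hat{Y} = 1 \mid A = b)

% Equalised odds: equal TPR and FPR across groups
P(\hat{Y} = 1 \mid Y = y, A = a) = P(\hat{Y} = 1 \mid Y = y, A = b),
\quad y \in \{0, 1\}

% Individual fairness: similar individuals receive similar predictions
D\big(f(x), f(x')\big) \le d(x, x')
```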

Computing AI fairness metrics with Fairlearn

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

try:
    from fairlearn.metrics import (demographic_parity_difference,
        equalized_odds_difference, MetricFrame)
    from fairlearn.postprocessing import ThresholdOptimizer
    from fairlearn.reductions import ExponentiatedGradient, DemographicParity
    FAIRLEARN = True
except ImportError:
    FAIRLEARN = False

# ── Simulate biased lending dataset ──
np.random.seed(42)
n = 2000
# Sensitive attribute: group A = majority, group B = minority
sensitive = np.random.choice(['A', 'B'], n, p=[0.7, 0.3])

# Income correlated with group (reflects historical inequality)
income  = np.where(sensitive == 'A',
                   np.random.normal(60, 15, n),
                   np.random.normal(48, 18, n))
credit  = np.where(sensitive == 'A',
                   np.random.normal(680, 60, n),
                   np.random.normal(640, 70, n))
debt    = np.random.normal(0.35, 0.1, n)

X = np.column_stack([income, credit, debt])
# True creditworthiness (unbiased ground truth; the score averages around
# -3 for group A and -8 for group B, so a threshold of -5 splits the
# population roughly in half)
y_true = (income * 0.3 + credit * 0.02 - debt * 100 > -5).astype(int)

X_train, X_test, y_train, y_test, s_train, s_test = train_test_split(
    X, y_true, sensitive, test_size=0.3, random_state=42)

# Train model (will learn historical disparities)
clf = LogisticRegression(random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# ── Manual fairness metrics ──
mask_A = s_test == 'A'
mask_B = s_test == 'B'

def approval_rate(predictions, mask):
    return predictions[mask].mean()

def tpr(predictions, labels, mask):  # True Positive Rate (recall)
    tp = ((predictions == 1) & (labels == 1) & mask).sum()
    fn = ((predictions == 0) & (labels == 1) & mask).sum()
    return tp / (tp + fn) if (tp + fn) > 0 else 0

def fpr(predictions, labels, mask):  # False Positive Rate
    fp = ((predictions == 1) & (labels == 0) & mask).sum()
    tn = ((predictions == 0) & (labels == 0) & mask).sum()
    return fp / (fp + tn) if (fp + tn) > 0 else 0

print("Fairness Metrics:")
print(f"{'Metric':<35} {'Group A':>10} {'Group B':>10} {'Difference':>12}")
print("-" * 70)

apr_A = approval_rate(y_pred, mask_A)
apr_B = approval_rate(y_pred, mask_B)
print(f"{'Approval rate (Demographic Parity)':<35} {apr_A:>10.3f} {apr_B:>10.3f} {apr_A-apr_B:>+12.3f}")

tpr_A = tpr(y_pred, y_test, mask_A)
tpr_B = tpr(y_pred, y_test, mask_B)
print(f"{'True Positive Rate (Equalised TPR)':<35} {tpr_A:>10.3f} {tpr_B:>10.3f} {tpr_A-tpr_B:>+12.3f}")

fpr_A = fpr(y_pred, y_test, mask_A)
fpr_B = fpr(y_pred, y_test, mask_B)
print(f"{'False Positive Rate':<35} {fpr_A:>10.3f} {fpr_B:>10.3f} {fpr_A-fpr_B:>+12.3f}")

# ── Fairlearn for automated fairness analysis ──
if FAIRLEARN:
    dpd = demographic_parity_difference(y_test, y_pred, sensitive_features=s_test)
    eod = equalized_odds_difference(y_test, y_pred, sensitive_features=s_test)
    print(f"\nFairlearn: DPD = {dpd:.3f}, EOD = {eod:.3f}")
    # DPD = 0 means perfect demographic parity; |DPD| > 0.1 is a common
    # rule-of-thumb threshold for concern

    # ── Post-processing mitigation: threshold adjustment ──
    # Adjust classification threshold per group to equalise fairness metric
    postprocess = ThresholdOptimizer(
        estimator=clf,
        constraints="equalized_odds",   # Equalise TPR and FPR across groups
        predict_method="predict_proba",
        objective="balanced_accuracy_score"
    )
    postprocess.fit(X_train, y_train, sensitive_features=s_train)
    y_pred_fair = postprocess.predict(X_test, sensitive_features=s_test)

    dpd_after = demographic_parity_difference(y_test, y_pred_fair, sensitive_features=s_test)
    print(f"After post-processing: DPD = {dpd_after:.3f} (was {dpd:.3f})")
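The example above imports ExponentiatedGradient and DemographicParity but only demonstrates post-processing. A minimal sketch of in-processing mitigation on a fresh toy dataset, assuming fairlearn is installed (eps, the allowed parity slack, is a choice made here for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

try:
    from fairlearn.reductions import ExponentiatedGradient, DemographicParity
    from fairlearn.metrics import demographic_parity_difference
    HAVE_FAIRLEARN = True
except ImportError:
    HAVE_FAIRLEARN = False

# Toy data: the single feature is shifted between groups
rng = np.random.default_rng(0)
n = 1000
group = rng.choice(["A", "B"], n, p=[0.7, 0.3])
x = np.where(group == "A", rng.normal(1.0, 1.0, n), rng.normal(0.0, 1.0, n))
X = x.reshape(-1, 1)
y = (x + rng.normal(0.0, 0.5, n) > 0.5).astype(int)

if HAVE_FAIRLEARN:
    # In-processing: the reduction reweights training examples each round
    # so the resulting (randomised) classifier respects the constraint
    mitigator = ExponentiatedGradient(
        LogisticRegression(),
        constraints=DemographicParity(),
        eps=0.02)                       # allowed demographic-parity slack
    mitigator.fit(X, y, sensitive_features=group)
    y_hat = mitigator.predict(X)
    dpd = demographic_parity_difference(y, y_hat, sensitive_features=group)
    print(f"DPD after in-processing mitigation: {dpd:.3f}")
```

Unlike post-processing, this approach does not need the sensitive attribute at prediction time, which matters where using protected characteristics as decision inputs is legally restricted.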

Types of bias and where they enter the pipeline

| Bias type | Where it enters | Example | Mitigation |
|---|---|---|---|
| Historical bias | Training data reflects past discrimination | Hiring data: 80% male candidates historically hired | Reweighting, causal analysis, new data collection |
| Representation bias | Some groups under-represented in training data | Facial recognition trained on 90% lighter skin tones | Data augmentation, diverse data collection |
| Measurement bias | Proxies used instead of true target variable | Using arrest rate (proxy) instead of crime rate (true target) | Careful feature selection, domain expert review |
| Aggregation bias | One model for all groups when they differ | Single medical model for diverse demographic groups | Group-specific models or features |
| Evaluation bias | Benchmark does not represent all groups | Image benchmark with no dark-skinned faces | Disaggregated evaluation metrics |
| Deployment bias | System used differently than intended | Hiring AI used for promotion decisions it was not designed for | Use-case scoping, deployment monitoring |
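The reweighting mitigation listed for historical bias can be sketched with a small helper (hypothetical here, following the Kamiran & Calders reweighing scheme: weight each (group, label) cell so group membership becomes statistically independent of the label):

```python
import numpy as np

def reweighing_weights(y, group):
    """Kamiran & Calders-style reweighing.
    w(g, c) = P(group=g) * P(y=c) / P(group=g, y=c)
    Up-weights cells that are rarer than independence predicts."""
    y, group = np.asarray(y), np.asarray(group)
    w = np.empty(len(y), dtype=float)
    for g in np.unique(group):
        for c in np.unique(y):
            cell = (group == g) & (y == c)
            p_expected = (group == g).mean() * (y == c).mean()
            p_observed = cell.mean()
            w[cell] = p_expected / p_observed if p_observed > 0 else 0.0
    return w

# Toy example: group B is rarely labelled positive in the training data
y     = np.array([1, 1, 1, 0, 1, 0, 0, 0])
group = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])
weights = reweighing_weights(y, group)

# Weighted positive rates are now equal across groups (0.5 each)
for g in ["A", "B"]:
    m = group == g
    print(g, round(np.average(y[m], weights=weights[m]), 3))
```

The resulting weights can be passed to any estimator that accepts `sample_weight`, making this a model-agnostic pre-processing step.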

The fairness impossibility theorem

Chouldechova (2017) and Kleinberg, Mullainathan & Raghavan (2016): when base rates differ between groups, demographic parity, equalised odds, and calibration are pairwise incompatible for any non-trivial classifier. You cannot be fair in all senses at once when underlying rates differ. Every fairness-aware AI system therefore makes a value choice about which type of fairness to prioritise, a technical decision with ethical and legal consequences. This choice should be made explicitly and transparently, not by accident.
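The incompatibility can be seen directly from Chouldechova's identity, which holds for any classifier with a given positive predictive value, false negative rate, and prevalence: if PPV and FNR are held equal across groups, differing base rates force differing false positive rates.

```python
def fpr_from(base_rate, ppv, fnr):
    # Chouldechova (2017): FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR)
    # for prevalence p, positive predictive value PPV, false negative rate FNR
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)

# Hold PPV and FNR equal across groups; only the base rate differs:
fpr_a = fpr_from(0.3, 0.8, 0.2)   # group A, base rate 30%
fpr_b = fpr_from(0.5, 0.8, 0.2)   # group B, base rate 50%
print(f"FPR A = {fpr_a:.3f}, FPR B = {fpr_b:.3f}")  # unequal by necessity
```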

Practice questions

  1. A loan model has approval rates of 72% for Group A and 48% for Group B. Is this unfair? (Answer: Demographic parity difference = 72% - 48% = 24 percentage points. This is likely unfair, but context matters. If the groups have genuinely different creditworthiness distributions (different income, employment stability), some difference may be "fair" by equalised odds. The key question: does the model add discrimination BEYOND what the legitimate features already encode?)
  2. What is the difference between demographic parity and equalised odds? (Answer: Demographic parity: equal approval rates across groups regardless of actual qualification. Equalised odds: equal true positive rates AND false positive rates across groups — qualified individuals in both groups are equally likely to be approved, and unqualified individuals in both groups are equally likely to be rejected. Equalised odds is generally considered more fair as it conditions on actual qualifications.)
  3. What is historical bias in AI and why is it self-perpetuating? (Answer: Historical bias: training data reflects past discriminatory decisions. A model trained on historical hiring data learns that women and minorities were hired less — it perpetuates this. Self-perpetuating: biased AI makes biased decisions → those decisions generate new training data → next model is trained on biased outcomes → more biased decisions. Without intervention, AI amplifies historical injustice rather than correcting it.)
  4. Post-processing mitigation adjusts thresholds per demographic group. What is the risk? (Answer: Post-processing typically uses group membership as input to make different decisions for different groups. While this can equalise outcomes, it may violate anti-discrimination laws that prohibit using protected characteristics as inputs to hiring/lending decisions — even to correct bias. There is a legal and ethical tension between using group membership to mitigate bias vs using group membership as a decision factor.)
  5. A medical AI achieves 95% accuracy overall but 82% for Black patients. The overall metric hides this disparity. What evaluation practice should be standard? (Answer: Disaggregated evaluation: report metrics separately for each demographic subgroup (race, gender, age, socioeconomic status, geography). Aggregate accuracy can mask severe disparities. Best practice: define minimum acceptable performance for all subgroups as a deployment requirement, not just average performance. Model cards should include disaggregated metrics as standard.)
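The disaggregated evaluation from question 5 can be sketched in a few lines (`disaggregated_accuracy` is a hypothetical helper and the data is illustrative, not from any real system):

```python
import numpy as np

def disaggregated_accuracy(y_true, y_pred, group):
    """Accuracy per subgroup, plus the worst-group figure that should
    gate deployment rather than the aggregate alone."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    per_group = {g: float((y_pred[group == g] == y_true[group == g]).mean())
                 for g in np.unique(group)}
    return per_group, min(per_group.values())

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 1, 1, 0, 1, 0])
group  = np.array(["White"] * 4 + ["Black"] * 4)

per_group, worst = disaggregated_accuracy(y_true, y_pred, group)
print(per_group)          # aggregate accuracy (0.625) hides the gap
print("worst group:", worst)
```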

On LumiChats

Anthropic publishes model cards for Claude documenting known limitations and demographic performance disparities. Understanding fairness metrics helps you critically evaluate these disclosures and hold AI companies accountable for the disparate impacts of their systems.

