Machine Learning

Measuring and Mitigating Algorithmic Bias

Bias in ML

Human Bias

Data Bias


Historical bias

Representation/Reporting bias

Measurement bias

Overcoming data bias

Model bias

Model fit

Biased Loss Function

Overcoming model bias

Evaluation/Deployment Bias

Evaluation bias

Deployment bias

Overcoming evaluation/deployment bias

Machine Learning Pipeline


Measurement

Learning

Action

Feedback

Demographic Disparity/Sample Size Disparity

Measuring Fairness

Sensitive Attributes

Fairness through unawareness

Fairness Criteria

Positive predictive value/precision

\[PPV = P = \frac{TP}{TP+FP}\]

True positive rate/recall

\[TPR = R = \frac{TP}{TP+FN}\]

False negative rate

\[FNR = \frac{FN}{TP+FN} = 1-TPR\]

Accuracy

\[Acc = \frac{TP+TN}{TP+TN+FP+FN}\]
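The four rates above can be computed directly from confusion-matrix counts; a minimal sketch (the function name and the example counts are illustrative):

```python
def confusion_rates(tp, fp, tn, fn):
    """Return PPV (precision), TPR (recall), FNR and accuracy."""
    ppv = tp / (tp + fp)                    # positive predictive value
    tpr = tp / (tp + fn)                    # true positive rate / recall
    fnr = fn / (tp + fn)                    # false negative rate = 1 - TPR
    acc = (tp + tn) / (tp + tn + fp + fn)   # overall accuracy
    return ppv, tpr, fnr, acc

ppv, tpr, fnr, acc = confusion_rates(tp=40, fp=10, tn=45, fn=5)
```

Note that TPR and FNR always sum to 1, matching the identity above.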

Example problem

Criterion 1: Group Fairness/Demographic Parity

\[P(\hat Y = 1 | A = m) = P(\hat Y = 1 | A = h)\]
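Demographic parity can be checked empirically by comparing per-group positive-prediction rates; a hedged sketch (function and variable names are illustrative):

```python
def positive_rate(y_pred, groups, a):
    """Empirical P(Y_hat = 1 | A = a)."""
    preds = [p for p, g in zip(y_pred, groups) if g == a]
    return sum(preds) / len(preds)

y_pred = [1, 0, 1, 1, 0, 1]
groups = ['m', 'm', 'm', 'h', 'h', 'h']
# demographic parity holds if the two group rates are (approximately) equal
rate_m = positive_rate(y_pred, groups, 'm')
rate_h = positive_rate(y_pred, groups, 'h')
```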

Criterion 2: Predictive Parity

\[P(Y=1|\hat Y=1,A=m) = P(Y=1|\hat Y = 1, A=h)\]
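Predictive parity amounts to comparing the precision (PPV) within each group; a hedged sketch with illustrative names:

```python
def group_ppv(y_true, y_pred, groups, a):
    """Empirical P(Y = 1 | Y_hat = 1, A = a): precision within group a."""
    flagged = [t for t, p, g in zip(y_true, y_pred, groups) if g == a and p == 1]
    return sum(flagged) / len(flagged)

y_true = [1, 0, 1, 1]
y_pred = [1, 1, 1, 1]
groups = ['m', 'm', 'h', 'h']
ppv_m = group_ppv(y_true, y_pred, groups, 'm')  # only half of m's flags are correct
ppv_h = group_ppv(y_true, y_pred, groups, 'h')  # all of h's flags are correct
```

In this toy data predictive parity is violated, since the two precisions differ.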

Criterion 3: Equal Opportunity

\[P(\hat Y=0|Y=1,A=m) = P(\hat Y = 0|Y=1,A=h)\] or equivalently, in terms of true positive rates: \[P(\hat Y=1|Y=1,A=m) = P(\hat Y = 1|Y=1,A=h)\]
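Equal opportunity compares per-group true positive rates (equivalently, false negative rates); a hedged sketch with illustrative names:

```python
def group_tpr(y_true, y_pred, groups, a):
    """Empirical P(Y_hat = 1 | Y = 1, A = a): recall within group a."""
    positives = [p for t, p, g in zip(y_true, y_pred, groups) if g == a and t == 1]
    return sum(positives) / len(positives)

y_true = [1, 1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 1, 1, 0]
groups = ['m', 'm', 'm', 'h', 'h', 'h']
tpr_m = group_tpr(y_true, y_pred, groups, 'm')  # half of m's true positives found
tpr_h = group_tpr(y_true, y_pred, groups, 'h')  # all of h's true positives found
```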

Criterion 4: Individual Fairness

\[P(\hat Y_i = 1| A_i, X_i) \approx P(\hat Y_j=1|A_j, X_j) \quad\text{if}\quad \mathrm{sim}(X_i,X_j) < \theta\]
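Individual fairness can be audited by flagging pairs of similar instances with very different predicted probabilities. In this hedged sketch, sim is taken to be a distance (smaller means more similar), matching the condition above; the function name and the tolerance eps are illustrative:

```python
def fairness_violations(X, probs, theta, eps):
    """Return index pairs (i, j) of similar instances with dissimilar predictions."""
    viols = []
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            # Euclidean distance as the (dis)similarity measure
            dist = sum((a - b) ** 2 for a, b in zip(X[i], X[j])) ** 0.5
            if dist < theta and abs(probs[i] - probs[j]) > eps:
                viols.append((i, j))
    return viols

X = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
probs = [0.9, 0.2, 0.8]   # instances 0 and 1 are similar but scored very differently
viols = fairness_violations(X, probs, theta=1.0, eps=0.1)
```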

Other criteria

Fairness Evaluation

GAP measures

\[GAP_{avg} = \frac{1}{G}\sum_{g=1}^G |\phi_g - \phi|\]
\[GAP_{max} = \max_{g\in\{1,\dots,G\}}|\phi_g-\phi|\]
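Both GAP measures follow directly from per-group scores \(\phi_g\) and the overall score \(\phi\); a minimal sketch (names are illustrative):

```python
def gap_measures(phi_by_group, phi):
    """GAP_avg and GAP_max of per-group scores phi_g against the overall score phi."""
    gaps = [abs(pg - phi) for pg in phi_by_group]
    return sum(gaps) / len(gaps), max(gaps)

# e.g. per-group accuracies 0.8, 0.6, 0.7 against overall accuracy 0.7
gap_avg, gap_max = gap_measures([0.8, 0.6, 0.7], phi=0.7)
```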

Creating Fairer Classifiers

Pre-processing

balance the data set

reweight data instances

\[P_{exp}(A=a, Y=y) = P(A=a)P(Y=y) = \frac{count(A=a)}{|D|}\frac{count(Y=y)}{|D|}\]
\[P_{obs}(A=a,Y=y) = \frac{count(A=a,Y=y)}{|D|}\]
\[W(X_i=\{x_i,a_i,y_i\}) = \frac{P_{exp}(A=a_i,Y=y_i)}{P_{obs}(A=a_i,Y=y_i)}\]
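The reweighting scheme above upweights instances that are rarer than independence of \(A\) and \(Y\) would predict, and downweights over-represented ones; a hedged sketch (function name and toy data are illustrative):

```python
from collections import Counter

def reweight(data):
    """Instance weights P_exp / P_obs for data of (x, a, y) triples."""
    n = len(data)
    count_a = Counter(a for _, a, _ in data)
    count_y = Counter(y for _, _, y in data)
    count_ay = Counter((a, y) for _, a, y in data)
    weights = []
    for _, a, y in data:
        p_exp = (count_a[a] / n) * (count_y[y] / n)  # under independence of A and Y
        p_obs = count_ay[(a, y)] / n                 # as actually observed
        weights.append(p_exp / p_obs)
    return weights

# skewed toy data: group 'm' mostly labelled 1, group 'h' mostly labelled 0
data = [(None, 'm', 1), (None, 'm', 1), (None, 'm', 0),
        (None, 'h', 0), (None, 'h', 0), (None, 'h', 1)]
weights = reweight(data)
```

The over-represented combinations (m, 1) and (h, 0) get weight below 1, the under-represented ones weight above 1.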

Model training/optimisation

add constraints to optimisation function

adversarial training


Post-processing

modify classifier predictions
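One common post-processing approach is to leave the trained model untouched and apply group-specific decision thresholds to its scores; a hedged sketch (the function name and threshold values are illustrative, not a prescription):

```python
def thresholded_predictions(scores, groups, thresholds):
    """Turn raw classifier scores into decisions using a per-group threshold."""
    return [1 if s >= thresholds[g] else 0 for s, g in zip(scores, groups)]

scores = [0.55, 0.45, 0.55, 0.45]
groups = ['m', 'm', 'h', 'h']
# a lower threshold for group 'h' shifts its positive rate upward
preds = thresholded_predictions(scores, groups, {'m': 0.5, 'h': 0.4})
```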

pros

cons

