Activity 11: Fairness Live#

2026-03-05

Setup#

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix

Creditworthiness prediction#

A bank uses an ML model to predict whether a credit card holder will be a “good” user of credit (\(y = 1\) if they are in good standing, \(y = 0\) if they default or miss payments).

The dataset contains features like credit limit, payment history, education level, and sex.

UCI dataset description

# Load pre-split credit dataset
X_train = pd.read_csv('~/COMSC-335/data/credit/X_train.csv')
y_train = pd.read_csv('~/COMSC-335/data/credit/y_train.csv').values.flatten()
X_test = pd.read_csv('~/COMSC-335/data/credit/X_test.csv')
y_test = pd.read_csv('~/COMSC-335/data/credit/y_test.csv').values.flatten()
A_train = pd.read_csv('~/COMSC-335/data/credit/A_train.csv').values.flatten()
A_test = pd.read_csv('~/COMSC-335/data/credit/A_test.csv').values.flatten()

# add sex as a feature
X_train['SEX_male'] = (A_train == 'male').astype(int)
X_test['SEX_male'] = (A_test == 'male').astype(int)
print(f"Training set: {X_train.shape[0]} examples, {X_train.shape[1]} features")
print(f"Test set:     {X_test.shape[0]} examples")
print(f"Positive (good standing) rate: {y_train.mean():.2f}")
Training set: 8626 examples, 33 features
Test set:     10500 examples
Positive (good standing) rate: 0.50
X_train.columns
Index(['LIMIT_BAL', 'AGE', 'PAY_1', 'PAY_2', 'PAY_3', 'PAY_4', 'PAY_5',
       'PAY_6', 'BILL_AMT1', 'BILL_AMT2', 'BILL_AMT3', 'BILL_AMT4',
       'BILL_AMT5', 'BILL_AMT6', 'PAY_AMT1', 'PAY_AMT2', 'PAY_AMT3',
       'PAY_AMT4', 'PAY_AMT5', 'PAY_AMT6', 'EDUCATION_0', 'EDUCATION_1',
       'EDUCATION_2', 'EDUCATION_3', 'EDUCATION_4', 'EDUCATION_5',
       'EDUCATION_6', 'MARRIAGE_0', 'MARRIAGE_1', 'MARRIAGE_2', 'MARRIAGE_3',
       'Interest', 'SEX_male'],
      dtype='str')

Part 1: Group accuracy#

Let’s fit a model and check the accuracy for each protected group.

StandardScaler

To help with model training, we often transform the features so that each has mean 0 and standard deviation 1. Putting all features on roughly the same scale helps gradient descent converge faster.

Scikit-learn provides a StandardScaler class that does this, and it follows the same fit/transform pattern as the PolynomialFeatures class we saw earlier in the course.
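As a quick sanity check of what StandardScaler does (a standalone sketch with made-up numbers, not the credit data), fit_transform should leave each column with mean ≈ 0 and standard deviation ≈ 1:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature matrix with two features on very different scales
X = np.array([[1_000.0, 0.1],
              [2_000.0, 0.5],
              [3_000.0, 0.9]])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # learns per-column mean/std, then standardizes

print(X_scaled.mean(axis=0))  # each column now has mean ~0
print(X_scaled.std(axis=0))   # each column now has std ~1
```

Note that fit_transform on the training set and transform (without refitting) on the test set is the pattern used below, so the test features are standardized with the training set's statistics.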

# Scale features to have mean 0 and std 1
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
# Fit a logistic regression
model = LogisticRegression()
model.fit(X_train_scaled, y_train)

y_pred = model.predict(X_test_scaled)

print(f"Overall accuracy: {np.mean(y_pred == y_test):.4f}")
Overall accuracy: 0.7908
# TODO: create a boolean mask for male and female using the X_test dataframe
is_male = X_test['SEX_male'] == 1
is_female = X_test['SEX_male'] == 0
# Sanity check: number of female examples in the test set
y_test[is_female].shape
(6379,)


# TODO: compute accuracy for male and female separately, using the boolean masks above
acc_male = np.mean(y_pred[is_male] == y_test[is_male])
acc_female = np.mean(y_pred[is_female] == y_test[is_female])

print(f"Male accuracy:   {acc_male:.4f}")
print(f"Female accuracy: {acc_female:.4f}")
Male accuracy:   0.8362
Female accuracy: 0.7614

Part 2: Implementing fairness through unawareness#

Let’s drop our sensitive attribute from the features and see if the gap in model accuracy closes.

# TODO: drop the 'SEX_male' column from X_train and X_test
X_train_no_sex = X_train.drop(columns=['SEX_male'])
X_test_no_sex = X_test.drop(columns=['SEX_male'])
# Retrain without SEX
scaler2 = StandardScaler()
X_train_no_sex_scaled = scaler2.fit_transform(X_train_no_sex)
X_test_no_sex_scaled = scaler2.transform(X_test_no_sex)

ftu_model = LogisticRegression()
ftu_model.fit(X_train_no_sex_scaled, y_train)

y_pred_ftu = ftu_model.predict(X_test_no_sex_scaled)

# TODO: how would we select the probability P(y = 1) for each example?
y_scores_ftu = ftu_model.predict_proba(X_test_no_sex_scaled)[:, 1]

acc_male_ftu = np.mean(y_pred_ftu[is_male] == y_test[is_male])
acc_female_ftu = np.mean(y_pred_ftu[is_female] == y_test[is_female])

print(f"Overall accuracy: {np.mean(y_pred_ftu == y_test):.4f}")
print()
print(f"WITH SEX:    Male acc = {acc_male:.4f}, Female acc = {acc_female:.4f}")
print(f"WITHOUT SEX: Male acc = {acc_male_ftu:.4f}, Female acc = {acc_female_ftu:.4f}")
Overall accuracy: 0.7910

WITH SEX:    Male acc = 0.8362, Female acc = 0.7614
WITHOUT SEX: Male acc = 0.8469, Female acc = 0.7548

Discuss with folks around you: why might the disparity in accuracy between the two groups persist, even though we removed the sensitive attribute?
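One way to see how this can happen (a self-contained sketch with synthetic data, not the credit dataset): if some remaining feature is correlated with the dropped attribute, the model can still recover group information through that proxy. Here `proxy` is a hypothetical feature built to be a noisy copy of the group label:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5_000

# Synthetic sensitive attribute and a proxy feature correlated with it
group = rng.integers(0, 2, size=n)          # 0/1 group membership
proxy = group + rng.normal(0, 0.5, size=n)  # noisy copy of the group
noise = rng.normal(0, 1, size=n)            # unrelated feature

# Labels depend on the group, but the model never sees `group` itself
y = (group + rng.normal(0, 0.8, size=n) > 0.5).astype(int)

X = np.column_stack([proxy, noise])         # 'group' is NOT a feature
model = LogisticRegression().fit(X, y)

# The proxy gets a large coefficient...
print(model.coef_)
# ...and the model's predictions still track group membership
print(np.corrcoef(model.predict(X), group)[0, 1])
```

Dropping the sensitive attribute ("unawareness") removes the column, but not the information it carried, as long as proxies remain.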

Part 3: Group-specific confusion matrices#

Instead of hiding the sensitive attribute, let’s now measure how the model treats each group.

The sklearn.metrics package gives us convenient functions to compute the metrics we have been learning, including the confusion matrix.

from sklearn.metrics import confusion_matrix

# TODO call confusion_matrix on y_test and y_pred_ftu
confusion_matrix(y_test, y_pred_ftu).flatten()
array([1820,  503, 1692, 6485])

The true positive rate and false positive rates are defined as:

\[ \text{TPR} = \frac{\text{TP}}{\text{TP} + \text{FN}} = Pr(\hat{y} = 1 \mid y = 1) \]
\[ \text{FPR} = \frac{\text{FP}}{\text{FP} + \text{TN}} = Pr(\hat{y} = 1 \mid y = 0) \]
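For example, plugging in the overall counts printed above (tn = 1820, fp = 503, fn = 1692, tp = 6485):

```python
# Overall confusion-matrix counts from the cell above
tn, fp, fn, tp = 1820, 503, 1692, 6485

tpr = tp / (tp + fn)  # Pr(yhat = 1 | y = 1)
fpr = fp / (fp + tn)  # Pr(yhat = 1 | y = 0)

print(f"TPR={tpr:.4f}, FPR={fpr:.4f}")  # TPR=0.7931, FPR=0.2165
```
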

Complete the code below to compute the sklearn confusion matrix for each group separately and extract TPR (true positive rate) and FPR (false positive rate).

We’ll use the y_pred_ftu predictions from the model without sex: ftu_model.

# Common Python pattern: iterate over a list of (name, value) tuples, unpacking each one
for group, mask in [('Male', is_male), ('Female', is_female)]:
    # TODO: compute confusion matrix for the is_male and is_female groups using y_pred_ftu
    c_mat = confusion_matrix(y_test[mask], y_pred_ftu[mask])

    # flatten() makes it easier to extract the elements
    tn, fp, fn, tp = c_mat.flatten()

    # TODO compute TPR and FPR
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)

    print(f"{group} | TPR: {tpr:.4f} | FPR: {fpr:.4f}")
Male | TPR: 0.8468 | FPR: 0.1529
Female | TPR: 0.7594 | FPR: 0.2620

Ponder with folks around you: given that the positive label (\(y = 1\)) is “good credit user” and we were using our model to help with housing loan approvals, should we prioritize making TPR or FPR fairer?

Part 4: Equality of opportunity threshold exploration#

Let’s adjust the threshold for the is_female group to try to satisfy the equality of opportunity criterion.

# TODO change this threshold
threshold = 0.37

# TODO: compute predictions for the is_female group at a lower threshold 
# using y_scores_ftu
y_pred_female_lower_t = (y_scores_ftu[is_female] >= threshold).astype(int)

# Compute TPR and FPR at the new threshold
cm_new = confusion_matrix(y_test[is_female], y_pred_female_lower_t)
tn, fp, fn, tp = cm_new.flatten()
tpr_lower_t = tp / (tp + fn)
fpr_lower_t = fp / (fp + tn)

print(f"Female, threshold={threshold} | TPR={tpr_lower_t:.4f} | FPR={fpr_lower_t:.4f}")
Female, threshold=0.37 | TPR=0.8461 | FPR=0.3661
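Rather than guessing one threshold at a time, a small grid search can find the threshold whose TPR is closest to a target. This is a standalone sketch with synthetic scores; with the real data you would pass `y_test[is_female]`, `y_scores_ftu[is_female]`, and the male TPR from Part 3 as the target:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def tpr_at(y_true, scores, t):
    """TPR when predicting 1 for scores >= t."""
    y_pred = (scores >= t).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).flatten()
    return tp / (tp + fn)

# Synthetic labels and scores: positives tend to score higher than negatives
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=2_000)
scores = np.clip(0.3 * y_true + rng.normal(0.35, 0.2, size=2_000), 0, 1)

target_tpr = 0.8468  # e.g. the male TPR from Part 3

# Sweep a grid of thresholds and keep the one whose TPR is closest to the target
grid = np.linspace(0.05, 0.95, 91)
best_t = min(grid, key=lambda t: abs(tpr_at(y_true, scores, t) - target_tpr))
print(f"threshold={best_t:.2f}, TPR={tpr_at(y_true, scores, best_t):.4f}")
```

Matching the TPRs this way says nothing about the FPRs at the chosen thresholds, which is exactly the tension the final question below asks about.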

What threshold have you found that satisfies the equality of opportunity criterion?

Your response: https://pollev.com/tliu

What do you observe about the FPR at this threshold?