Model Selection & Hyperparameter Tuning
Four models were benchmarked: Logistic Regression (baseline), Random Forest, LightGBM, and XGBoost.
XGBoost achieved the highest AUC (0.893) and was selected for production.
Hyperparameter search used Optuna (Bayesian optimisation, 200 trials) and selected the following configuration:
max_depth: 6, learning_rate: 0.05, n_estimators: 400, subsample: 0.8, colsample_bytree: 0.7, scale_pos_weight: 11.5
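A search of this kind could be set up roughly as follows. This is a minimal sketch, not the configuration actually used: the search ranges, the function names `suggest_params` and `run_search`, and the choice of validation AUC as the objective are all assumptions made for illustration. Only the tuned parameter names and the 200-trial budget come from the text above.

```python
def suggest_params(trial):
    """Hypothetical search space; every range below is an illustrative assumption."""
    return {
        "max_depth": trial.suggest_int("max_depth", 3, 10),
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.3, log=True),
        "n_estimators": trial.suggest_int("n_estimators", 100, 800, step=50),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.5, 1.0),
        "scale_pos_weight": trial.suggest_float("scale_pos_weight", 1.0, 20.0),
    }


def run_search(X_train, y_train, X_val, y_val, n_trials=200):
    """Run an Optuna study that maximises validation AUC (assumed objective)."""
    # Imported here so the search space above can be inspected
    # without optuna, scikit-learn, or xgboost installed.
    import optuna
    from sklearn.metrics import roc_auc_score
    from xgboost import XGBClassifier

    def objective(trial):
        model = XGBClassifier(**suggest_params(trial))
        model.fit(X_train, y_train)
        val_scores = model.predict_proba(X_val)[:, 1]  # P(positive class)
        return roc_auc_score(y_val, val_scores)

    # Optuna's default sampler is TPE, a Bayesian-style optimiser.
    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=n_trials)
    return study.best_params, study.best_value
```

Calling `run_search(X_train, y_train, X_val, y_val)` on the project's own splits would return the best parameter dictionary and its validation AUC. Scoring each trial on a held-out set rather than the training data keeps the search from rewarding overfitting.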
Threshold calibration: the operating threshold was set at 0.38, the value that maximised F1 on the validation set, rather than the default 0.5.
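Choosing a threshold by F1 amounts to scoring a grid of candidate cut-offs on the validation set and keeping the best one. The sketch below shows the idea in plain Python. The function names, the 0.01-step grid, and the toy data are assumptions for illustration, not the procedure or data used here.

```python
def f1_at_threshold(y_true, y_score, threshold):
    """F1 of the positive class when scores >= threshold are predicted positive."""
    tp = fp = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(y_true, y_score, candidates=None):
    """Return (threshold, f1) for the candidate with the highest F1.

    Ties are resolved in favour of the lowest candidate threshold.
    """
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]  # assumed grid: 0.01 .. 0.99
    best = max(candidates, key=lambda t: f1_at_threshold(y_true, y_score, t))
    return best, f1_at_threshold(y_true, y_score, best)


# Toy validation labels and scores (made up for this example).
y_val = [0, 0, 0, 1, 0, 1, 1]
val_scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.90]
threshold, f1 = best_f1_threshold(y_val, val_scores)
```

On imbalanced data the F1-optimal threshold often falls below 0.5, because lowering the cut-off recovers positives that a 0.5 threshold would miss. The threshold should be chosen on validation data only and then held fixed when reporting test metrics.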