Under the Hood

The screening tool uses three widely used machine learning models, each packaged and serialized as a scikit-learn pipeline.
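
As a rough illustration of what packaging a model as a serialized pipeline can look like, the sketch below fits a small scikit-learn Pipeline on synthetic data and round-trips it through pickle. The step names, the scaler, the synthetic data, and the choice of pickle are all assumptions made for the example, not details of the tool itself.

```python
import pickle

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 200 rows of 36 boolean features (assumption).
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 36)).astype(float)
y = (X[:, 0] + X[:, 1] + rng.normal(0, 0.5, size=200) > 1).astype(int)

# Hypothetical pipeline layout: preprocessing and model travel together.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])
pipeline.fit(X, y)

# Serialize the fitted pipeline to bytes and restore it.
blob = pickle.dumps(pipeline)
restored = pickle.loads(blob)
```

Because the preprocessing step is stored inside the pipeline, the restored object applies the same transformations at prediction time as it did during training.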

Logistic Regression

Known for high interpretability, our LR model provides a strong linear baseline. It models the log-odds of the target class as a weighted sum of our 36 validated boolean and ordinal features, so each coefficient can be read as the change in log-odds per unit change in its feature.
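
To make the log-odds reading concrete, here is a minimal sketch with three made-up features instead of the full 36. The weights, bias, and feature values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math


def log_odds(features, weights, bias):
    """Linear score: bias plus the weighted sum of the features."""
    return bias + sum(w * x for w, x in zip(weights, features))


def probability(features, weights, bias):
    """Map the log-odds to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-log_odds(features, weights, bias)))


# Hypothetical coefficients: two boolean features and one ordinal (0-3) feature.
weights = [0.8, -0.5, 0.4]
bias = -1.0
x = [1, 0, 2]

z = log_odds(x, weights, bias)      # -1.0 + 0.8*1 - 0.5*0 + 0.4*2
p = probability(x, weights, bias)

# exp(weight) is the odds ratio for a one-unit increase in that feature.
odds_ratio_first_feature = math.exp(weights[0])
```

In this toy setup, switching the first boolean feature from 0 to 1 multiplies the odds of the positive class by exp(0.8), which is the kind of statement that makes the model easy to audit.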

Random Forest

An ensemble learning method that combines the predictions of many decision trees, each trained on a bootstrap sample of the data. This model captures non-linear relationships and feature interactions, and it tends to be robust to outliers and noisy features.
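
The ensemble idea can be sketched with scikit-learn's RandomForestClassifier: the forest's predicted probabilities are the average of its individual trees' probabilities. The synthetic XOR-style target, the number of trees, and the random seed below are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data: 36 boolean features, with a non-linear (XOR) target
# that a purely linear model could not represent.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(300, 36)).astype(float)
y = X[:, 0].astype(int) ^ X[:, 1].astype(int)

forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(X, y)

# Collect each tree's class probabilities, then average across trees.
per_tree = np.stack([tree.predict_proba(X) for tree in forest.estimators_])
averaged = per_tree.mean(axis=0)
```

Averaging over many trees is what dampens the effect of any single tree that has overfit to a noisy feature or an outlying row.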

XGBoost

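Extreme Gradient Boosting (XGBoost) builds an ensemble of trees sequentially, with each new tree fit to the gradient of the loss for the current ensemble's predictions, a form of gradient descent in function space. It typically delivers strong predictive performance on tabular data and can be configured to handle imbalanced clinical datasets, for example by up-weighting the minority class.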
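
The boosting loop itself can be sketched in a few lines of plain Python. This is a simplified first-order gradient boosting with one-split decision stumps on boolean features, not XGBoost's actual algorithm (which also uses second-order information and regularization). All function names, the synthetic data, and the hyperparameters are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(y, scores):
    """Mean binary cross-entropy for raw scores (log-odds)."""
    eps = 1e-12
    total = 0.0
    for yi, si in zip(y, scores):
        p = min(max(sigmoid(si), eps), 1 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total / len(y)


def fit_stump(X, residuals):
    """Pick the boolean feature whose 0/1 split best fits the residuals."""
    best = None
    for j in range(len(X[0])):
        groups = {0: [], 1: []}
        for row, r in zip(X, residuals):
            groups[row[j]].append(r)
        means = {v: (sum(g) / len(g) if g else 0.0) for v, g in groups.items()}
        sse = sum((r - means[row[j]]) ** 2 for row, r in zip(X, residuals))
        if best is None or sse < best[0]:
            best = (sse, j, means)
    return best[1], best[2]


def boost(X, y, n_rounds=20, learning_rate=0.3):
    scores = [0.0] * len(y)
    stumps = []
    losses = [log_loss(y, scores)]
    for _ in range(n_rounds):
        # Negative gradient of the log loss w.r.t. each score is (y - p).
        residuals = [yi - sigmoid(si) for yi, si in zip(y, scores)]
        j, means = fit_stump(X, residuals)
        stumps.append((j, means))
        # Take a small step in the direction the stump predicts.
        scores = [si + learning_rate * means[row[j]] for row, si in zip(X, scores)]
        losses.append(log_loss(y, scores))
    return stumps, losses


# Synthetic stand-in data: 5 boolean features, positives are the minority class.
rng = random.Random(0)
X = [[rng.randint(0, 1) for _ in range(5)] for _ in range(200)]
y = [1 if (row[0] and (row[1] or row[2])) else 0 for row in X]

stumps, losses = boost(X, y)
```

Each round fits a weak learner to the current errors and adds a damped version of it to the ensemble, so the training loss recorded in losses falls from one round to the next.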