Optimization methods for risk scores learning

Student: Luca Ciabini, 2021

Risk scores are classification models that predict the risk of an event from a score computed as the sum of a few small integers. They are particularly appealing because they are easy to learn, use, and validate, and because they are transparent.
In this thesis, we present improvements to the LCPA algorithm, which is used to solve the RiskSLIM problem and create optimized risk scores.
In particular, we study techniques for feature selection and for handling continuous features, and we try to embed a partial one-hot encoding in the optimization problem.
Experimental results show that combining feature selection with an optimal split of continuous features improves model quality without substantially increasing the learning time.
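
As an illustration of what a risk score looks like in use, the following minimal sketch sums a few integer points and maps the total to a risk estimate. Everything in it is assumed for the example and not taken from the thesis: the feature names, the point values, the intercept, the age threshold of 60, and the use of a logistic link to turn the score into a probability. The threshold indicator stands in for one column of a one-hot style encoding of a continuous feature; the thesis studies how to choose such splits optimally, which this sketch does not attempt.

```python
import math

# Hypothetical scorecard: each binary condition contributes a small integer.
POINTS = {"age_ge_60": 2, "smoker": 3, "exercises": -1}
INTERCEPT = -4  # hypothetical integer offset


def binarize(value, threshold):
    """Turn a continuous value into a 0/1 indicator (value >= threshold)."""
    return 1 if value >= threshold else 0


def risk(features):
    """Return (integer score, estimated risk) for a dict of 0/1 features."""
    score = INTERCEPT + sum(POINTS[name] * value for name, value in features.items())
    # Assumed logistic link from score to probability.
    probability = 1.0 / (1.0 + math.exp(-score))
    return score, probability


# One hypothetical individual: age 67, smoker, does not exercise.
features = {
    "age_ge_60": binarize(67, threshold=60),
    "smoker": 1,
    "exercises": 0,
}
score, probability = risk(features)
print(f"score = {score}, estimated risk = {probability:.3f}")
```

Because the model is only a short table of integers, the score for a given individual can be checked by hand, which is the transparency property mentioned above.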