Deploying machine learning to production is hard for several reasons. One of them is data drift: the training and testing sets no longer correspond to the world we have modeled. As engineers and statisticians, we tend to validate inputs and outputs through logic and statistics to account for drift and outliers. However, these safeguards presuppose that a model is already running in production. Thus, we take the risk of deploying models without understanding their implications.

I, therefore, propose a fourth data set definition, alongside the training, validation, and testing sets: the qualification set. The qualification set is different from the testing and validation sets. It is not defined by real-world data distributions but by tails, corner cases, and testable observations derived through domain knowledge and curiosity. Its purpose is to assert, before deployment, that the machine learning system qualifies for all imaginable scenarios.
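
One way to picture a qualification set is as a list of hand-written cases, each pairing an unusual input with a property the prediction must satisfy, run as a gate before deployment. The sketch below is only an illustration under stated assumptions: `QualificationCase`, `qualify`, the toy `predict_price` model, and the three cases are all hypothetical names invented here, not part of any library or of a specific system.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List

Features = Dict[str, float]


@dataclass(frozen=True)
class QualificationCase:
    """Hypothetical container: one corner case plus the property it must satisfy."""
    name: str
    features: Features
    check: Callable[[float], bool]


def qualify(predict: Callable[[Features], float],
            cases: List[QualificationCase]) -> List[str]:
    """Return the names of all cases whose prediction fails its check."""
    failures = []
    for case in cases:
        try:
            passed = case.check(predict(case.features))
        except Exception:
            # A crash on a corner case counts as a failed qualification.
            passed = False
        if not passed:
            failures.append(case.name)
    return failures


def predict_price(features: Features) -> float:
    """Toy stand-in for a trained model (assumption: a linear price model)."""
    return 1000.0 * features["area_m2"] + 5000.0 * features["rooms"]


# Cases come from domain knowledge and curiosity, not from the data distribution.
CASES = [
    QualificationCase("zero area gives a non-negative price",
                      {"area_m2": 0.0, "rooms": 0.0},
                      lambda y: y >= 0.0),
    QualificationCase("huge area stays finite",
                      {"area_m2": 1e9, "rooms": 1.0},
                      lambda y: math.isfinite(y)),
    QualificationCase("negative area is not priced below zero",
                      {"area_m2": -50.0, "rooms": 2.0},
                      lambda y: y >= 0.0),
]

failures = qualify(predict_price, CASES)
print(failures)
```

With this toy model the third case fails, because a negative area produces a negative price. In a deployment pipeline, a non-empty `failures` list would block the release until the model or its input handling is fixed.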

Hello qualification set!

Nicholai Stålung