A technique used to evaluate how different the test set is from the training set, commonly used in Kaggle competitions to assess the likelihood of a Leaderboard Shakeup. It's a form of Black-box Tests for datasets.
- Label each element of the combined dataset: 1 if in train or 0 if in test
- Train a model to predict the label using the features from the dataset.
If the model performs well (ROC AUC noticeably above 0.5, the score of random guessing) then there is likely a difference between the datasets.
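
The steps above can be sketched with scikit-learn. This is a minimal illustration, not a reference implementation: the helper name `adversarial_auc`, the choice of a random forest, the 5-fold cross-validation, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict


def adversarial_auc(train_X, test_X, seed=0):
    """Return the cross-validated ROC AUC of a train-vs-test classifier.

    Hypothetical helper: the name and defaults are assumptions.
    """
    # Stack both sets and label each row: 1 if from train, 0 if from test.
    X = np.vstack([train_X, test_X])
    y = np.concatenate([
        np.ones(len(train_X), dtype=int),
        np.zeros(len(test_X), dtype=int),
    ])

    # Use out-of-fold probabilities so the score reflects held-out rows only.
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    proba = cross_val_predict(model, X, y, cv=5, method="predict_proba")[:, 1]
    return roc_auc_score(y, proba)


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(500, 5))
test_same = rng.normal(0.0, 1.0, size=(500, 5))     # same distribution as train
test_shifted = rng.normal(1.0, 1.0, size=(500, 5))  # every feature mean shifted

auc_same = adversarial_auc(train, test_same)
auc_shifted = adversarial_auc(train, test_shifted)
print(f"same distribution:    {auc_same:.3f}")
print(f"shifted distribution: {auc_shifted:.3f}")
```

A score near 0.5 means the classifier cannot tell the two sets apart, while a score well above 0.5 suggests the feature distributions differ. Scoring on out-of-fold predictions matters here: a flexible model evaluated on its own training rows would report a high AUC even when the sets are identical.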