Monte Carlo Cross Validation #27

Open
msrepo opened this issue May 25, 2023 · 0 comments
msrepo commented May 25, 2023

Running k-fold cross-validation requires too many training runs; it gives an unbiased estimate, but with high variance. What is our next best option, given that we want to run at most 3 training runs per dataset per architecture?

Monte Carlo cross-validation is an option. Cons: it gives a biased estimate, but with lower variance.
Correcting for bias in Monte Carlo Cross Validation

With n1 samples in the training set and n2 in the test set, taking J such random splits constitutes Monte Carlo cross-validation.
Taking many such splits (larger J) is good.
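A minimal sketch of the procedure (repeated random sub-sampling), where `train_fn` and `eval_fn` are hypothetical stand-ins for one real training run and its test-set evaluation:

```python
import random

def monte_carlo_cv(samples, n_train, J, train_fn, eval_fn, seed=0):
    """Monte Carlo CV: J independent random train/test splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(J):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        model = train_fn(train)              # one training run per split
        scores.append(eval_fn(model, test))  # test score for this split
    return scores

# Toy stand-ins (illustrative only): the "model" is the training mean,
# and the score is 1 / (1 + |train mean - test mean|), so it lies in (0, 1].
train_fn = lambda tr: sum(tr) / len(tr)
eval_fn = lambda m, te: 1.0 / (1.0 + abs(m - sum(te) / len(te)))

scores = monte_carlo_cv(list(range(100)), n_train=80, J=3,
                        train_fn=train_fn, eval_fn=eval_fn)
```

For our budget, `J=3` matches the 3-runs-per-dataset-per-architecture constraint above.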

In gist, using the first method: say we split 100 samples into 80 train and 20 test samples, and we do this split 3 times (Monte Carlo CV), i.e. J = 3. Then corrected variance = (1/J + n2/n1) × uncorrected variance = (1/3 + 1/4) × uncorrected variance = (7/12) × uncorrected variance.
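The arithmetic above can be checked with a small helper, assuming `scores` holds the J per-split test scores (the 7/12 factor is specific to J = 3, n1 = 80, n2 = 20):

```python
def nadeau_bengio_variance(scores, n_train, n_test):
    """Corrected variance of the mean score: (1/J + n2/n1) * s^2."""
    J = len(scores)
    mean = sum(scores) / J
    # sample variance of the J scores (unbiased, divisor J - 1)
    s2 = sum((x - mean) ** 2 for x in scores) / (J - 1)
    return (1.0 / J + n_test / n_train) * s2

# J = 3 splits of 100 samples into 80 train / 20 test:
factor = 1 / 3 + 20 / 80  # = 7/12

# hypothetical per-split scores, just to exercise the helper
corrected = nadeau_bengio_variance([0.8, 0.7, 0.9], n_train=80, n_test=20)
```

The naive variance of the mean would use a factor of 1/J = 1/3; the n2/n1 term inflates it to account for the overlap between the J training sets.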

The second method requires modification and does not seem feasible here.

Relevant references:
Nadeau and Bengio, Inference for the Generalization Error, Machine Learning, 2003
Raschka, Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning, 2018

@msrepo msrepo self-assigned this May 25, 2023