Running k-fold cross-validation requires too many training runs; it gives an unbiased estimate, but with high variance. What is our next best option, given that we want to run at most 3 training runs per dataset per architecture?
Monte Carlo cross-validation is an option. Con: it gives a biased estimate, but with lower variance.
Correcting for bias in Monte Carlo cross-validation
Monte Carlo cross-validation: draw J random splits of the data, each with n1 training samples and n2 test samples.
Taking many such splits (larger J) is good, since it reduces the variance of the estimate.
In gist, using the 1st method: say we split 100 samples into 80 training and 20 test samples, and we do this split 3 times (Monte Carlo CV), i.e. J = 3. Then corrected variance = (1/J + n2/n1) × uncorrected variance = (1/3 + 20/80) × uncorrected variance = (7/12) × uncorrected variance.
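A minimal sketch of this, assuming a scikit-learn-style workflow; the synthetic dataset and `LogisticRegression` model are placeholders for whatever dataset/architecture is actually being evaluated:

```python
# Minimal sketch: Monte Carlo CV (J random 80/20 splits) with the
# Nadeau-Bengio variance correction. Dataset and model are placeholders.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import ShuffleSplit

X, y = make_classification(n_samples=100, random_state=0)

J = 3                                  # number of random splits
splitter = ShuffleSplit(n_splits=J, test_size=0.2, random_state=0)

scores = []
for train_idx, test_idx in splitter.split(X):
    model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))
scores = np.asarray(scores)

n2_over_n1 = 20 / 80                   # test size / train size per split
uncorrected_var = scores.var(ddof=1)   # naive sample variance of the J scores
corrected_var = (1 / J + n2_over_n1) * uncorrected_var  # Nadeau & Bengio (2003)

print(f"mean score: {scores.mean():.3f}")
print(f"uncorrected variance: {uncorrected_var:.5f}")
print(f"corrected variance:   {corrected_var:.5f}")  # (1/3 + 1/4) * uncorrected
```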
The second method would require modification and does not seem feasible here.
Relevant references:
Nadeau, C. and Bengio, Y. (2003). Inference for the Generalization Error. Machine Learning, 52, 239–281.
Raschka, S. (2018). Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning. arXiv:1811.12808.