27 February 2017

Model Evaluation Techniques

(Reference: Data Mining Map)

Things to watch out for in cross-validation:
  • the training data should be a representative sample of the population, so that new data falls within the training data's coverage; otherwise the cross-validated error estimates will be optimistically biased
  • when working with temporal datasets, structure the cross-validation so that all training-set data is collected before the test-set data; otherwise information from the future leaks into the model
  • the larger the number of folds k, the better the error estimates, but the longer the run takes; 10 or more folds is a good default, and for models that train quickly, leave-one-out cross-validation is an option (see the sketch after this list)
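
Below is a minimal sketch of these points using scikit-learn; the synthetic dataset, the logistic-regression estimator, and the fold counts are illustrative assumptions, not prescriptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (KFold, LeaveOneOut, TimeSeriesSplit,
                                     cross_val_score)

X, y = make_classification(n_samples=200, random_state=0)
model = LogisticRegression(max_iter=1000)

# Standard 10-fold cross-validation; shuffling helps each fold stay
# representative of the full sample.
scores = cross_val_score(model, X, y,
                         cv=KFold(n_splits=10, shuffle=True, random_state=0))
print("10-fold accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))

# Temporal data: TimeSeriesSplit keeps every training fold strictly
# earlier in time than its test fold, so nothing leaks from the future.
ts_scores = cross_val_score(model, X, y, cv=TimeSeriesSplit(n_splits=5))

# Leave-one-out: n folds of size one. The estimates improve, but the
# model is refit n times, so reserve this for models that train quickly.
loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut())
```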

ROC curves apply to binary classification, where predictions fall into a negative and a positive class. The area under the ROC curve (AUC) is itself another evaluation metric. For multiclass problems, the one-versus-all trick reduces the problem to one ROC curve per class; in practice one usually inspects both the ROC curves and the confusion matrix.

The confusion matrix shows class-wise accuracy: a two-by-two table for binary classification, and more generally a k-by-k table for k classes, with true classes on one axis and predicted classes on the other.

Regression performance is measured using the root-mean-squared error (RMSE), the mean squared error (MSE), or R-squared. Other regression evaluation metrics include the AIC and BIC.

A brute-force grid search is the standard way to optimize the choice of tuning parameters: each candidate setting is scored by cross-validation, which is what ties cross-validation and model evaluation together.
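
To make the classification metrics concrete, here is a small sketch with scikit-learn; the synthetic data and the 0.5 decision threshold are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
probs = model.predict_proba(X_test)[:, 1]  # P(positive class)

# ROC curve: true-positive rate vs. false-positive rate at every threshold.
fpr, tpr, thresholds = roc_curve(y_test, probs)
print("AUC:", roc_auc_score(y_test, probs))

# Confusion matrix at a 0.5 threshold: true classes in rows, predicted
# classes in columns (2x2 here, k-by-k for k classes).
print(confusion_matrix(y_test, (probs > 0.5).astype(int)))
```

For the multiclass one-versus-all case, recent scikit-learn versions accept a per-class probability matrix via `roc_auc_score(y, prob_matrix, multi_class="ovr")`.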
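The regression metrics can be computed the same way; again the dataset and the linear model are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=300, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pred = LinearRegression().fit(X_train, y_train).predict(X_test)

mse = mean_squared_error(y_test, pred)  # mean squared error
rmse = np.sqrt(mse)                     # RMSE, in the same units as y
print("MSE: %.2f  RMSE: %.2f  R^2: %.3f"
      % (mse, rmse, r2_score(y_test, pred)))
```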
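Finally, a brute-force grid search as described, where every candidate parameter setting is scored by cross-validation; the SVM estimator and the grid values here are assumptions, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)

# Every (C, gamma) pair is evaluated with 5-fold cross-validation, which
# is the tie between parameter tuning and model evaluation noted above.
grid = GridSearchCV(SVC(),
                    param_grid={"C": [0.1, 1, 10],
                                "gamma": [0.01, 0.1, 1]},
                    cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```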