What are the hyperparameters that are tested while choosing the best models?
The following hyperparameters were tested:
– MinNumObj: the minimum number of instances per leaf. It acts as an early-stopping criterion: a split that would leave a leaf with very few instances is not made, which keeps the model more general. Splitting also stops when all instances in a node belong to the same class.
– Confidence factor: the threshold used during pruning to decide whether a branch should be cut. Smaller values mean more pruning. It has been tuned in order to avoid overfitting.
– Num folds: the number of folds used for reduced-error pruning, where one fold is held out for pruning and the rest are used to grow the tree. It did not help the model, so it has been excluded from the hyperparameters to test.
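The search over the two remaining hyperparameters can be sketched as a small grid. This is only an illustration: the candidate values and the names `MIN_NUM_OBJ`, `CONFIDENCE_FACTOR`, and `candidate_settings` are assumptions, since the values actually tried are not listed here.

```python
from itertools import product

# Hypothetical candidate values; the values actually tested are not given in the text.
MIN_NUM_OBJ = [2, 5, 10]
CONFIDENCE_FACTOR = [0.1, 0.25, 0.5]


def candidate_settings(min_num_obj=MIN_NUM_OBJ, confidence=CONFIDENCE_FACTOR):
    """Return every (MinNumObj, confidence factor) combination as a dict."""
    return [
        {"min_num_obj": m, "confidence_factor": c}
        for m, c in product(min_num_obj, confidence)
    ]


# Num folds is left out of the grid because it did not help the model.
settings = candidate_settings()
```

Each dictionary in `settings` would then be passed to whatever tree learner is in use, one model per combination.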
For each test, the results were recorded for the percentage split (70% was used) together with the false positives of those models, and for cross-validation with ten folds together with the false positives of those models. As mentioned before, the variables not used by each algorithm and the variables removed from the model were also recorded.
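The two evaluation schemes and the false-positive count can be sketched with the standard library alone. This is a minimal sketch, not the tool actually used: the helper names `percentage_split`, `k_fold_indices`, and `score` are hypothetical, and the sketch assumes that the 70% refers to the training portion and that class `1` is the positive class.

```python
import random


def percentage_split(n, train_fraction=0.7, seed=0):
    """Shuffle indices 0..n-1 and split them into train and test parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * train_fraction)
    return idx[:cut], idx[cut:]


def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint, near-equal folds
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


def score(y_true, y_pred, positive=1):
    """Return accuracy and the number of false positives."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    false_pos = sum(p == positive and t != positive for t, p in pairs)
    return {"accuracy": correct / len(pairs), "false_positives": false_pos}
```

For each hyperparameter setting, one `score` result would be stored for the percentage split and one aggregated over the ten folds, alongside the lists of unused and removed variables.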