The discussion below was triggered by the following forum post: https://root-forum.cern.ch/t/proper-obtaining-and-comparison-of-results-from-hyperparameteroptimisation/33838/4
Currently the hyperparameter optimisation (HPO) use case is made available in TMVA through the OptimizeConfigParameters and HyperParameterOptimisation classes.
OptimizeConfigParameters: The workhorse of the HPO. Currently does a grid search on a pre-determined range, specialised for each method that supports it (BDT, SVM, ...).
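To make the grid-search behaviour concrete, here is a minimal, self-contained sketch of what such a search does; the parameter names and the toy figure of merit are invented for illustration and are not the actual OptimizeConfigParameters internals:

```cpp
#include <utility>
#include <vector>

// Toy figure of merit, peaking at depth = 3, nTrees = 400. A stand-in for a
// real validation metric such as the ROC integral; purely illustrative.
inline double figureOfMerit(int depth, int nTrees) {
    double d = depth - 3.0;
    double t = (nTrees - 400.0) / 100.0;
    return 1.0 - 0.1 * d * d - 0.05 * t * t;
}

// Exhaustive grid search over a fixed, pre-determined range, mirroring the
// strategy OptimizeConfigParameters applies per method: evaluate every grid
// point and keep the one with the best figure of merit.
inline std::pair<int, int> gridSearch(const std::vector<int>& depths,
                                      const std::vector<int>& treeCounts) {
    std::pair<int, int> best{depths.front(), treeCounts.front()};
    double bestFom = figureOfMerit(best.first, best.second);
    for (int d : depths) {
        for (int n : treeCounts) {
            double fom = figureOfMerit(d, n);
            if (fom > bestFom) {
                bestFom = fom;
                best = {d, n};
            }
        }
    }
    return best;
}
```

The key limitation discussed below is visible here: the ranges (`depths`, `treeCounts`) are fixed by the implementation per method, not supplied by the user.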
HyperParameterOptimisation: Incomplete wrapper that aims to provide some additional functionality on top of OptimizeConfigParameters, namely handling of a three-way split (train/valid/test) and support for uncertainty analysis (through multiple fold evaluations).
I think the situation is thus: HPO is provided through OptimizeConfigParameters, with two notable problems: the search space is not user-configurable, and there is no way of splitting the data into subsets beyond the TMVA standard training/test split.
The HyperParameterOptimisation class was introduced with the intent of addressing these shortcomings but, for whatever reason, was never finished. It currently supports splitting the data into multiple sets (train/valid/test), but cannot access the test set, leaving the user in a state where it is difficult to evaluate the optimised parameters on independent data. This problem is compounded by the fact that the class is separate from the Factory with its Train/Test/EvaluateAllMethods.
Additionally, the HyperParameterOptimisation class supports running multiple optimisations on slightly decorrelated datasets using k-folds. (Note: this is not nested cross-validation.) It is unclear, however, how to use the results of this optimisation in the case of K>1, since each fold can yield a different best parameter set. Also, the results are not actually saved (incomplete implementation).
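A minimal sketch of the fold assignment, to make the K>1 problem concrete. The round-robin split below is only an illustration (the real class delegates splitting to TMVA internals); the point is that running the grid search once per fold produces K candidate parameter sets with no defined way to combine them:

```cpp
#include <cstddef>
#include <vector>

// Assign each of n event indices to one of k folds, round-robin. Each fold's
// optimisation then trains on the other k-1 folds and scores on its own fold,
// yielding one "best" parameter set per fold -- K answers in total.
inline std::vector<std::vector<std::size_t>> kFoldSplit(std::size_t n,
                                                        std::size_t k) {
    std::vector<std::vector<std::size_t>> folds(k);
    for (std::size_t i = 0; i < n; ++i) {
        folds[i % k].push_back(i);
    }
    return folds;
}
```

With e.g. `kFoldSplit(10, 3)` the folds have sizes 4, 3 and 3; the open design question is what to report to the user when the three per-fold optimisations disagree.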
As such, it is unclear whether we want a separate class for HPO at all. Rather, it should be considered part of the normal workflow: instead of fixing some parameters, you specify a range for them. Whichever form this takes, the following pieces are needed:
- A user-visible way of performing the three-way split (currently provided by the HyperParameterOptimisation class).
- A user-visible way of moving data from the original training and test sets back into the "active" training and test sets (e.g. returning the validation events to the training set for the final fit).
- A user-visible way to specify parameter search space.
- Reporting of relevant metrics to the user, e.g. returning the figure of merit for every point evaluated, including the final, best one.
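The first two requirements above can be sketched together. This is a plain, self-contained illustration, not the TMVA data-loading machinery; the fractions, the deterministic (unshuffled) split, and the helper names are all invented for the sketch:

```cpp
#include <cstddef>
#include <vector>

struct ThreeWaySplit {
    std::vector<std::size_t> train, valid, test;
};

// Deterministic fractional split of n event indices (shuffling omitted for
// brevity). The HPO trains candidates on `train`, scores them on `valid`, and
// only the final, chosen configuration is evaluated once on `test`.
inline ThreeWaySplit splitThreeWay(std::size_t n, double fTrain, double fValid) {
    ThreeWaySplit s;
    std::size_t nTrain = static_cast<std::size_t>(n * fTrain);
    std::size_t nValid = static_cast<std::size_t>(n * fValid);
    for (std::size_t i = 0; i < n; ++i) {
        if (i < nTrain)
            s.train.push_back(i);
        else if (i < nTrain + nValid)
            s.valid.push_back(i);
        else
            s.test.push_back(i);
    }
    return s;
}

// "Moving data back": after the optimisation, fold the validation events back
// into the training set so the final model trains on all non-test data.
inline std::vector<std::size_t> mergeForFinalFit(const ThreeWaySplit& s) {
    std::vector<std::size_t> all = s.train;
    all.insert(all.end(), s.valid.begin(), s.valid.end());
    return all;
}
```

Keeping `test` untouched throughout is what gives the independent evaluation that the current HyperParameterOptimisation class cannot provide.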
One potential way of integrating into the TMVA workflow could then be:
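A hypothetical sketch of such an integration. Factory, DataLoader, BookMethod, the Types::kBDT enum and the Train/Test/EvaluateAllMethods calls are real TMVA entry points; the range syntax in the option string and the optimisation-aware behaviour are invented here purely to illustrate the idea:

```cpp
// Hypothetical sketch only -- the range syntax below does not exist in TMVA.
TMVA::Factory factory("hpoJob", outputFile, "AnalysisType=Classification");
TMVA::DataLoader loader("dataset");
// ... AddVariable / AddTree / PrepareTrainingAndTestTree as usual ...

// Instead of fixing parameters, specify a range for them (invented syntax):
factory.BookMethod(&loader, TMVA::Types::kBDT, "BDT",
                   "NTrees=[200,1000]:MaxDepth=[2,5]:Shrinkage=0.1");

factory.TrainAllMethods();    // would optimise over the ranges on a validation split
factory.TestAllMethods();     // would evaluate the chosen point on the test set
factory.EvaluateAllMethods(); // would report per-point and best figures of merit
```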