Hyperopt loss
Created by: APJansen
Improving hyperoptimization, experimenting with different hyperoptimization loss functions
Tasks done in this PR
- Added a new HyperLoss class with built-in methods that automatically compute statistics over replicas and then over folds. The user can select the statistics via replica_statistic and fold_statistic under kfold in the runcard.
- Added a new statistic over replicas, \varphi^2_{k}, that (in addition to \chi^{2}) can be selected via the loss_type option under kfold in the runcard.
- Addressed #1894 (closed). Regardless of the metric selected for Hyperopt to minimise (\varphi^2 or \chi^{2}), we also write to tries.json (specifically within the kfold_meta entry) a matrix (folds x replicas) of the calculated \chi^{2} values, named hyper_losses_chi2, and a vector (folds) of \varphi^2 values, named hyper_losses_phi2.
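As a rough illustration of the shapes described in the last item, the sketch below parses a hypothetical kfold_meta excerpt and checks that the \chi^{2} matrix and the \varphi^2 vector agree on the number of folds. Only the names hyper_losses_chi2 and hyper_losses_phi2 come from this PR; the numbers are made up, and the surrounding layout of tries.json is not reproduced here.

```python
import json

# Hypothetical excerpt of a single kfold_meta entry.
# The values are invented for illustration only.
kfold_meta_text = """
{
  "hyper_losses_chi2": [[1.2, 1.4, 1.0], [2.0, 1.8, 2.2]],
  "hyper_losses_phi2": [0.30, 0.45]
}
"""

kfold_meta = json.loads(kfold_meta_text)
chi2_matrix = kfold_meta["hyper_losses_chi2"]  # shape: folds x replicas
phi2_vector = kfold_meta["hyper_losses_phi2"]  # shape: folds

n_folds = len(chi2_matrix)
n_replicas = len(chi2_matrix[0])

# One phi2 value is expected per fold.
if len(phi2_vector) != n_folds:
    raise ValueError("phi2 vector and chi2 matrix disagree on the number of folds")

# Per-fold chi2 averaged over replicas, recomputed from the stored matrix.
chi2_fold_means = [sum(row) / n_replicas for row in chi2_matrix]
```

Storing the full matrix means any replica statistic can be recomputed after the fact, whichever loss_type was used during the scan.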
Description
The implemented HyperLoss is instantiated within ModelTrainer and later used in ModelTrainer.hyperparametrizable. The user must pass three parameters that are set in the runcard:
- loss_type: The type of loss to be used. Options are chi2 or phi2.
- replica_statistic: The statistic over replicas to be used within each fold. For loss_type = chi2, it can take the usual statistics: average, best_worst and std. Note: replica_statistic is inactive if loss_type = phi2, as \varphi^2_{k} is by definition a statistic over replicas.
- fold_statistic: The statistic over folds. Options are: average, best_worst and std.
- As discussed by @juanrojochacon, any \varphi^2 statistic over folds is computed as the reciprocal of the chosen function. For example, for loss_type: phi2 and fold_statistic: average, the figure of merit to be minimised is actually:
  \left( \frac{1}{n_\text{fold}} \sum_{k=1}^{n_\text{fold}} \varphi^2_{k} \right)^{-1}
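The way the three parameters combine can be sketched in a few lines of plain Python. This is not the HyperLoss code from this PR: the function name hyper_loss, the STATISTICS mapping, and the choice of population standard deviation for std are assumptions made for illustration, and best_worst is taken to mean the worst (largest) value.

```python
from statistics import fmean, pstdev

# Assumed mapping from runcard names to reductions (illustrative only).
STATISTICS = {
    "average": fmean,
    "best_worst": max,  # assumed: the worst, i.e. largest, value
    "std": pstdev,      # assumed: population standard deviation
}


def hyper_loss(values, loss_type, fold_statistic, replica_statistic=None):
    """Reduce per-fold losses to a single figure of merit.

    For loss_type "chi2", values is a matrix (folds x replicas).
    For loss_type "phi2", values is a vector with one phi2 per fold.
    """
    fold_reduce = STATISTICS[fold_statistic]
    if loss_type == "chi2":
        replica_reduce = STATISTICS[replica_statistic]
        # First reduce over replicas within each fold, then over folds.
        per_fold = [replica_reduce(row) for row in values]
        return fold_reduce(per_fold)
    if loss_type == "phi2":
        # phi2 is already a statistic over replicas, so replica_statistic
        # is ignored; the fold statistic enters through its reciprocal.
        return 1.0 / fold_reduce(values)
    raise ValueError(f"unknown loss_type: {loss_type!r}")


chi2_loss = hyper_loss(
    [[1.0, 3.0], [2.0, 4.0]],
    loss_type="chi2",
    replica_statistic="average",
    fold_statistic="average",
)
phi2_loss = hyper_loss([0.5, 0.25, 0.4], loss_type="phi2", fold_statistic="best_worst")
```

With these made-up inputs, the \chi^2 case averages the per-fold means 2.0 and 3.0, while the \varphi^2 case takes the reciprocal of the largest per-fold value.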
The current implementation of \varphi^2_{k} is based on validphys functions. It is evaluated using only the experimental data within the held-out fold (as expected).
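For orientation, the quantity being computed can be sketched as follows, assuming the usual NNPDF definition \varphi^2 = \langle \chi^2[T^{(r)}] \rangle_\text{rep} - \chi^2[\langle T^{(r)} \rangle_\text{rep}], with \chi^2 normalised per data point. The sketch simplifies to uncorrelated uncertainties (a diagonal covariance matrix); the actual calculation relies on validphys and its covariance handling, and the helper names here are hypothetical.

```python
from statistics import fmean


def chi2_per_point(prediction, data, sigma):
    """Chi2 per data point, assuming uncorrelated uncertainties."""
    terms = [((t - d) / s) ** 2 for t, d, s in zip(prediction, data, sigma)]
    return sum(terms) / len(data)


def phi2(replica_predictions, data, sigma):
    """Average replica chi2 minus the chi2 of the replica-averaged prediction."""
    chi2_replicas = [chi2_per_point(pred, data, sigma) for pred in replica_predictions]
    # Central prediction: average over replicas, point by point.
    central = [fmean(column) for column in zip(*replica_predictions)]
    return fmean(chi2_replicas) - chi2_per_point(central, data, sigma)


# Toy held-out fold: two data points, two replicas (made-up numbers).
data = [1.0, 2.0]
sigma = [1.0, 1.0]
replicas = [[0.0, 2.0], [2.0, 2.0]]
phi2_k = phi2(replicas, data, sigma)
```

In this diagonal case \varphi^2 reduces to the replica variance in units of the data uncertainty, averaged over data points, so it measures the spread of the replicas rather than their agreement with the data.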
Runcard examples
- Default run: hyper_loss is set to the \chi^2 averaged over replicas and then over folds.
kfold:
  loss_type: chi2
  replica_statistic: average
  fold_statistic: average
  penalties:
    - saturation
    - patience
    - integrability
  ...
- Computing \varphi^2_{k} within each fold and then setting hyper_loss to the inverse of the maximum of [\varphi^2_{1}, \varphi^2_{2}, ..., \varphi^2_{n_\text{fold}}]:
kfold:
  loss_type: phi2
  fold_statistic: best_worst
  penalties:
    - saturation
    - patience
    - integrability
  ...
Notes
This PR must be merged after #1788, as the current hyperopt_loss branch was created from trvl-mask-layers.