Hyperopt loss (!1726) · Merge requests · Emanuele Roberto Nocera / nnpdf

Emanuele Roberto Nocera requested to merge hyperopt_loss into master May 04, 2023

Created by: APJansen

Improving hyperoptimization, experimenting with different hyperoptimization loss functions

Tasks done in this PR

Added a new HyperLoss class with built-in methods that automatically compute statistics over replicas and then over folds. The user selects the statistics via replica_statistic and fold_statistic in the kfold section of the runcard.
Added a new statistic over replicas, \varphi^2_{k}, that (in addition to \chi^{2}) can be selected via the loss_type option in the kfold section of the runcard.
Addressed #1894 (closed). Regardless of the metric that Hyperopt minimises (\varphi^2 or \chi^{2}), tries.json now also records, within the kfold_meta entry, a matrix (folds x replicas) of calculated \chi^{2} values named hyper_losses_chi2 and a vector (one entry per fold) of \varphi^2 values named hyper_losses_phi2.
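
As an illustration of how these two entries might be consumed, here is a minimal sketch. Only the key names kfold_meta, hyper_losses_chi2 and hyper_losses_phi2 come from this MR; the numbers, the surrounding layout of the trial and the helper fold_summary are hypothetical.

```python
import json

# Hypothetical excerpt of a single trial. Only the key names are taken from
# the MR description; the values are made up for illustration.
trial_json = """
{
  "kfold_meta": {
    "hyper_losses_chi2": [[1.2, 1.4, 1.3], [2.1, 1.9, 2.0]],
    "hyper_losses_phi2": [0.25, 0.40]
  }
}
"""


def fold_summary(trial):
    """Return one summary dict per fold (hypothetical helper)."""
    meta = trial["kfold_meta"]
    chi2 = meta["hyper_losses_chi2"]  # matrix: folds x replicas
    phi2 = meta["hyper_losses_phi2"]  # vector: one value per fold
    if len(chi2) != len(phi2):
        raise ValueError("chi2 and phi2 entries disagree on the number of folds")
    return [
        {"fold": k, "mean_chi2": sum(row) / len(row), "phi2": p}
        for k, (row, p) in enumerate(zip(chi2, phi2))
    ]


summary = fold_summary(json.loads(trial_json))
for entry in summary:
    print(entry)
```

Because both quantities are stored regardless of loss_type, a comparison like this can be made after the fact without rerunning the scan.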

Description

The implemented HyperLoss is instantiated within ModelTrainer and later used in ModelTrainer.hyperparametrizable. The user must pass three parameters, which are set in the runcard:

loss_type: The type of loss to be used. Options are chi2 or phi2.
replica_statistic: The statistic over replicas to be used within each fold. For loss_type = chi2, the options are average, best_worst and std. Note: replica_statistic is ignored if loss_type = phi2, as \varphi^2_{k} is by definition a statistic over replicas.
fold_statistic: The statistic over folds. Options are: average, best_worst and std.

As discussed by @juanrojochacon, any \varphi^2 statistic over folds is computed as the reciprocal of the chosen function. For example, for loss_type: phi2 and fold_statistic: average, the figure of merit to be minimised is:

\left( \frac{1}{n_\text{fold}} \sum_{k=1}^{n_\text{fold}} \varphi^2_{k} \right)^{-1}
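
The reduction described above can be sketched in a few lines of NumPy. This is not the HyperLoss implementation from this MR. The function reduce_hyper_loss and the STATISTICS table are hypothetical names, and the statistic definitions are assumptions: best_worst is taken as the maximum (consistent with the phi2 runcard example in this description) and std as the population standard deviation.

```python
import numpy as np

# Assumed definitions of the three statistics (not taken from the n3fit code).
STATISTICS = {
    "average": lambda x, axis=None: np.mean(x, axis=axis),
    "best_worst": lambda x, axis=None: np.max(x, axis=axis),
    "std": lambda x, axis=None: np.std(x, axis=axis),
}


def reduce_hyper_loss(losses, loss_type="chi2",
                      replica_statistic="average", fold_statistic="average"):
    """Reduce per-fold losses to a single figure of merit (hypothetical helper).

    For loss_type="chi2", `losses` has shape (folds, replicas).
    For loss_type="phi2", `losses` has shape (folds,).
    """
    losses = np.asarray(losses, dtype=float)
    fold_stat = STATISTICS[fold_statistic]
    if loss_type == "chi2":
        # First reduce over replicas within each fold, then over folds.
        per_fold = STATISTICS[replica_statistic](losses, axis=1)
        return float(fold_stat(per_fold))
    if loss_type == "phi2":
        # replica_statistic is ignored: phi2 is already a statistic over
        # replicas. The minimised quantity is the reciprocal of the fold
        # statistic.
        return float(1.0 / fold_stat(losses))
    raise ValueError(f"Unknown loss_type: {loss_type!r}")


chi2 = [[1.0, 3.0], [2.0, 4.0]]  # 2 folds x 2 replicas, made-up values
phi2 = [0.5, 0.25]               # one made-up value per fold

loss_chi2 = reduce_hyper_loss(chi2)
loss_phi2_avg = reduce_hyper_loss(phi2, loss_type="phi2")
loss_phi2_bw = reduce_hyper_loss(phi2, loss_type="phi2", fold_statistic="best_worst")
```

Taking the reciprocal means that minimising the figure of merit favours larger \varphi^2 values, whereas for chi2 the statistic itself is minimised.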

The current implementation of \varphi^2_{k} is based on validphys functions. As expected, it is evaluated using only the experimental data within the hold-out fold.
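
For reference, this MR does not spell out the definition of \varphi^2_{k}. Assuming the definition of \varphi commonly used in NNPDF analyses, it would read as follows for the hold-out fold k, where T^{(r)} denotes the predictions of replica r and the averages run over replicas. The notation here is an assumption made for illustration.

```latex
\varphi^2_{k} =
  \left\langle \chi^2_{k}\left[ T^{(r)} \right] \right\rangle_{\text{rep}}
  - \chi^2_{k}\left[ \left\langle T^{(r)} \right\rangle_{\text{rep}} \right]
```

That is, the replica-averaged \chi^2 minus the \chi^2 of the replica-averaged prediction, both computed on the data of fold k only.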

Runcard examples

Default run: hyper_loss is set as the \chi^2 averaged over replicas and then over folds.

kfold:
      loss_type: chi2
      replica_statistic: average
      fold_statistic: average
      penalties:
        - saturation
        - patience
        - integrability
...

Compute \varphi^2_{k} within each fold and then set hyper_loss to the inverse of the maximum of [\varphi^2_{1}, \varphi^2_{2}, ..., \varphi^2_{n_\text{fold}}]:

kfold:
      loss_type: phi2
      fold_statistic: best_worst
      penalties:
        - saturation
        - patience
        - integrability
...

Notes

This MR must be merged after #1788, as the hyperopt_loss branch was created from trvl-mask-layers.
