
Hyperopt loss

Emanuele Roberto Nocera requested to merge hyperopt_loss into master

Created by: APJansen

Improving hyperoptimization, experimenting with different hyperoptimization loss functions

Tasks done in this PR

  • Added a new HyperLoss class with built-in methods that automatically compute statistics over replicas and then over folds. The user selects the statistics via replica_statistic and fold_statistic in the kfold section of the runcard.
  • Added a new statistic over replicas, \varphi^2_{k}, that (in addition to \chi^{2}) can be selected via the loss_type option in the kfold section of the runcard.
  • Addressed #1894 (closed). Regardless of the metric selected for Hyperopt to minimise (\varphi^2 or \chi^{2}), we also write to tries.json (specifically within the kfold_meta entry) a matrix (folds x replicas) of calculated \chi^{2} values named hyper_losses_chi2 and a vector (folds) of \varphi^2 values named hyper_losses_phi2.

Description

The implemented HyperLoss is instantiated within ModelTrainer and later used in ModelTrainer.hyperparametrizable. The user must pass three parameters, which are set in the runcard:

  • loss_type: The type of loss to be used. Options are chi2 or phi2.
  • replica_statistic: The statistic over replicas to be used within each fold. For loss_type = chi2, it can take the usual values: average, best_worst and std. Note: replica_statistic is ignored if loss_type = phi2, as \varphi^2_{k} is by definition a statistic over replicas.
  • fold_statistic: The statistics over folds. Options are: average, best_worst and std.
  • As discussed by @juanrojochacon, any \varphi^2 statistic over folds is computed as the reciprocal of the chosen function. For example, for loss_type: phi2 and fold_statistic: average, the figure of merit to be minimised is actually: \left( \frac{1}{n_\text{fold}} \sum_{k=1}^{n_\text{fold}} \varphi^2_{k} \right)^{-1}
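
The two-stage reduction described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the HyperLoss code from this PR: the function name hyper_loss, its arguments, and the mapping of best_worst to the maximum and std to the population standard deviation are assumptions made for the sketch.

```python
from statistics import mean, pstdev

# Assumed mapping from runcard option names to reductions (illustrative only).
STATISTICS = {
    "average": mean,
    "best_worst": max,   # assumed: the worst (largest) value
    "std": pstdev,       # assumed: population standard deviation
}


def hyper_loss(loss_type, fold_statistic, replica_statistic="average",
               chi2_matrix=None, phi2_per_fold=None):
    """Reduce per-fold losses to a single figure of merit.

    chi2_matrix: list of rows, one row of replica chi2 values per fold.
    phi2_per_fold: list with one phi2 value per fold.
    """
    fold_stat = STATISTICS[fold_statistic]
    if loss_type == "chi2":
        # First reduce over replicas within each fold, then over folds.
        replica_stat = STATISTICS[replica_statistic]
        per_fold = [replica_stat(row) for row in chi2_matrix]
        return fold_stat(per_fold)
    if loss_type == "phi2":
        # phi2 is already a statistic over replicas; the fold statistic
        # enters through its reciprocal.
        return 1.0 / fold_stat(phi2_per_fold)
    raise ValueError(f"unknown loss_type: {loss_type!r}")
```

With made-up inputs, hyper_loss("chi2", "average", "average", chi2_matrix=[[1.0, 3.0], [2.0, 4.0]]) averages each row and then the two row means, while hyper_loss("phi2", "best_worst", phi2_per_fold=[0.5, 0.25]) returns the reciprocal of the largest \varphi^2.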

The current implementation of \varphi^2_{k} is based on validphys functions. It is evaluated using only the experimental data in the held-out fold (as expected).
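
For orientation, the quantity can be sketched with NumPy. The sketch assumes the definition commonly used in NNPDF analyses, \varphi^2 = \langle \chi^2[T^{(r)}] \rangle_\text{rep} - \chi^2[\langle T \rangle_\text{rep}], with \chi^2 normalised per data point. It does not call validphys, and the function names and array layout are hypothetical.

```python
import numpy as np


def chi2_per_point(theory, data, covmat):
    """chi2 of one prediction vector against data, divided by n_data."""
    diff = theory - data
    # Solve C x = diff instead of forming the inverse covariance explicitly.
    return float(diff @ np.linalg.solve(covmat, diff)) / data.size


def phi2(replica_theory, data, covmat):
    """phi2 for one fold (assumed definition, see the text above).

    replica_theory: array of shape (n_replicas, n_data) with the
    predictions of each replica for the held-out data points.
    """
    mean_chi2 = np.mean([chi2_per_point(t, data, covmat) for t in replica_theory])
    central_chi2 = chi2_per_point(replica_theory.mean(axis=0), data, covmat)
    return float(mean_chi2 - central_chi2)
```

Under this definition \varphi^2 measures the spread of the replica predictions in units of the experimental covariance, so it vanishes when all replicas give identical predictions.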

Runcard examples

  • Default run: hyper_loss is set as the \chi^2 averaged over replicas and then over folds.
kfold:
      loss_type: chi2
      replica_statistic: average
      fold_statistic: average
      penalties:
        - saturation
        - patience
        - integrability
...
  • Computing \varphi^2_{k} within each fold and then setting hyper_loss to the inverse of the maximum of [\varphi^2_{1}, \varphi^2_{2}, ..., \varphi^2_{n_\text{fold}}]:
kfold:
      loss_type: phi2
      fold_statistic: best_worst
      penalties:
        - saturation
        - patience
        - integrability
...
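
As a rough illustration of how these options interact, the following sketch checks a kfold dictionary as it might look after the runcard has been parsed. The function check_kfold_loss_settings is hypothetical and not part of n3fit; the defaults are assumed to match the default run described above.

```python
ALLOWED_LOSS_TYPES = {"chi2", "phi2"}
ALLOWED_STATISTICS = {"average", "best_worst", "std"}


def check_kfold_loss_settings(kfold):
    """Return the effective loss settings for a parsed kfold section.

    Hypothetical helper: defaults (chi2, average, average) are assumed
    from the default run described in the text.
    """
    loss_type = kfold.get("loss_type", "chi2")
    replica_statistic = kfold.get("replica_statistic", "average")
    fold_statistic = kfold.get("fold_statistic", "average")

    if loss_type not in ALLOWED_LOSS_TYPES:
        raise ValueError(f"loss_type must be one of {sorted(ALLOWED_LOSS_TYPES)}")
    if fold_statistic not in ALLOWED_STATISTICS:
        raise ValueError(f"fold_statistic must be one of {sorted(ALLOWED_STATISTICS)}")

    if loss_type == "phi2":
        # phi2 is itself a statistic over replicas, so the option is ignored.
        replica_statistic = None
    elif replica_statistic not in ALLOWED_STATISTICS:
        raise ValueError(f"replica_statistic must be one of {sorted(ALLOWED_STATISTICS)}")

    return {
        "loss_type": loss_type,
        "replica_statistic": replica_statistic,
        "fold_statistic": fold_statistic,
    }
```

For the second runcard example, check_kfold_loss_settings({"loss_type": "phi2", "fold_statistic": "best_worst"}) reports replica_statistic as None, reflecting that the option is inactive for phi2.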

Notes

This PR must be merged after #1788, as the current hyperopt_loss branch was created from trvl-mask-layers.
