
Avoiding duplicated computations by having a single observable model

Open. Emanuele Roberto Nocera requested to merge merge-observable-splits into master

Created by: APJansen

Goal

The goal of this PR is to speed up the code by a factor of 2 via a refactoring that avoids redoing the same computations. Currently there are separate training and validation models. At every training step the validation model is run from scratch on the x inputs, even though its only difference from the training model is the final masking applied just before computing the loss.

This will hopefully also improve readability. From an ML point of view the current naming is very confusing. Instead of having a training model and a validation model, we can have a single observable model with a training loss and a validation loss on top of it. (This is just about naming; they may still be MetaModels.)

The same holds of course for the experimental model, except that there is no significant performance cost there. But for consistency and readability let's try to treat it on the same footing.

This PR branches off of trvl-mask-layers because that PR changes the masking. That one should be merged before this one.

Current implementation

Models creation

The models are constructed in ModelTrainer._model_generation, specifically in the function _pdf_injection, which is given the pdfs, a list of observables and a corresponding list of masks. For the different "models", not only the values of the masks but also the list of observables change, as not all models use all observables (in particular the positivity and integrability ones). This function just calls the observables on the pdfs with the mask as argument. And each observable's call method, defined here, does two steps: 1. compute the observable, 2. apply the mask and compute the loss.
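For orientation, here is a minimal toy sketch of that coupled pattern. All names (ObservableWithLoss, the FK-table convolution, the chi2 form) are hypothetical simplifications for illustration, not the real n3fit classes:

```python
import tensorflow as tf

class ObservableWithLoss(tf.keras.layers.Layer):
    """Toy version of the current pattern: one layer both computes
    the observable and applies the mask + loss in a single call."""

    def __init__(self, fktable, data, invcovmat, **kwargs):
        super().__init__(**kwargs)
        self.fktable = tf.constant(fktable, dtype=tf.float32)
        self.data = tf.constant(data, dtype=tf.float32)
        self.invcovmat = tf.constant(invcovmat, dtype=tf.float32)

    def call(self, pdf, mask):
        # step 1: compute the observable (toy FK-table convolution)
        obs = tf.linalg.matvec(self.fktable, pdf)
        # step 2: mask and compute a chi2-like loss; this is the part
        # the PR proposes to split off into its own loss layer
        diff = tf.boolean_mask(obs - self.data, mask)
        invcov = tf.boolean_mask(
            tf.boolean_mask(self.invcovmat, mask, axis=0), mask, axis=1
        )
        return tf.einsum("i,ij,j->", diff, invcov, diff)
```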

Models usage

Once they are created, the training model is, obviously, used for training here. The validation model is used to initialize the Stopping object; the only thing that happens there is that its compute_losses method is called. Similarly for the experimental model, which is called directly in the ModelTrainer (here).

Changes proposed

  1. Decouple the masking and loss computation from the ObservableWrapper class: remove those parts from ObservableWrapper and perhaps create an ObservableLoss layer that does this.
  2. Apply this pure observable class to the pdfs, for all observables, to create an observables_model.
  3. Create 3 loss models that take all observables as input, apply the masks, select the relevant observables, and compute the losses.
  4. For the training one, put it on top of the observables_model, to create a model identical to the current training model.
  5. Add the output of the observables_model to the output list of this training model, so these can be reused.
  6. The validation and experimental models can be discarded; instead we have validation and experimental losses that are applied to the output of the observables_model. So e.g. we can replace self.experimental["model"].compute_losses() with experimental_loss(observables). A sketch of this layout follows the list.
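Concretely, the wiring could look something like the following Keras-functional sketch. Everything here is hypothetical (build_models, tr_losses, etc. are made-up names), it ignores that not every loss uses every observable, and the real objects are MetaModels:

```python
import tensorflow as tf

def build_models(pdf_input, observable_layers, tr_losses, val_losses, exp_losses):
    # (2) a single pure observables model, evaluated once per step
    observables = [obs(pdf_input) for obs in observable_layers]
    observables_model = tf.keras.Model(pdf_input, observables)

    # (3) three sets of loss layers; each masks/selects its own observables
    tr_out = [loss(o) for loss, o in zip(tr_losses, observables)]
    val_out = [loss(o) for loss, o in zip(val_losses, observables)]
    exp_out = [loss(o) for loss, o in zip(exp_losses, observables)]

    # (4)+(5) the training model stacks the training losses on top of the
    # observables model and also exposes the raw observables for reuse
    training_model = tf.keras.Model(pdf_input, tr_out + observables)

    # (6) no separate validation/experimental models: their losses are
    # applied directly to the output of observables_model
    return observables_model, training_model, val_out, exp_out
```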


Activity

  • requested review from @enocera

  • Created by: APJansen

I'm looking into the first point, decoupling the computation of the observables from their masking and loss. Some questions for @goord:

    Currently in _generate_experimental_layer this happens:

    1. observables computed
    2. masks applied one by one
    3. masked observables concatenated
    4. some rotation
    5. loss computed

    What does this rotation do?

And is it possible to change this to the following (at the cost of concatenating the masks inside observable_generator)?

    1. observables computed
    2. UNmasked observables concatenated
    3. (rotation?)

and in a subsequent loss layer (see the sketch after this list):

    1. masks applied to concatenated observables
    2. loss computed
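Roughly like this (a toy sketch with hypothetical names, leaving the rotation out until the question below is settled):

```python
import tensorflow as tf

def pure_observables(pdf, observable_layers):
    # 1. observables computed, 2. concatenated UNmasked
    return tf.concat([obs(pdf) for obs in observable_layers], axis=0)

def loss_layer(concat_obs, concat_mask, data, invcovmat):
    # 1. the pre-concatenated mask applied to the concatenated observables
    diff = tf.boolean_mask(concat_obs - data, concat_mask)
    invcov = tf.boolean_mask(
        tf.boolean_mask(invcovmat, concat_mask, axis=0), concat_mask, axis=1
    )
    # 2. loss computed
    return tf.einsum("i,ij,j->", diff, invcov, diff)
```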

Also, I've probably seen this before, but I'm still confused why a mask is applied both directly to the observables in _generate_experimental_layer and also inside LossInvcovmat itself?

  • Created by: goord

Yes, these rotations are triggered by 'data_transformation_tr', which is used if you represent the experimental data in a covariance-diagonal basis, I guess. I'm not sure when this is actually used, nor whether this code path is properly tested in the trvl-mask-layers branch...

  • Created by: goord

The mask in LossInvcovmat is not used for masking training/validation, I think.

  • mentioned in issue #1803

  • Created by: APJansen

    @goord Why does the experimental output have rotation=None while the others have rotation=obsrot, around here? Is that intentional? It interferes a bit with how I thought the observable computation would decouple from the masked loss.

  • Created by: goord

> @goord Why does the experimental output have rotation=None while the others have rotation=obsrot, around here? Is that intentional? It interferes a bit with how I thought the observable computation would decouple from the masked loss.

This is a rewrite of line 289 in master. I don't know why the diagonal basis is not used for the experimental output layer; perhaps @scarlehoff or @RoyStegeman can explain it to us.

    If you look at n3fit_data.py you can see that in the diagonal basis, the training and validation covmats are being masked and then inverted, but the full covmat inverse (inv_true) is computed in the old basis.
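This distinction matters: for a correlated covmat, masking and then inverting is not the same as inverting the full covmat and then masking. A quick toy check (editorial illustration, not code from the PR):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 4))
cov = a @ a.T + 4 * np.eye(4)          # a generic correlated covmat
mask = np.array([True, True, False, True])

inv_of_masked = np.linalg.inv(cov[np.ix_(mask, mask)])
masked_of_inv = np.linalg.inv(cov)[np.ix_(mask, mask)]
print(np.allclose(inv_of_masked, masked_of_inv))   # False in general
```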

  • Created by: scarlehoff

Because when they were separate it didn't really matter, and it is decoupled from training/validation (the idea of diagonalising is to be able to do the split while removing the correlations between training and validation within a dataset).
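In other words (an editorial toy check, not code from the PR): rotating to the eigenbasis of the covmat makes it diagonal, so a training/validation split of the rotated points carries no cross-correlations between the two subsets:

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=(4, 4))
cov = a @ a.T + 4 * np.eye(4)

eigvals, eigvecs = np.linalg.eigh(cov)
rotated = eigvecs.T @ cov @ eigvecs            # diagonal up to round-off
print(np.allclose(rotated, np.diag(eigvals)))  # True

# the block coupling a tr split {0, 1} to a val split {2, 3} vanishes
print(np.allclose(rotated[np.ix_([0, 1], [2, 3])], 0.0))  # True
```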

  • Created by: APJansen

Hm, I don't fully understand, but is it OK to uniformize this? I now calculate all observables once without any mask, using the same settings, and then apply the fold masks and the tr/val split afterwards. It's passing all the tests and also giving identical results for the main runcard.

  • Created by: goord

> Hm, I don't fully understand, but is it OK to uniformize this? I now calculate all observables once without any mask, using the same settings, and then apply the fold masks and the tr/val split afterwards. It's passing all the tests and also giving identical results for the main runcard.

    You can try the diag-DIS runcard to check the observable rotation: DIS_diagonal_l2reg_example.yml.txt

  • Created by: APJansen

    Seems to work fine, and gives the same results as trvl-mask-layers.

  • Created by: scarlehoff

> I don't fully understand

The chi2 does not (should not) depend on the diagonalization; since the total covmat is only used to report the total chi2, nobody cared about diagonalising it because it was not needed.

> but is it OK to uniformize this?

Yes, because of the above.

  • Created by: APJansen

    Ok perfect, thanks :)

  • Created by: APJansen

@scarlehoff @Radonirinaunimi Positivity is included in the validation model; I remember we discussed this before, and if I remember correctly there was some disagreement on whether this was necessary, is that right? If I remove it, I get an error from this line, which can be fixed by changing fitstate.validation to fitstate._training, after which it runs normally (though I haven't done any comparisons).

Right now I'm thinking that, to remove the repeated calculation of observables, the easiest approach is to combine the training and validation models into one model that computes both of their losses, adding a "_tr" or "_val" postfix and filtering as appropriate when summing to get the final train/val losses (sketched below). The experimental one can stay separate, as the performance loss there is negligible.

    Does that sound ok?

    Of course it would be nicer to instead just have one model and 3 different losses, but that will take longer to implement.
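A sketch of that combined-model idea (hypothetical names; an editorial illustration of the "_tr"/"_val" postfix filtering, not code from the branch):

```python
import tensorflow as tf

def combined_model(pdf_input, observables, tr_losses, val_losses):
    # one model emitting both sets of losses, tagged by postfix
    outputs = {}
    for name, loss in tr_losses.items():
        outputs[name + "_tr"] = loss(observables[name])
    for name, loss in val_losses.items():
        outputs[name + "_val"] = loss(observables[name])
    return tf.keras.Model(pdf_input, outputs)

def total_loss(outputs, postfix):
    # filter on the postfix when summing to the final train/val loss
    return tf.add_n([v for k, v in outputs.items() if k.endswith(postfix)])
```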

  • Created by: scarlehoff

I don't understand what you mean. The "easiest" way looks much more complex to me, since you need to filter things out and any bug there will "break" the validation.

  • Created by: scarlehoff

    Also, I'm not completely sure you can achieve your goal here?

    You need to compute everything twice for every epoch just the same.
