Emanuele Roberto Nocera requested to merge multi-dense-logistics into master Oct 16, 2023

Created by: APJansen

The aim of this PR is to make all the remaining changes necessary to enable the implementation of the MultiDense layers, without changing the numerics at all. This boils down to stacking the different replicas as soon as possible (i.e. here), and making any changes that result from that.
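As a rough illustration of why stacking replicas early need not change the numerics, the sketch below compares applying one dense kernel per replica and stacking the outputs afterwards against contracting a single kernel that carries a replica axis. It uses plain NumPy rather than the actual n3fit layers; the shapes and the names `kernels`, `stacked_late` and `stacked_early` are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_replicas, n_x, n_in, n_out = 3, 5, 2, 4

# Shared input grid and one (hypothetical) kernel per replica.
x = rng.normal(size=(n_x, n_in))
kernels = [rng.normal(size=(n_in, n_out)) for _ in range(n_replicas)]

# Late stacking: evaluate each replica separately, then stack the outputs.
stacked_late = np.stack([x @ k for k in kernels], axis=0)

# Early stacking: stack the kernels first, then do one contraction
# over the input axis while keeping the replica axis.
stacked_kernels = np.stack(kernels, axis=0)  # (replica, in, out)
stacked_early = np.einsum("xi,rio->rxo", x, stacked_kernels)

# Both routes give the same result up to floating-point rounding.
print(stacked_early.shape, np.allclose(stacked_early, stacked_late))
```

The two results agree only up to floating-point rounding, since the order of operations differs, so a bitwise comparison could still fail even when the maths is the same.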

TODO:

Photons: see below for a detailed discussion.
Fix loading of weights in this block. I'm not sure how that's working at the moment, since all replicas read from the same file. Is this something that is actually used? @scarlehoff ?
Verify that hyperopt is working.
Figure out what to do with the line n3pdfs.append(N3PDF(pdf_models, name=f"fold_{k}")) [here] (https://github.com/NNPDF/nnpdf/blob/0c0ee66eab9f64e081e4747ce2d25abc018e59a6/n3fit/src/n3fit/model_trainer.py#L952); see below for a detailed discussion.

Comments

The biggest changes are the addition of 3 methods to MetaModel: get_replica_weights, set_replica_weights and split_replicas. For now these rely on the different replicas being different models. Once the MultiDense layer is implemented that won't be the case, and the code in these functions will need to change to extract the right entry in the replica axis of all weights, but the changes required should be limited to these functions.
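To make the last point concrete, here is a minimal sketch of what "extracting the right entry in the replica axis" could look like once all weights carry a leading replica axis. It is not the MetaModel implementation: the functions are standalone, the weights are modelled as a plain dictionary of NumPy arrays, and the weight names `kernel` and `bias` are hypothetical.

```python
import numpy as np

def get_replica_weights(stacked_weights, i_replica):
    """Return a copy of the weights of one replica (leading axis = replica)."""
    return {name: w[i_replica].copy() for name, w in stacked_weights.items()}

def set_replica_weights(stacked_weights, weights, i_replica):
    """Overwrite, in place, the given weights of one replica."""
    for name, w in weights.items():
        stacked_weights[name][i_replica] = w

def split_replicas(stacked_weights):
    """Split stacked weights into a list of per-replica weight dictionaries."""
    n_replicas = next(iter(stacked_weights.values())).shape[0]
    return [get_replica_weights(stacked_weights, i) for i in range(n_replicas)]

# Hypothetical stacked weights for 3 replicas of a 2 -> 4 dense layer.
stacked = {
    "kernel": np.arange(24, dtype=float).reshape(3, 2, 4),
    "bias": np.zeros((3, 4)),
}

replica_1 = get_replica_weights(stacked, 1)
set_replica_weights(stacked, {"bias": np.ones(4)}, 2)
per_replica = split_replicas(stacked)
```

Keeping the indexing inside these three functions is what would let the rest of the code stay unaware of whether replicas are separate models or slices of one stacked model.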

The reason for split_replicas is that keeping everything as a single model would require many more changes, including in validphys. Since the performance difference is negligible, I thought it was cleanest to split into separate replicas after training.

Status

Work in Progress.

Currently I'm a bit stuck on this. It runs (at least if I avoid the points above by just running the basic runcard), and during training the results are identical to master. The weights of the single-replica models are also identical to those of the original model, but the best chi2 values reported per replica after training are very different.

Later

Once the above is done, the joining of replicas can be pushed even further back, to directly after the NN layers, before actually including the MultiDense layers. Whether I'll do it in this PR or the final one will depend on whether it changes the numerics (I think it may).

Multi dense logistics
