Multi dense logistics
Created by: APJansen
The aim of this PR is to make all the remaining changes necessary to enable the implementation of the MultiDense layers, without changing the numerics at all.
This boils down to stacking the different replicas as soon as possible (i.e. here), and making any changes that result from that.
TODO:
- photons: see below for a detailed discussion.
- Fix loading of weights in this block. I'm not sure how that's working at the moment, since all replicas read from the same file. Is this something that is actually used? @scarlehoff
- verify that hyperopt is working
- figure out what to do with the line n3pdfs.append(N3PDF(pdf_models, name=f"fold_{k}")) (https://github.com/NNPDF/nnpdf/blob/0c0ee66eab9f64e081e4747ce2d25abc018e59a6/n3fit/src/n3fit/model_trainer.py#L952), see below for a detailed discussion.
Comments
The biggest changes are the addition of 3 methods to MetaModel: get_replica_weights, set_replica_weights and split_replicas. For now these rely on the different replicas being different models. Once the MultiDense layer is implemented that won't be the case, and the code in these functions will need to change to extract the right entry in the replica axis of all weights, but the changes required should be limited to these functions.
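To illustrate what these three methods are meant to do once weights carry a replica axis, here is a minimal toy sketch in plain Python. It is not the n3fit implementation: the class ToyReplicaModel, the layer names, the nested-list weights and the method signatures are all assumptions made for illustration, and the real MetaModel works on Keras weights instead.

```python
from copy import deepcopy


class ToyReplicaModel:
    """Hypothetical stand-in for a multi-replica model (NOT n3fit's MetaModel)."""

    def __init__(self, stacked_weights):
        # stacked_weights maps a layer name to a list with one entry per replica,
        # so the outermost index plays the role of the replica axis.
        self.stacked_weights = stacked_weights

    @property
    def num_replicas(self):
        # All layers are assumed to hold the same number of replicas.
        return len(next(iter(self.stacked_weights.values())))

    def get_replica_weights(self, i_replica):
        """Return a copy of the weights of one replica, keyed by layer name."""
        return {
            name: deepcopy(weights[i_replica])
            for name, weights in self.stacked_weights.items()
        }

    def set_replica_weights(self, weights, i_replica=0):
        """Overwrite the weights of one replica, leaving the others untouched."""
        for name, value in weights.items():
            self.stacked_weights[name][i_replica] = deepcopy(value)

    def split_replicas(self):
        """Return one independent single-replica model per replica."""
        return [
            ToyReplicaModel(
                {name: [w] for name, w in self.get_replica_weights(i).items()}
            )
            for i in range(self.num_replicas)
        ]


# Two replicas, two layers (made-up numbers).
model = ToyReplicaModel(
    {
        "dense_0": [[0.1, 0.2], [0.3, 0.4]],
        "dense_1": [[1.0], [2.0]],
    }
)
singles = model.split_replicas()
```

In this sketch each model returned by split_replicas holds a copy of the weights, so later changes to the stacked model do not leak into the single replica models. Whether the real implementation should copy or share weights is a separate design choice.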
The reason for split_replicas is that keeping everything as a single model would require a lot more changes, also in validphys. Since the difference in performance is negligible, I thought it was cleanest to split into separate replicas after training.
Status
Work in Progress.
Currently I'm a bit stuck on this. It runs (at least if I avoid the points above by just running the basic runcard), and during training the results are identical to master. The weights of the single replica models are also identical to those of the original model, but the best chi2 values reported per replica after training are very different.
Later
Once the above is done, the joining of replicas can be pushed even further back before actually including the MultiDense layers, namely directly after the NN layers. Whether I'll do it in this PR or the final one will depend on if it changes the numerics or not (I think it may).