Design improved commondata format
Created by: Zaharid
There has been some renewed talk of improving the data implementation technology. One aspect of this is the commondata format. Here are some more or less agreed-upon desiderata that have been discussed in the past.
- Metadata all in one place (i.e. the PLOTTING file should become the header containing all relevant information that isn't the data itself). SYSTYPEs should probably go here as well. The metadata should be kept separate from the data itself. There is some prior art here, https://github.com/NNPDF/buildmaster/pull/101, which never got merged.
- References in the metadata that are used by vp to produce tables (including in LaTeX).
- Support for N kinematics (this will require some changes in vp and probably retiring the cpp codepath).
- Bins as opposed to central values. #1006 (closed)
- Support variants, e.g. for bugfixes #494 (closed). This ties in with the defaults, cc @siranipour.
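
To make the list above a bit more concrete, here is a minimal sketch of what a single metadata header could look like once loaded into Python. Everything in it is hypothetical: the class names `Bin` and `Metadata`, the field names, the `with_variant` helper and the example values are made up for illustration and do not correspond to any existing vp or buildmaster code.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class Bin:
    """One kinematic bin, stored as edges rather than only a central value."""
    low: float
    high: float

    @property
    def mid(self) -> float:
        # The central value can always be derived from the edges.
        return 0.5 * (self.low + self.high)


@dataclass(frozen=True)
class Metadata:
    """Hypothetical header holding everything that is not the data itself."""
    name: str
    # References that could be used to produce tables.
    references: Tuple[str, ...]
    # Any number of kinematic variables, each with one Bin per data point.
    kinematics: Dict[str, Tuple[Bin, ...]]
    # Systematics types, which currently live in separate SYSTYPE files.
    systypes: Tuple[str, ...] = ()
    # Named variants: each maps a field name to the value that overrides it.
    variants: Dict[str, Dict[str, object]] = field(default_factory=dict)

    def with_variant(self, variant: str) -> "Metadata":
        """Return a copy of the header with the overrides of one variant applied."""
        return replace(self, **self.variants[variant])


# Placeholder values only; this is not a real dataset.
meta = Metadata(
    name="EXAMPLE_DATASET",
    references=("placeholder-reference",),
    kinematics={
        "x": (Bin(0.01, 0.02), Bin(0.02, 0.05)),
        "Q2": (Bin(10.0, 20.0), Bin(20.0, 40.0)),
        "y": (Bin(0.0, 1.0), Bin(1.0, 2.0)),
    },
    systypes=("ADD",),
    variants={"bugfix": {"systypes": ("MULT",)}},
)

fixed = meta.with_variant("bugfix")
```

Keeping variants as plain overrides of named fields means the default header stays untouched and a bugfix is just a small, explicit diff on top of it.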
On top of that, there are concerns about duplication of information between commondata and the theory predictions toolchain. To me the logical conclusion of that train of thought is that if we don't want duplication (e.g. with regard to the binning), then the commondata format (metadata) should contain all the information required to make a theory prediction, except for the theory parameters such as alpha_s and the PDF. The idea would be that it could then be mechanically converted into the partial input for some Monte Carlo. We should keep in mind that a Monte Carlo run is more or less in one-to-one correspondence with an fktable, but an experimental measurement could involve a bunch of these, reduced with some operation such as a ratio.
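
A rough sketch of that idea, assuming the metadata has already been loaded as a plain dictionary. The keys (`processes`, `bins`, `operation`), the operation names and the functions `partial_runcards` and `combine` are all invented here for illustration; a real Monte Carlo input would need far more than this.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical reduction operations over per-fktable predictions.
OPERATIONS: Dict[str, Callable[..., List[float]]] = {
    "NULL": lambda pred: list(pred),
    "RATIO": lambda num, den: [n / d for n, d in zip(num, den)],
}


def partial_runcards(metadata: dict) -> List[dict]:
    """Build one partial Monte Carlo input per fktable.

    Only the process and the binning come from the metadata; theory
    parameters such as alpha_s and the PDF are deliberately left out.
    """
    return [
        {"process": process, "bins": metadata["bins"]}
        for process in metadata["processes"]
    ]


def combine(metadata: dict, predictions: Sequence[Sequence[float]]) -> List[float]:
    """Reduce the per-fktable predictions to the measured observable."""
    return OPERATIONS[metadata["operation"]](*predictions)


# Placeholder measurement defined as the ratio of two processes.
metadata = {
    "processes": ["numerator_process", "denominator_process"],
    "bins": [[0.0, 1.0], [1.0, 2.0]],
    "operation": "RATIO",
}

cards = partial_runcards(metadata)
observable = combine(metadata, [[2.0, 9.0], [4.0, 3.0]])
```

The point is only that the binning is written once, in the metadata, and both the per-fktable inputs and the final reduction are derived from it.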
I also don't really know what other things would be required. This would certainly eat into other creative formats that have accumulated, such as the COMPOUND files, which in this picture would also be absorbed into the metadata, probably together with much of what is in the current "runcards" repository. I don't know if this is a good idea, but I certainly see a number of advantages.