Add a library of process dependent options (!1949) · Merge requests · Emanuele Roberto Nocera / nnpdf

Emanuele Roberto Nocera requested to merge add_a_library_of_process_options into final_reader_for_new_commondata_mk2 Feb 19, 2024

Created by: scarlehoff

Right now, how validphys understands the data depends on how the process is defined.

This is done in various different places:

For cuts in filters.py: https://github.com/NNPDF/nnpdf/blob/f9e8c4b561a7f2371a4df7812d32820a1141671f/validphys2/src/validphys/filters.py#L19

For labels in the (old) parser: https://github.com/NNPDF/nnpdf/blob/f9e8c4b561a7f2371a4df7812d32820a1141671f/validphys2/src/validphys/commondataparser.py#L17

How to create the xq2map depends on the kinematic transformation: https://github.com/NNPDF/nnpdf/blob/f9e8c4b561a7f2371a4df7812d32820a1141671f/validphys2/src/validphys/plotoptions/kintransforms.py#L72

There is also, somewhere, a list of process description labels. Then there is DIS, which can be DIS_NC, DIS_CC or DIS_ALL; sometimes these are considered equal and sometimes they are not.

That was the result of things being done over several iterations, under pressure to produce results, so we never had the time to sit down and put everything together in a sensible manner.

Now, with the new commondata format, a lot of stuff is still based on the old ways (kinematics and results transformations scattered around the code), but we have an opportunity to start doing things well instead of adding yet another point of chaos. It is not only an opportunity but a necessity, since right now the xQ2 plot is broken for ttbar and there is no process type for DIS+J #1825

Note that some of the previous ones are redundant in the new format: there is no need for results transformations or custom labels, because the person implementing the data can decide which kinematic variables to implement, plot and cut upon. The only important thing is that the variables can be understood for the given process.

My proposal here is the following: create a single module, process_options.py, which will collect all process-dependent options. Newly implemented data will use it. The only public interface of this module is the Processes enum. When a dataset is loaded, the process type will be read and its kinematic variables checked against the accepted variables for that process. If they match, all is ok. If they don't, the person implementing the dataset will need to either teach the process to understand the new variables or choose a different set of variables.
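To make the idea concrete, here is a minimal sketch of what such a check could look like. It is not the code in this MR: the helper names (_Process, check_dataset), the variable names and the accepted-variable sets are all hypothetical; only the Processes enum name comes from the proposal above.

```python
import dataclasses
import enum


@dataclasses.dataclass(frozen=True)
class _Process:
    """Options attached to one process type (hypothetical layout)."""

    description: str
    accepted_variables: frozenset

    def check_variables(self, variables):
        """Raise ValueError if any variable is not understood by this process."""
        unknown = set(variables) - self.accepted_variables
        if unknown:
            raise ValueError(
                f"Variables {sorted(unknown)} are not understood by '{self.description}'; "
                f"accepted variables: {sorted(self.accepted_variables)}"
            )


class Processes(enum.Enum):
    """The only public interface: one member per process type.

    The variable sets below are illustrative assumptions, not the real lists.
    """

    DIS = _Process("Deep inelastic scattering", frozenset({"x", "Q2", "y"}))
    JET = _Process("Single-inclusive jets", frozenset({"y", "pT", "sqrts"}))


def check_dataset(process_type, kinematic_variables):
    """Look up the process by name and validate the dataset's variables.

    Returns the matching Processes member when everything is understood.
    """
    try:
        process = Processes[process_type]
    except KeyError:
        raise ValueError(f"Unknown process type: {process_type!r}") from None
    process.value.check_variables(kinematic_variables)
    return process
```

With this layout, check_dataset("DIS", ["x", "Q2"]) returns Processes.DIS, while check_dataset("DIS", ["pT"]) raises a ValueError telling the implementer which variables the process accepts, so the failure happens when the dataset is loaded rather than later when plotting or cutting.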

I've put a DIS example which covers a few of the situations that we would find.

I haven't put this directly on the reader because this is just a proposal I came up with.

The only other option I can think of is that we embrace the chaos, but that would require either restricting new data to the same set of variables that the old data used or including the variables needed for the x-Q2 mapping in the dataset.
