
Seaquest data

Merged Emanuele Roberto Nocera requested to merge seaquest into master

Created by: tgiani

Implementation of SeaQuest data from https://arxiv.org/pdf/2103.04024.pdf. There are a number of things to be checked:

  1. the data are not available from hepdata, so I copied them from the paper (please double check)
  2. the data are converted into data for distributions differential in hadronic rapidity and invariant mass using Eqs.(4.6), (4.7) from https://arxiv.org/pdf/1009.5691.pdf. This is done with a python script saved in the rawdata folder (a sketch of the kinematic map is given after this list)
  3. there is a single source of systematic uncertainty, which is considered fully correlated, as specified in the paper. Also, my understanding is that Eq.(9) of the paper gives the correlation matrix between statistical uncertainties. This is not implemented yet. Should I include it? (I guess so, but I first want to check; for the 2001 data such a covariance matrix is not implemented)
  4. I don't understand the very last paragraph of the paper, in particular Eq.(10). This might be relevant for the computation of theory predictions with apfel
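
For reference on point 2, here is a minimal sketch of the leading-order Drell-Yan kinematic map from the parton momentum fractions (xb, xt) of Table 1 to hadronic rapidity and invariant mass, assuming the standard LO relations x_{b,t} = (M/sqrt(s)) exp(±y). The function name and the value of sqrt(s) are my assumptions, not taken from the actual script:

```python
import numpy as np

# Assumed: sqrt(s) ~ 15.06 GeV for SeaQuest (120 GeV proton beam on a fixed target)
SQRT_S = 15.06

def hadronic_kinematics(xb, xt, sqrt_s=SQRT_S):
    """Map parton momentum fractions (xb, xt) to the hadronic rapidity y
    and dilepton invariant mass M via the LO relations
    x_{b,t} = (M / sqrt(s)) * exp(+-y)."""
    y = 0.5 * np.log(xb / xt)       # rapidity of the dilepton pair
    M = np.sqrt(xb * xt) * sqrt_s   # invariant mass in GeV
    return y, M
```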

Merge request reports

Merged (Apr 3, 2025 5:16pm UTC)


Activity


  • requested review from @enocera

  • Created by: tgiani

    Regarding point 4), I think we need to correct the theory predictions to account for the acceptance. If so, I think the easiest way would be to implement Eq.(10) as an additional K-factor, given in Extended Data Table 3.

  • Created by: wilsonmr

    @tgiani I have separated out the commits for the nuclear/deuteron uncertainties to #1183, at some point it probably makes more sense to use that as a base branch (or just wait until it gets merged and rebase on master..)

  • @tgiani: I went through the paper and here are some suggestions.

    1. the data are not available from hepdata, so I copied them from the paper (please double check)

    It seems to me that you copied the data correctly.

    2. the data are converted into data for distributions differential in hadronic rapidity and invariant mass using Eqs.(4.6), (4.7) from https://arxiv.org/pdf/1009.5691.pdf. This is done with a python script saved in the rawdata folder

    Why do you need a python script? Can't you perform the necessary kinematic transformations in the filter file FTDY.cc? To streamline the implementation, I suggest removing the python script and modifying the filter file instead.

    3. there is a single source of systematic uncertainty, which is considered fully correlated, as specified in the paper. Also, my understanding is that Eq.(9) of the paper gives the correlation matrix between statistical uncertainties. This is not implemented yet. Should I include it? (I guess so, but I first want to check; for the 2001 data such a covariance matrix is not implemented)

    Your understanding matches mine. Please proceed and implement the covariance matrix for statistical uncertainties. There's actually a small discrepancy between the diagonal values of the matrix in Eq.(9) and the square of the statistical uncertainties reported in Table 1, but I guess that we can live with it and use the matrix in Eq.(9) tout court.
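
    A minimal sketch of the "tout court" option, assuming the matrix of Eq.(9) is read in as a 6x6 array `eq9_matrix` interpreted as the statistical covariance matrix, and `table1_stat` holds the Table 1 statistical uncertainties (both names hypothetical; whether Eq.(9) is a covariance or a correlation matrix is exactly the ambiguity discussed above):

    ```python
    import numpy as np

    def stat_covariance(eq9_matrix, table1_stat, rtol=0.05):
        """Take the matrix of Eq.(9) tout court as the statistical covariance,
        flagging the small diagonal discrepancy with respect to Table 1."""
        cov = np.asarray(eq9_matrix, dtype=float)
        stat = np.asarray(table1_stat, dtype=float)
        if not np.allclose(np.sqrt(np.diag(cov)), stat, rtol=rtol):
            # Expected: the paper's diagonal does not exactly match stat**2
            print("note: Eq.(9) diagonal differs from Table 1 stat errors")
        return cov
    ```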

    4. I don't understand the very last paragraph of the paper, in particular Eq.(10). This might be relevant for the computation of theory predictions with apfel

    The thing that I don't understand is how Wu-Ki Tung, who passed away in 2009, was able to provide them with the code for the NLO computation (as they explicitly state), unless he came back from the afterlife. Apart from that, I'd say that the problem concerns how to compute the theoretical predictions, not how to correct the data. I think that I can take care of that, either by implementing Eq.(10) explicitly or by interpolating the acceptance correction.

  • Created by: tgiani

    @wilsonmr ok, thanks!

  • Created by: tgiani

    @enocera thanks. Sure, I can perform the kinematic transformation directly in the filter if you think it's better

  • @enocera thanks. Sure, I can perform the kinematic transformation directly in the filter if you think it's better

    @tgiani: I think it is, if you don't mind. Thanks.

  • Created by: tgiani

    @enocera I was multiplying the data by a jacobian to convert them into data for distributions differential in rapidity; however, this is wrong because these are ratios, so the jacobian cancels between numerator and denominator. So there are no kinematic transformations to be done: the data are those in the table of the paper, and the only additional thing to be computed is the hadronic rapidity. I've removed the python script and added the covariance matrix for statistical uncertainties as well.

  • Emanuele Roberto Nocera added 1 deleted label

  • requested review from @enocera

  • Created by: Zaharid

    Seems this has grown a fair bit. What are all these new operators for? Could you give a high-level summary of all the files changed?

    @Zaharid cc @tgiani The story is as follows. SeaQuest delivers a measurement of sigma(pd)/2sigma(pp) for six points as a function of xb, xt and M (all averaged), see Table 1 in 2103.04024. So far so good: this is the same observable measured by NuSea (E866), already implemented in NNPDF3.1.

    Unfortunately, as they say in the paper, the data in Table 1 are not corrected for detector acceptance effects. This implies, as they say towards the bottom of the second column on page 7, that the theoretical prediction must be constructed according to Eq.(10): each measured bin is divided into 10 sub-bins, and each of these sub-bins is computed with different kinematics and receives an acceptance correction (see Extended Data Table 3).

    After a bit of thinking, I concluded that the way to implement this kind of information is to compute 20 FK tables, 10 for the numerator of the observable and 10 for the denominator. Each of these FK tables corresponds to one of the ten columns (with six rows) of Extended Data Table 3. The observable then becomes a COMPOUND observable, provided appropriate operators are defined.

    I've defined two operators. a) COM: this takes the 20 FK tables and computes Eq.(10). Note that this allowed me to implement the acceptance factors by means of ACC K-factors (rather than hard-coding them in apfel), where each number multiplies a bin in each of the 20 FK tables. The definition of the operator has been propagated to the documentation. b) SMT: this takes only 10 of the FK tables (either for the numerator or for the denominator) and computes their sum. This operator is required to estimate the nuclear uncertainty due to the deuteron in the numerator, as it allows one to compute the numerator with either a proton-proton or a proton-deuteron PDF. A sketch of both is given below.
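
    For concreteness, a numpy sketch of what the two operators compute. It assumes `num` and `den` are arrays of shape (10, 6) holding the predictions of the ten numerator and ten denominator FK tables, already multiplied by the ACC K-factors, with one row per sub-bin and one column per measured bin. Names, shapes and the handling of the factor 2 are my assumptions, not the actual operator definitions in the code:

    ```python
    import numpy as np

    def COM(num, den):
        """Compound observable in the spirit of Eq.(10): per measured bin,
        sum the acceptance-corrected sub-bin predictions for the numerator
        and denominator separately, then take the ratio."""
        return np.asarray(num).sum(axis=0) / np.asarray(den).sum(axis=0)

    def SMT(tables):
        """Sum the ten FK tables of one side of the ratio; used to recompute
        the numerator with a proton-proton or proton-deuteron PDF when
        estimating the deuteron nuclear uncertainty."""
        return np.asarray(tables).sum(axis=0)
    ```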

    As far as I can tell, the implementation is meaningful; see here, for example, two data/theory comparisons.

    Finally: the measurement requires the addition of a nuclear correction for the deuteron in the numerator. This in turn requires iterating the fit to the deuteron data with the inclusion of SeaQuest, something that I hadn't considered in the first instance. Incidentally, the deuteron corrections for all of the other data sets have to be updated accordingly. This has now been done. Needless to say, with hindsight, I would have strongly voiced against the decision to include SeaQuest in NNPDF4.0, given the non-negligible (and mostly unnoticed/unappreciated) work that its implementation required.

  • Ah, and then there's a (genuine, I guess) build failure that needs to be fixed.

  • Emanuele Roberto Nocera changed title from [WIP] Seaquest data to Seaquest data

  • As far as I'm concerned, this PR is good to go, perhaps pending successful tests and rebase on master (with updated runcards). Here's a data/theory comparison when the extra nuclear uncertainty is included (via deweighting): https://vp.nnpdf.science/RGfy_dVZQa6xYT72MFW_7w==

  • Emanuele Roberto Nocera approved this merge request

  • Emanuele Roberto Nocera removed review request for @enocera

  • Created by: Zaharid

    Are the various plotting files for all the bins and AUXILIARY experiments for debugging purposes?

  • No, they aren't. The PLOTTING files are there just to prevent the builds from failing. The corresponding DATA and SYSTYPE files, instead, are there because they are needed for the generation of the FK tables.
