Speed up convolution

Emanuele Roberto Nocera requested to merge faster_pandas_convolution_1 into master

Created by: scarlehoff

This is a very small change which speeds up the hadronic convolution (the more points there are, the bigger the effect).

| | LHCBWZMU7TEV | ATLAS_1JET_8TEV_R04 |
| --- | --- | --- |
| Master | 1.6s | 8.5s |
| First commit of this PR | 1.3s | 6.8s |
| Second commit (timed after a prior call to fill the cache) | 0.53s | 2.46s |

It could be sped up even more, at the cost of obfuscating the code quite a bit, by basically dropping pandas for the low-level part of the computation (see the sketch below).
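To illustrate what "dropping pandas" would mean, here is a toy sketch, not the actual validphys code (the sizes, names and DIS-like structure are made up): the same contraction done through pandas index alignment and through a single np.einsum on the underlying arrays.

    import numpy as np
    import pandas as pd

    ndata, nx, nfl = 100, 40, 14  # made-up sizes
    rng = np.random.default_rng(0)
    fk = rng.random((ndata, nx, nfl))  # toy FK-like tensor
    f = rng.random((nx, nfl))          # toy PDF values on the x grid

    # pandas path: flatten, broadcast over the "x" level, then groupby-sum
    idx = pd.MultiIndex.from_product(
        [range(ndata), range(nx)], names=["data", "x"]
    )
    df = pd.DataFrame(fk.reshape(ndata * nx, nfl), index=idx)
    fgrid = pd.DataFrame(f, index=pd.RangeIndex(nx, name="x"))
    preds_pd = df.mul(fgrid, level="x").sum(axis=1).groupby(level="data").sum()

    # numpy path: a single einsum over the raw arrays
    preds_np = np.einsum("dia,ia->d", fk, f)

    assert np.allclose(preds_pd.to_numpy(), preds_np)

The einsum path skips the index alignment and groupby machinery entirely, which is where the pandas overhead goes.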

In any case, profiling the convolution shows that the single biggest cost is actually read_csv (which takes 0.6s for LHCBWZMU7TEV and 3.5s for ATLAS_1JET_8TEV_R04), so I'm not sure it's worth it until that is solved.
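For reference, a minimal sketch of how to reproduce that profile, reusing the same API objects as the benchmark below:

    import cProfile
    import pstats

    from validphys.api import API
    from validphys.convolution import predictions

    args = {"dataset_input": {"dataset": "LHCBWZMU7TEV"}, "theoryid": 200, "use_cuts": "internal"}
    ds = API.dataset(**args)
    pdf = API.pdf(pdf="NNPDF40_nnlo_as_01180")

    # Profile one convolution and print the heaviest calls by
    # cumulative time (read_csv should show up at the top)
    with cProfile.Profile() as pr:
        predictions(ds, pdf)
    pstats.Stats(pr).sort_stats("cumulative").print_stats(10)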

Now, there might be a reason why lru_cache was not used in load_fktable; that's why I kept it in a separate commit (and to show that the other change already does something on its own).
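The change itself is just the standard functools decorator. A minimal sketch, with a hypothetical signature (the real load_fktable may well differ):

    from functools import lru_cache

    @lru_cache()
    def load_fktable(fkpath):
        # Hypothetical stand-in for the expensive parsing step
        # (the read_csv above); not the actual validphys body.
        print(f"parsing {fkpath}")
        return fkpath

    load_fktable("FK_A.dat")  # parses
    load_fktable("FK_A.dat")  # cache hit: no re-parse

The usual lru_cache caveats apply: the arguments must be hashable and the cached tables stay in memory for the lifetime of the process, which might be the reason it was avoided.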

The code I've used for the benchmark, should anyone want to test it on their own machine:

    from NNPDF import ThPredictions as oldPredictions
    from validphys.api import API
    from validphys.convolution import predictions
    import numpy as np
    from time import time

    args = {"dataset_input": {"dataset": "LHCBWZMU7TEV"}, "theoryid": 200, "use_cuts": "internal"}
    args = {"dataset_input": {"dataset": "ATLAS_1JET_8TEV_R04"}, "theoryid": 200, "use_cuts": "internal"}
    ds = API.dataset(**args)
    pdf = API.pdf(pdf="NNPDF40_nnlo_as_01180")
    l_pdf = pdf.load()
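    # Uncomment to fill the fktable cache before timing; the
    # "second commit" numbers above were measured this way
    # predictions(ds, pdf)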

    start = time()
    preds = predictions(ds, pdf)
    interlude = time()
    l_data = ds.load()
    old_preds = oldPredictions(l_pdf, l_data).get_data()
    final = time()
    print(f"Are results equal?: {np.allclose(preds, old_preds, rtol=1e-3)}")
    print(f"Time for vp: {interlude-start:.4f}")
    print(f"Time for libnnpdf: {final-interlude:.4f}")
