DEPRECATED: General theory covmat

Closed Emanuele Roberto Nocera requested to merge general_theory_covmat into master

Created by: RosalynLP

Replaces #646. Aim is to allow arbitrary general theory covmats to be loaded from file alongside the scale variation covmat.

Closed by avatar (Apr 8, 2025 7:19am UTC)

Activity

  • Created by: RosalynLP

    The main points to work on are:

    • Get rid of user-specific location for stored theory covmats
    • Allow choice of theory covmat type for scalevar cov (e.g. block diag or diag as well as full) - although this does seem a bit redundant at this stage and I'd be happy to get rid of this functionality
    • Allow for the scenario where not all covmats are present
    • Introduce another function for user_covmat to be loaded from file, so the user can specify any covmat outwith the choices of scalevar, top, higher twist or some combination.
    • Deal with the changing name of the dumped file which depends on dataspecs
  • requested review from @enocera

  • Created by: RosalynLP

    @Zaharid I think the main thing remaining here is to get rid of the user specific location for the stored theory covmats, i.e. move them to the server. Could you help point me towards an example of how to do this please?

  • Created by: RosalynLP

    @Zaharid, as I mentioned at the code call, I am having problems using the loader to load the theory covmat.

    I uploaded a covariance matrix for top mass uncertainty using validphys, found at https://vp.nnpdf.science/IeGM9CY8RxGcb5r6bIEYlQ==/topthcovmat.csv.

    I am trying to load this using the loader in theorycovariance.construction.py line 456, but get the error

      File "/home/s1303034/nnpdf/validphys2/src/validphys/loader.py", line 520, in check_vp_output_file
        raise LoadFailedError(f"Could not find '{filename}'") from e
    validphys.loader.LoadFailedError: Could not find 'IeGM9CY8RxGcb5r6bIEYlQ==/topthcovmat.csv'
    

    and the same if I use the full address with https://vp.nnpdf.science in front.

    Is this because this isn't strictly an output file?

  • Created by: wilsonmr

    @RosalynLP I can download it fine using the FallbackLoader. Are you just using Loader? Note that Loader only checks for the file locally.

    Probably the best practice would be to create a production rule which uses the loader of the config class, because that will check both locally and remotely.

  • Created by: wilsonmr

    EDIT: ignore the fact that I don't load it correctly, I was just checking the file was a csv

    >>> l = FallbackLoader()
    >>> l.check_vp_output_file("IeGM9CY8RxGcb5r6bIEYlQ==/topthcovmat.csv")
    PosixPath('/Users/michael/conda/envs/nnpdf/share/NNPDF/vp-cache/IeGM9CY8RxGcb5r6bIEYlQ==/topthcovmat.csv')
    >>> import pandas as pd
    >>> pd.read_csv('/Users/michael/conda/envs/nnpdf/share/NNPDF/vp-cache/IeGM9CY8RxGcb5r6bIEYlQ==/topthcovmat.csv')
        experiment  ...                BIGEXP.25
    0      dataset  ...  CMSTOPDIFF8TEVTTRAPNORM
    1           id  ...                        9
    2   experiment  ...                      NaN
    3       BIGEXP  ...                0.0715461
    4       BIGEXP  ...                 0.103566
    5       BIGEXP  ...                 0.361396
    6       BIGEXP  ...                0.0715461
    7       BIGEXP  ...                 0.103566
    8       BIGEXP  ...                 0.361396
    9       BIGEXP  ...              4.76801e-05
    10      BIGEXP  ...              0.000110914
    11      BIGEXP  ...              0.000155474
    12      BIGEXP  ...              0.000189919
    13      BIGEXP  ...              0.000208396
    14      BIGEXP  ...              0.000212287
    15      BIGEXP  ...              0.000189961
    16      BIGEXP  ...              0.000155095
    17      BIGEXP  ...              0.000110225
    18      BIGEXP  ...              4.80463e-05
    19      BIGEXP  ...              3.63837e-05
    20      BIGEXP  ...               0.00014501
    21      BIGEXP  ...              0.000193685
    22      BIGEXP  ...              0.000241234
    23      BIGEXP  ...              0.000262788
    24      BIGEXP  ...              0.000263333
    25      BIGEXP  ...              0.000240536
    26      BIGEXP  ...              0.000196785
    27      BIGEXP  ...              0.000142848
    28      BIGEXP  ...              3.64091e-05
    
    [29 rows x 29 columns]
    >>> 
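
For reference, a minimal sketch of reading a table of this shape with its labels intact. It assumes, judging only from the printed output above, that the CSV carries three header rows and three index columns (experiment, dataset, id); the toy labels and values are made up, and the table loading function actually used in validphys may work differently.

```python
import io

import numpy as np
import pandas as pd

# Toy 2x2 "covmat" labelled by (experiment, dataset, id) on both axes.
# The labels and numbers are illustrative only.
labels = pd.MultiIndex.from_tuples(
    [("BIGEXP", "DATASET_A", 0), ("BIGEXP", "DATASET_A", 1)],
    names=["experiment", "dataset", "id"],
)
original = pd.DataFrame(
    [[0.25, 0.5], [0.5, 1.0]], index=labels, columns=labels
)

# Write the table to an in-memory CSV, as a stand-in for the uploaded file.
buffer = io.StringIO()
original.to_csv(buffer)
buffer.seek(0)

# Reading with three header rows and three index columns restores the
# labelled square matrix instead of treating the labels as data rows.
covmat = pd.read_csv(buffer, header=[0, 1, 2], index_col=[0, 1, 2])
print(covmat.shape)
```

Read this way, the label rows and columns no longer count towards the shape, so only the numerical block remains as data.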
    
  • Created by: RosalynLP

    Yes, I was just using Loader... but with FallbackLoader you need to specify some local PosixPath, right? That is not something we want in the main code.

  • Created by: wilsonmr

    Urm, I'm not sure exactly what you mean there, but the point is that if I were to use e.g. check_fit, the loader would return a FitSpec because it's a well-defined object. check_vp_output_file instead returns the path to the vp-output file you requested, which is really the best it can do because this could be a table, a plot or something else. This isn't the same as a hardcoded local path, and the FallbackLoader also downloads the actual file to some sensible place, so schematically you could have:

    from validphys.loader import FallbackLoader
    l = FallbackLoader()
    path_to_covmat = l.check_vp_output_file("IeGM9CY8RxGcb5r6bIEYlQ==/topthcovmat.csv")
    covmat = some_special_tableloader_function_that_gets_index_correct(path_to_covmat)

    Where you wrap that up is kind of up to you. At the moment validphys.config doesn't import validphys.tableloader, and neither does validphys.loader.

    If I look back at the alpha_s stuff in validphys.paramfits, this functionality appears to be wrapped up in the ConfigParser, so perhaps that would be fine?

    Maybe make the tableloader import inside the production rule which checks the vp_output and loads the table, since it seems quite orthogonal to the rest of the module.
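
The difference between a local-only loader and one that falls back to a remote source can be sketched as follows. This is not the validphys implementation: `LocalOnlyLoader`, `FallbackToRemoteLoader`, `check_output_file` and the `fetch` callable are hypothetical names used only to illustrate the "check locally, otherwise download into a cache and return the cached path" behaviour described above.

```python
import tempfile
from pathlib import Path


class LoadFailedError(Exception):
    """Stand-in error for a resource that cannot be found."""


class LocalOnlyLoader:
    """Hypothetical loader that only looks inside a local cache directory."""

    def __init__(self, cache_dir):
        self.cache_dir = Path(cache_dir)

    def check_output_file(self, filename):
        path = self.cache_dir / filename
        if not path.exists():
            raise LoadFailedError(f"Could not find '{filename}'")
        return path


class FallbackToRemoteLoader(LocalOnlyLoader):
    """Hypothetical loader that downloads into the cache on a local miss."""

    def __init__(self, cache_dir, fetch):
        super().__init__(cache_dir)
        # fetch is any callable mapping a filename to the file's bytes,
        # e.g. a function performing an authenticated HTTP request.
        self.fetch = fetch

    def check_output_file(self, filename):
        try:
            return super().check_output_file(filename)
        except LoadFailedError:
            data = self.fetch(filename)
            path = self.cache_dir / filename
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_bytes(data)
            return path


# Usage with a fake "server" so the sketch runs offline.
cache = tempfile.mkdtemp()
loader = FallbackToRemoteLoader(cache, fetch=lambda name: b"a,b\n1,2\n")
print(loader.check_output_file("some-upload-id/table.csv"))
```

In both cases the caller only ever receives a path inside the cache directory, so no user-specific location needs to appear in the calling code.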

  • Created by: RosalynLP

    OK, thank you so much. I don't know why I was confused; I think this makes sense now! I already have a table loading function which takes the file and loads it how I want, so that is all fine. I think I was just unsure of the distinction between Loader and FallbackLoader.

    It turns out one reason I was having problems was that my .netrc file was out of date, so the loader could not find the file because I didn't have access to the server.

  • Emanuele Roberto Nocera changed title from [WIP]: General theory covmat to General theory covmat

  • Created by: RosalynLP

    Although the covmats can now be loaded from the server they are set in stone as being from the output of a specific vp upload. For a generic "user covmat" the user might want to make an arbitrary theory covmat and include it in a fit. They would then need to upload their own covmat to the server and alter the vp path in the code in order to use it, which is not ideal. Is there any way in which we could keep a permanent link to a covmat folder which could then be added to at a later date, so that people could add their own covmats? Then maybe they could pass in the filename in the runcard and their chosen covmat name could be loaded from this folder?

  • Created by: RosalynLP

    To do:

    • Default locations for nuclear and deuteron covmats
    • Documentation
    • Warning if you specify use_user_uncertainties but not user_covmat_path and vice versa
    • Tidy up PR and get rid of JLAB stuff etc.
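
The warning item in the list above could be sketched as a small consistency check. The keys `use_user_uncertainties` and `user_covmat_path` come from the to-do list; the function name, its return value and the warning text are assumptions made for illustration, not the code that was eventually merged.

```python
import warnings


def check_user_covmat_settings(use_user_uncertainties=False, user_covmat_path=None):
    """Warn when only one of the two user covmat settings is given.

    Returns True when the settings are consistent, False otherwise.
    Hypothetical helper: name and behaviour are illustrative only.
    """
    if use_user_uncertainties and user_covmat_path is None:
        warnings.warn(
            "use_user_uncertainties is set but no user_covmat_path was given."
        )
        return False
    if user_covmat_path is not None and not use_user_uncertainties:
        warnings.warn(
            "user_covmat_path is set but use_user_uncertainties is not enabled."
        )
        return False
    return True


# Consistent settings pass silently; a mismatch triggers a warning.
check_user_covmat_settings(True, "some-upload-id/covmat.csv")
check_user_covmat_settings(True, None)
```

Whether a mismatch should only warn or should stop the run is a design choice; a warning matches the wording of the to-do item.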
  • mentioned in merge request !991 (merged)

  • Emanuele Roberto Nocera changed title from General theory covmat to DEPRECATED: General theory covmat
