Covariance matrix dependency has become circular
Created by: wilsonmr
I want to note more thoroughly the cause behind #437 (closed)
When one runs a fit they call theory_covmat_custom
which eventually depends on each_dataset_results_bytheory
, since we are using the results
tuple here then (thanks to my rushed PR #436) we would have to specify the new flag use_thcovmat_if_present: False
which is kind of annoying. In the short term one can of course set the default for this to be False however I could see this later causing potential issues with reports.
In #427 I started to separate out the covariance matrix from the data and the theory. But in the most minor way. I think as we have spoken about many times we should look to have the covmat as an entirely different object. Providers should be able to take theory
, results
and covmat
as seperate items and combine as appropriate. This would then, for example, mean that theory_covmat_custom
would only depend on theory
and results
and so wouldn't require one to specify anything wrt the covmat. Obviously this would also break pretty much everything which currently uses the results tuple.
I was thinking that also the theorycovmat_config
in the fit runcards should probably also be usable in a report setting, and that use_thcovmat_if_present
should become some kind of option within that which is similar to use_cuts: fromfit
. This could then offer the flexibility to calculate the covmat from scratch, which would be useful if one wanted to look at experiments not in the fit, with the theory covmat. It also I believe would make the entire theory covmat business in fits and reports a bit more unified and less confusing
I'm not quite sure how to proceed, I think extremely short term we should set use_thcovmat_if_present: False
as default in config.py
, I should document this somewhere as per @Zaharid's comment in #437 (closed). I think that the suggestion here and #427 are very closely related, however fixing every single providers which currently uses results
or some collect
ed version of it seems like a monstrous task: in #427 I only attempted to fix providers used in the report.
I guess we also want #427 out sooner rather than later since it is really about the formatting of the report and this issue concerns something a bit deeper about how validphys interacts with the covariance matrix
My suggestion is: Get #427 merged soon with the flag changed to be False
and some better documentation so that running a fit with the current set of runcards works and the rvp-comparefits
reports work (since the user specifies with vp-comparefit
to use_thcovmat_if_present
anyway and I think most people are using this interface to produce th covmat reports).
Have another PR which expands upon the seperation started by #427 of the data, theory and covmat and also unifies the theorycovmat_config
to be the flag which works with reports and fit runcards (and document this change)