In the future we want to be able to incorporate general theory covmats into fits, i.e. contributions from nuclear uncertainties, higher twist uncertainties etc.
Scale variation theory covmats are computed at the vp-setupfit level via the production rule produce_nnfit_theory_covmat in config.py. We want to be able to add arbitrary covmats from file, in other words ones which are not computed inside validphys each time.
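For concreteness, a minimal sketch of how that could look, assuming everything is kept as pandas DataFrames over a common (dataset, point) index; the function name and file layout here are hypothetical, not existing validphys code:

```python
import pandas as pd

def total_theory_covmat(scalevar_covmat, extra_covmat_paths):
    """Hypothetical: sum the scale-variation covmat produced at vp-setupfit
    with extra theory covmats (nuclear, higher twist, ...) read from file.
    Assumes all tables share the same (dataset, point) MultiIndex on both axes."""
    total = scalevar_covmat.copy()
    for path in extra_covmat_paths:
        extra = pd.read_csv(path, index_col=[0, 1], header=[0, 1])
        # align to the scale-variation covmat before summing; missing blocks count as zero
        total += extra.reindex(index=total.index, columns=total.columns).fillna(0)
    return total
```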
It would be simplest to add per-dataset contributions (i.e. block diagonal by dataset) by loading in a covmat for each dataset. However, the correct cuts would need to be applied - presumably this can be done similarly to how experiment covmats are loaded in.
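A minimal sketch of the per-dataset case, assuming the stored block is uncut and the cuts are available as the list of surviving data point indices (names are illustrative):

```python
import numpy as np

def cut_covmat_block(uncut_block, kept_indices):
    """Restrict an uncut N_data x N_data covmat block for a single dataset
    to the rows/columns of the data points that survive the cuts."""
    kept = np.asarray(kept_indices)
    return uncut_block[np.ix_(kept, kept)]
```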
However, for more general theory covmats which include correlations between experiments, things will be more difficult with the current set-up in validphys. Ideally we would want to load in the total arbitrary theory covmat and add it to the scale variation covmat at the produce_nnfit_theory_covmat stage. But this is after cuts have been applied, so we would need a way to make the correct cuts on the whole theory covmat at this late stage, and I cannot yet see a clear way to do this.
The new experiment setup will hopefully make this easier, but I think Rosalyn wants to use this with the old code. We just discussed it, and in the end we think that for the purpose of cutting a total covmat something like the following should suffice:
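(A sketch, assuming the uncut total covmat is stored as a pandas DataFrame whose index and columns are (dataset, point) pairs and that the cuts are known per dataset; the names are illustrative.)

```python
import pandas as pd

def cut_total_covmat(total_covmat, cuts):
    """``total_covmat``: uncut covmat indexed by (dataset, point) on both axes;
    ``cuts``: mapping dataset name -> iterable of the point indices kept after cuts."""
    kept = pd.MultiIndex.from_tuples(
        [(ds, pt) for ds, points in cuts.items() for pt in points],
        names=total_covmat.index.names,
    )
    # select the surviving rows and columns in one go
    return total_covmat.loc[kept, kept]
```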
I guess I am concerned that if you keep piling on top of a design that is known to be problematic, it is going to eventually collapse in a way that nobody knows how to fix...
One thing I was thinking about was the possibility of having a commondata-esque object related to theory covariance matrices
Take for example the 9-point covmat, in construction.py:
```python
def covmat_9pt(name1, name2, deltas1, deltas2):
    """Returns theory covariance sub-matrix for 9pt prescription,
    given two dataset names and collections of scale variation shifts"""
    if name1 == name2:
        s = 0.25 * sum(np.outer(d, d) for d in deltas1)
    else:
        s = (1 / 12) * (
            np.outer((deltas1[0] + deltas1[4] + deltas1[6]),
                     (deltas2[0] + deltas2[4] + deltas2[6]))
            + np.outer((deltas1[1] + deltas1[5] + deltas1[7]),
                       (deltas2[1] + deltas2[5] + deltas2[7]))
        ) + (1 / 8) * np.outer((deltas1[2] + deltas1[3]),
                               (deltas2[2] + deltas2[3]))
    return s
```
It seems to me that the deltas here could be calculated once for a given input PDF and stored like a systematics file. If every dataset had its respective deltas then the covmat block for two datasets could be constructed quite easily, and wouldn't rely on saving a constructed covmat which may or may not have cuts applied or have been made with different datasets.
I have no idea how the higher twist covmat is constructed, but @RosalynLP is there some point of the construction which looks like the following?
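Something generic like the sketch below, where the file name, format and loader are purely hypothetical, and covmat_9pt is the function quoted above:

```python
import numpy as np

def load_cut_deltas(path, kept_indices):
    """Hypothetical loader: read a stored N_data x N_shifts table of deltas for
    one dataset and keep only the rows that survive the cuts."""
    table = np.loadtxt(path)                 # assumed plain-text storage format
    table = table[np.asarray(kept_indices), :]
    return [table[:, i] for i in range(table.shape[1])]

# A block between two datasets could then be built on the fly from the stored
# deltas (file names and cut lists here are placeholders):
# deltas1 = load_cut_deltas("DATASET1_deltas.dat", cuts1)
# deltas2 = load_cut_deltas("DATASET2_deltas.dat", cuts2)
# block = covmat_9pt("DATASET1", "DATASET2", deltas1, deltas2)
```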
I am not sure how stable those things will be and how many variations and variations of variations we are going to want, so I wouldn't assimilate it to commondata. However perhaps it could be linked to theories somehow.
Well, with the scale variation covariance matrices I don't see how it would be unstable to save a copy of a table for each dataset which was N_data x N_shifts, where the columns would be like:
+0 | ++ | +- | -0 | -+ | -- | 0+ | 0-
where each label refers to a delta between that theory and theory 00. The variations would be on the PDF and PTO of the input theories (the latter we don't need to worry much about for the time being). Adhering to the covmat reg convention, if I call the table above A then the covmat construction is either
1/normalisation * A A.T
for same processes or
1/normalisation * A_tilde A_tilde.T
where A_tilde is a new table whose columns are linear combinations of the columns of A according to the point prescription (like tilde_A[:, '+X'] = sqrt(2) * (A[:, '+0'] + A[:, '++'] + A[:, '+-']) etc.). As a side note: if the table were a DataFrame with column labels like "++" then it would be far less ambiguous what construction.py was doing. I believe that if there were variations then they would be happening at the level of the construction, not the deltas, which as far as I can tell were pretty much constant throughout the theory covariance project.
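As a sketch of that construction (column labels as in the table above; the sqrt(2) combination just mirrors the example in the text, with the real coefficients and normalisation fixed by the point prescription):

```python
import numpy as np
import pandas as pd

SHIFT_LABELS = ["+0", "++", "+-", "-0", "-+", "--", "0+", "0-"]

def deltas_frame(deltas, data_index):
    """The table A: one row per data point, one labelled column per scale shift."""
    return pd.DataFrame(np.column_stack(deltas), index=data_index,
                        columns=SHIFT_LABELS)

def same_process_block(A1, A2, normalisation):
    """1/normalisation * A1 A2^T (reduces to A A^T on the diagonal)."""
    return A1.values @ A2.values.T / normalisation

def tilde_plus_column(A):
    """Example tilde column from the text:
    tilde_A[:, '+X'] = sqrt(2) * (A[:, '+0'] + A[:, '++'] + A[:, '+-'])."""
    return np.sqrt(2) * (A["+0"] + A["++"] + A["+-"])
```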
Surely this is way better than saving a cut covmat which is only valid if the datasets are loaded in the same order, with the same cuts etc.?
This would be a separate file from commondata and theory, but I suppose it would be more similar to theory in the sense that there could be multiple files associated with a single dataset which were square roots of different theory covariance matrices. So the table above would be something like <dataset_id>_pointpresc_delta.dat, but (provided it can be cast in a similar format) there could also be <dataset_id>_higher_twist.dat or whatever.
@wilsonmr I think what you are saying amounts to storing the covariance matrices as nuisance parameters rather than as covariance matrices, right? In the case of the theory covmat the nuisance parameters can be constructed from linear combinations of the deltas. Then you have some nuisance parameters \beta_n with eigenvalues \alpha_n and you construct S = \sum_n \alpha_n \beta_n \beta_n^T.

But I think what you are suggesting is really overcomplicating matters, and although in principle we can write any covariance matrix in terms of some nuisance parameters, I think this might just create trouble for us down the line by being too inflexible. At the end of the day, by far the simplest thing is to have a load of tables saved, one for each covariance matrix, with no cuts applied, and flags which say whether to load in specific ones. Then we apply the cuts. This is the most general way of doing it and so is likely to lead to the fewest complications down the line, and generally the least stress. I think it's also worth remembering that MHOUs are a special case of theory uncertainties which can be constructed entirely from given theory vectors, but in general this won't be the case and we could have some theory covariance matrix supplied externally, much like experimental covariance matrices are currently. Then we would have to deconstruct them into nuisance parameters if we were using the above framework.
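For definiteness, the nuisance-parameter construction referred to above is just (an illustrative snippet, not code from the repo):

```python
import numpy as np

def covmat_from_nuisance(alphas, betas):
    """S = sum_n alpha_n * beta_n beta_n^T, for eigenvalues ``alphas`` and
    nuisance-parameter vectors ``betas`` (each of length N_data)."""
    return sum(a * np.outer(b, b) for a, b in zip(alphas, betas))
```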
No, I'm just saying let's store the deltas: there are 9 in total and the different prescriptions amount to putting some of them together in different ways, as per construction.py.
I think we should discuss this in person. I don't think I'm suggesting anything to do with nuisance parameters, but it's really boring to try and type something I can write on a blackboard in 5 minutes.