Group datasets by experiment PLOTTING label
Created by: wilsonmr
closes #413 (closed), closes #418 (closed)
closes #174 (closed) (the dataspecs plots already exist)
At the moment this is EXTREMELY basic, but it illustrates my rough approach to addressing the problem that the plots by experiment in vp reports are essentially redundant.
Instead of rewriting the various actions (another possibility which shouldn't be discounted/forgotten), I aimed to keep the idea of experiment as some arbitrary combination of datasets. There is a production rule which, for a given fit, looks at the plotting info and then groups those datasets into experiments. Then one can do something like
```
{@with experiments_from_plotting@}
{@plot_experiments_chi2@}
{@endwith@}
```
and the plot will have bars according to the plotting experiments. I then added a provider `fixup_fitthcovmat`, which handles the loading of the `fitthcovmat` and fixes the experiment headers to match those of the experiments. Anything which used `fitthcovmat` now uses `fixup_fitthcovmat`, and in `DataResult` I now take the cross section of `fixup_fitthcovmat` for either experiment or dataset.
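A minimal sketch of what that header fixup could look like with pandas; the function name and index layout here are illustrative assumptions, not the actual validphys implementation:

```python
import numpy as np
import pandas as pd

def fixup_covmat_headers(covmat, mapping):
    """Rename the 'experiment' level of the covmat's row/column MultiIndex
    so it matches the plotting-based experiment grouping. The numerical
    covariance values are untouched. (Hypothetical helper for illustration.)
    """
    return covmat.rename(index=mapping, columns=mapping, level="experiment")

# Toy covmat over two datasets originally filed under one experiment:
idx = pd.MultiIndex.from_tuples(
    [("EXP1", "NMC", 0), ("EXP1", "SLAC", 0)],
    names=["experiment", "dataset", "id"],
)
cov = pd.DataFrame(np.eye(2), index=idx, columns=idx)
fixed = fixup_covmat_headers(cov, {"EXP1": "DIS NC"})
```

The point of renaming only a named index level is that the dataset/point structure of the covmat survives unchanged, so cross sections by experiment or dataset still work afterwards.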
The final step would be introducing some minimal #356 (closed) so that one can calculate the total chi2s required for some tables and the `summarise_fits` action. All of this is currently a bit gross and can probably be done in a cleaner way, or in fact I could stop with this approach and try something completely different. It has, however, produced this (results include the theory covmat and may or may not be correct - they look rather sensible compared to not using the theory covmat, but that's not very rigorous):
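The grouping step described above (datasets bucketed into pseudo-experiments by their PLOTTING label) can be sketched in a few lines; the function names, dataset names, and labels here are all made up for illustration:

```python
from collections import defaultdict

def group_datasets_by_plotting_label(datasets, get_label):
    """Group dataset names into pseudo-experiments keyed by plotting label.

    Both this function and `get_label` are hypothetical stand-ins for the
    plotting-info lookup that the production rule performs.
    """
    groups = defaultdict(list)
    for dataset in datasets:
        groups[get_label(dataset)].append(dataset)
    return dict(groups)

# Toy example with made-up dataset names and labels:
labels = {"NMC": "DIS NC", "SLAC": "DIS NC", "CMSZDIFF12": "DY"}
grouped = group_datasets_by_plotting_label(labels, labels.get)
# grouped == {"DIS NC": ["NMC", "SLAC"], "DY": ["CMSZDIFF12"]}
```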
Merge request reports
Activity
requested review from @enocera
```python
        """
        return label

    def produce_experiments_from_plotting(self, fit):
        """Used to produce experiments from the fit, where each experiment is a group of datasets
        according to the plotting info
        """
        # TODO: consider this an implementation detail
        from reportengine.namespaces import NSList

        with self.set_context(ns=self._curr_ns.new_child({'fit': fit})):
            _, experiments = self.parse_from_('fit', 'experiments', write=False)
```

Created by: Zaharid
@wilsonmr Please rebase on top of master and force push to get the tests to work. Would be good to add tests with the theory covmat.
I am also thinking that if we are going to use a theory covmat everywhere, it should be computed by vp rather than loaded somewhere with potentially incompatible data. And in the case where it is loaded, we should be able to check that e.g. the cuts match. One way to go about that is storing the corresponding namespace information on the cuts, in line with #224.
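A minimal sketch of the kind of consistency check suggested here, assuming the loaded covmat carries a pandas index over the data points surviving the cuts; the helper name and this representation are assumptions, not existing validphys API:

```python
import numpy as np
import pandas as pd

def check_covmat_matches_cuts(covmat, expected_index):
    """Check that a loaded theory covmat covers exactly the data points
    that survive the cuts. (Hypothetical helper sketching the suggestion
    of storing cut/namespace information alongside the covmat.)
    """
    if not covmat.index.equals(expected_index):
        raise ValueError("loaded theory covmat is inconsistent with the cuts")

# Toy index over two cut data points:
idx = pd.MultiIndex.from_tuples([("NMC", 0), ("NMC", 1)], names=["dataset", "id"])
cov = pd.DataFrame(np.eye(2), index=idx, columns=idx)
check_covmat_matches_cuts(cov, idx)  # consistent: no error raised
```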
Created by: wilsonmr
Sure, the loading is a bit problematic. The only possible problem is that if it is to be computed, then basically every report from now on will require up to 9 theories to run. Like you say, at the very least the loaded one should check that it's consistent with the cuts.
There is still more to do with this if it is the direction you want to go for now: this will break the total chi2 values everywhere, because they will miss theory correlations between experiments (as defined in this branch).
Created by: wilsonmr
Last Commit:
Ok some decisions made here:
- `use_theorycovmat` should only be a bool; the string option made it messy and was too prone to error
- since `experiments` in the global namespace couldn't be overridden, I moved it to `dataspecs`, specifically so it didn't interact with the production rules
- removed `plot_pdfs_phi` since it needed to be sorted anyway; closes #174 (closed) (the dataspecs plots already exist). I also added a phi table, and so changed `phi_data` so it also returns Npoints
- if the theory covmat is to be used then I always calculate the covmat and sqrt covmat 'externally', in the sense that I get the sqrt from a numpy function, but this seems consistent with the so-called internal function `GetSqrtFitCovMat`; in fact I need to delete this function.

Finally, here is a new `vp-comparefits` report: https://vp.nnpdf.science/T9J7ouncRqK8WJY4avGGmA== (compare to its old version https://vp.nnpdf.science/WfioqoYSSl2DCFyr1xgEsw==/#chi2-by-experiment - the datasets are sometimes ordered differently, but they seem to have the same numbers, fewer/no `NaN`s, and the new experiment plots look sensible).

To do (provided this passes tests):

- Add covmat checks
- Clean up some redundant providers / change documentation more thoroughly
- Add covmat tests
- Add to the `vp-comparefits` report an indication of what covmat is used
- The table `fits_chi2_table` total is still broken; calculate the total in a more sensible way.
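The 'external' sqrt covmat computation mentioned above presumably amounts to something like a Cholesky decomposition via numpy; a minimal sketch, under that assumption:

```python
import numpy as np

def sqrt_covmat(covmat):
    """Return a lower-triangular L with L @ L.T == covmat, via numpy's
    Cholesky decomposition -- one plausible reading of computing the sqrt
    covmat 'externally' rather than through GetSqrtFitCovMat.
    """
    return np.linalg.cholesky(covmat)

# Toy 2x2 covariance matrix (symmetric positive definite):
cov = np.array([[4.0, 2.0], [2.0, 3.0]])
L = sqrt_covmat(cov)
assert np.allclose(L @ L.T, cov)
```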
Created by: wilsonmr
I just wanted to have some kind of distinction between these experiments and the fitted experiment, the latter of which has some kind of meaningful bearing on the total chi2. I guess I need to change the tests to account for the new `DataResult` layout. I didn't quite do what you said wrt `data, theory predictions, covmat -> chi2` here, but vaguely moved towards it: `DataResult` is a container of data and covmat, and `result` is a container of that and the theory prediction.

Created by: wilsonmr
As soon as #356 (closed) becomes a thing, then this becomes less confusing. Perhaps it would be sufficient to call the new providers `*_experiments_*` whatever, and then on any provider used in the report with the old meaning of experiments, put a comment and a warning that the provider needs to be updated with `data` when it exists? And delete, for example, `plot_fits_experiments_chi2`, which is either the same as `plot_fits_exp_by_plotting_chi2` for old fits or is useless for newer fits...