Speed up the fit by separating tr/vl/exp fktables
Created by: scarlehoff
The motivation for having a single experiment layer and applying the mask during training was to reduce the memory footprint of the training. We are now at about 3.5 GB.
One could instead apply the mask directly to the fktable (in the future, validphys could provide masked fktables and covmats/data arrays), which reduces the time per epoch to about 70% of its current value (from 0.097 s to 0.070 s on my computer). In exchange, the memory usage increases to ~4.5 GB (and it will grow as more datasets are added), but for now I think it makes sense to trade memory for speed.
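As a toy illustration of the idea (plain NumPy, not the actual fitting code; the shapes, the names `fktable`, `tr_mask`, `vl_mask` and the `predict` helper are all made up for this sketch), masking the fktable up front gives the same predictions as masking the output of a single full layer, at the cost of storing the extra copies:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumptions): 6 data points, 3 flavours, 5 x-grid nodes
ndata, nflav, nx = 6, 3, 5
fktable = rng.normal(size=(ndata, nflav, nx))
pdf = rng.normal(size=(nflav, nx))

# Hypothetical training/validation split of the data points
tr_mask = np.array([True, False, True, True, False, True])
vl_mask = ~tr_mask


def predict(fk, pdf):
    """Contract an fktable with the PDF to get one prediction per data point."""
    return np.einsum("nfx,fx->n", fk, pdf)


# Option A: one full layer, mask applied to the output at every step
full = predict(fktable, pdf)
tr_after = full[tr_mask]
vl_after = full[vl_mask]

# Option B: mask the fktable once, then use the smaller tables at every step
fk_tr = fktable[tr_mask]  # boolean indexing makes a copy
fk_vl = fktable[vl_mask]
tr_before = predict(fk_tr, pdf)
vl_before = predict(fk_vl, pdf)

same_training = np.allclose(tr_after, tr_before)
same_validation = np.allclose(vl_after, vl_before)

# The masked copies together take as much memory as the full table,
# which is kept as well for the experimental loss
extra_bytes = fk_tr.nbytes + fk_vl.nbytes
```

In option B each training step only contracts the rows it needs, while the full table is still kept around, which is where the extra memory goes.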
In the future, if memory ever becomes the bottleneck again, we could drop the full fktable during the fit and recover it afterwards (it is only used to compute the experimental loss at the end of the fit).