Clarify pseudodata rebuilding workflow
Created by: Zaharid
We have a number of tools that rely on having explicit pseudodata for the fitted replicas (both the fluctuation and the training/validation split, and implicitly also the cuts). We have a way of storing all of that, which is not often used by default.
We also have a way of “rebuilding” the data, which relies on things like the splitting algorithms being unmodified and the random streams being stable.
We have a tendency to break the latter, and keeping it working also constrains how we can modify things. I believe we should discuss what to do with it. As it is, I consider any modification that changes the pseudodata a breaking change requiring a version bump (but covered by the compatibility policy, which explicitly allows for random state changes). But perhaps we should simply deprecate rebuilding and mandate that pseudodata be stored if we intend to use it later.
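
To make the fragility concrete, here is a minimal sketch of why rebuilding depends on the exact order of random draws. It is not the actual code: `make_pseudodata`, its parameters and the `split_first` flag are hypothetical names, and the standard-library `random` module stands in for whatever generator the fit really uses. The only point is that the same seed reproduces the pseudodata only as long as nothing about how the stream is consumed changes.

```python
import random


def make_pseudodata(central, sigma, seed, frac_training=0.75, split_first=False):
    """Hypothetical stand-in for replica generation: fluctuate the data and
    draw a training/validation split from one seeded random stream."""
    rng = random.Random(seed)
    indices = list(range(len(central)))

    def draw_split():
        # Shuffle the point indices and keep the first fraction for training.
        shuffled = indices[:]
        rng.shuffle(shuffled)
        return sorted(shuffled[: int(frac_training * len(central))])

    def draw_fluctuation():
        # Gaussian fluctuation of each point around its central value.
        return [c + s * rng.gauss(0.0, 1.0) for c, s in zip(central, sigma)]

    if split_first:
        # A seemingly harmless refactoring: same seed, different draw order.
        training = draw_split()
        fluctuated = draw_fluctuation()
    else:
        fluctuated = draw_fluctuation()
        training = draw_split()
    return fluctuated, training


central = [1.0] * 8
sigma = [0.1] * 8

stored = make_pseudodata(central, sigma, seed=42)
rebuilt = make_pseudodata(central, sigma, seed=42)
refactored = make_pseudodata(central, sigma, seed=42, split_first=True)

print(rebuilt == stored)            # rebuilding works while the code is unchanged
print(refactored[0] == stored[0])   # the fluctuation no longer matches
```

Storing the output of the first call alongside the fit would make the comparison independent of any later change to the generation code, which is the trade-off under discussion here.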