Fk refactor
Created by: APJansen
The idea
The change is relatively small. It only affects the observable layers: the order in which indices are contracted changes slightly, and the boolean mask is replaced by a float mask.
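As a rough illustration of the difference between the two masking styles, here is a minimal NumPy sketch. It is not the code in the observable layers: the array shapes, the index ordering, and all names (`pdf`, `fk`, `active`, `float_mask`) are assumptions made for the example. It shows only that selecting flavours with a boolean mask and then contracting gives the same result as contracting a one-hot float mask into the FK table first.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_replicas, n_x, n_flavours, n_data = 3, 5, 14, 4

# PDF values per replica on an x-grid, for all flavours (assumed layout).
pdf = rng.normal(size=(n_replicas, n_x, n_flavours))

# Flavours that contribute to this hypothetical observable, and an FK table
# that only stores those flavours: (data, active flavour, x).
active = np.array([1, 2, 5])
fk = rng.normal(size=(n_data, len(active), n_x))

# Boolean-mask style: select the active flavours from the PDF, then contract.
bool_mask = np.zeros(n_flavours, dtype=bool)
bool_mask[active] = True
pdf_selected = pdf[:, :, bool_mask]  # (replica, x, active flavour)
obs_bool = np.einsum("nfx,rxf->rn", fk, pdf_selected)

# Float-mask style: a one-hot matrix maps each active flavour to its
# position among all flavours, so the selection becomes a contraction.
float_mask = np.zeros((len(active), n_flavours))
float_mask[np.arange(len(active)), active] = 1.0

# Contract the mask with the FK table first. This step does not involve the
# replica index, so it can be done once and reused for every replica.
fk_masked = np.einsum("fF,nfx->nxF", float_mask, fk)  # (data, x, flavour)
obs_float = np.einsum("nxF,rxF->rn", fk_masked, pdf)

# Both routes give the same observable, shape (replica, data).
assert np.allclose(obs_bool, obs_float)
```

In this sketch the float mask turns the flavour selection into an ordinary tensor contraction, which makes it possible to choose the contraction order freely and to fold the mask into the FK table ahead of time.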
Performance
Timings for 1000 epochs of the main runcard (NNPDF40_nnlo_as_01180_1000) on Snellius. The 1-replica runs are on the CPU; the runs with 100 or more replicas are on the GPU, with the GPU memory used given in brackets.
branch | commit hash | 1 replica | 100 replicas | 500 replicas | 1000 replicas |
---|---|---|---|---|---|
multi-dense + trvl | 59e5b58 | 145 | 320 (16.8 GB) | x | x |
fk-refactor | 67cd5f0 | 122 | 176 (4.5 GB) | 423 (16.8 GB) | x |
fk-refactor (precompute) | 22ef7b0 | 175 | 100 | | |
fk-refactor (enforce order einsum) | 1a751c2 | 175 | 90 | | |
fk-refactor (fix 1 replica) | 16551f6 | 118 | 90 | | |
Profile
The validation step in the profile will be addressed in #1855, and the gaps in #1802.