[WIP] n3fit optimization

Created by: scarlehoff

At some point the speed of n3fit went down considerably. During the move to TF 2.0 the DIS fits were still taking under 1 hour, but now they seem to take up to two hours: https://github.com/NNPDF/nnpdf/pull/672#issuecomment-598068098

I'm not sure what introduced the problem, since I don't usually monitor the times that closely, but it might be related to how TensorFlow is compiled.

In particular, with the settings in this PR (at the time of writing) and the conda packages, everything runs on a single processor (other threads are open and seem to be doing "something", but all the work is happening on just one).
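One rough way to check the "everything runs on one processor" observation without watching `top` is to compare the CPU time of the process with the wall-clock time. The sketch below uses only the standard library; `cpu_utilisation` and `busy` are hypothetical names for illustration and are not part of n3fit. A ratio close to 1 suggests a single busy core, while a ratio well above 1 suggests several threads are doing real work.

```python
import time


def cpu_utilisation(fn, *args, **kwargs):
    """Run fn and return (result, CPU time / wall time).

    time.process_time() adds up the CPU time of all threads in the
    process, so the ratio approximates the number of busy cores.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = fn(*args, **kwargs)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    ratio = cpu / wall if wall > 0 else float("nan")
    return result, ratio


def busy(n=2_000_000):
    # Stand-in workload (assumption): replace with the actual fit call.
    return sum(i * i for i in range(n))


result, ratio = cpu_utilisation(busy)
print(f"CPU time / wall time = {ratio:.2f}")
```

This only measures the Python process itself, so it would not account for work done in separate child processes.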

The key setting seems to be KMP_BLOCKTIME. The best results are obtained when it is set to 0 (although the recommended value is 1), in which case only one thread is being used. Values over 10-20 give terrible results (and the default is 200!).
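For reference, a minimal sketch of how the variable could be set from Python. KMP_BLOCKTIME is read by the Intel OpenMP runtime and is given in milliseconds, so I'm assuming it has to be in the environment before TensorFlow is imported. The helper name `configure_openmp` is hypothetical and this is not necessarily how the PR sets it.

```python
import os


def configure_openmp(blocktime_ms=0):
    """Set KMP_BLOCKTIME (milliseconds a thread waits before sleeping).

    Assumption: this is called before TensorFlow is imported, since the
    OpenMP runtime reads the variable when it is initialised.
    """
    os.environ["KMP_BLOCKTIME"] = str(blocktime_ms)
    return os.environ["KMP_BLOCKTIME"]


blocktime = configure_openmp(0)
print(f"KMP_BLOCKTIME={blocktime}")

# Only import TensorFlow after the environment is configured, e.g.:
# import tensorflow as tf
```

The same effect can be had by exporting the variable in the shell before launching the fit, which avoids any dependence on import order.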

With these settings the code seems to be as fast as before... when running on one core. There is still work to be done here.

My current guess is that something is making TensorFlow generate and destroy some graph, but that is just a working theory.
