n3fit OOM in conda env
Created by: siranipour
I've been trying to run a 1 rep DIS only fit using the new code. I use the given PN3_DIS_example.yml runcard and execute it with:
n3fit PN3_DIS_example.yml 1 -o NNPDF31_nnlo_as_0118_DISonly_NEWCODE
The code runs, but memory usage quickly ramps up until the machine runs out of memory and Linux begins to swap.
I have been talking to Juan about this and he reports the same issue, but only when the installation is done through conda. I quote this useful snippet from our email exchange:
two issues that do not appear outside the conda environment:
1 - The memory grows in a point in which there should be no more memory growth (the model has already been formed, the memory usage from that point onwards should be minimal)
2 - It generates a stupid amount of threads after the parallelization has already occurred. My guess is these threads are the ones generating the out-of-memory because they are not generated in the non-conda version.
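To put numbers on the two symptoms above, here is a rough sketch that samples the resident memory and the thread count of a running process on Linux by reading /proc/<pid>/status. It is not part of n3fit; the function names (parse_proc_status, sample, watch) are made up for illustration, and it assumes you pass it the PID of the running fit.

```python
import time
from pathlib import Path


def parse_proc_status(text):
    """Extract resident memory (kB) and thread count from /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    rss_kb = int(fields["VmRSS"].split()[0])  # value looks like "204800 kB"
    threads = int(fields["Threads"])  # counts native threads, not only Python ones
    return rss_kb, threads


def sample(pid="self"):
    """Read the current RSS and thread count of a process (Linux only)."""
    return parse_proc_status(Path(f"/proc/{pid}/status").read_text())


def watch(pid, interval=5.0, count=10):
    """Print `count` samples for `pid`, one every `interval` seconds."""
    for _ in range(count):
        rss_kb, threads = sample(pid)
        print(f"rss={rss_kb / 1024:.1f} MiB threads={threads}")
        time.sleep(interval)
```

Calling watch(<pid of the fit>) in both the conda and the non-conda installation should show whether the memory growth and the thread count diverge at the same point in the run.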
A probable culprit is a buggy library in the conda package. It would be useful to take a look at this. Cheers.
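One way to narrow down which library differs would be to dump the installed package versions in each environment and compare them. The following is only a sketch using the standard library's importlib.metadata (Python 3.8+); the helper names installed_versions and diff_versions are hypothetical, and it only sees Python packages, not shared libraries that conda ships outside of them.

```python
from importlib.metadata import distributions


def installed_versions():
    """Map each installed distribution name (lowercased) to its version."""
    versions = {}
    for dist in distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries with broken metadata
            versions[name.lower()] = dist.version
    return versions


def diff_versions(a, b):
    """Return {name: (version_in_a, version_in_b)} for packages that differ."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}
```

Running installed_versions() in each environment, saving the result (for instance as JSON), and passing both dictionaries to diff_versions gives a short list of candidates to look at first.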