We are running Mantid 3.11 on Windows Server 2008 R2, via MantidPlot.
The same data analysis script (inelastic neutron scattering: rebinning, conversion to MD, and making cuts) can take 4 minutes or 25 minutes depending on who is running it. The system monitor shows that, when the script is running slowly, only a single CPU is used throughout. For most users, the same Mantid algorithms run in parallel.
To sum up: we use the same binaries, the same Python script, and the same computer, and we get the same results in the end. But for some users the script runs on only one CPU.
Can you think of a reason why Mantid might behave like this?
Sorry to resurrect an old thread, but I have made a similar observation. Some of my scripts make heavy use of numpy routines on large arrays, which should be easily spread over as many cores as are available. This used to happen on my old Windows 7 PC with earlier versions of Mantid.
Now with Mantid 5 on my desktop Windows 10 PC it’s only using one core and so running much more slowly than necessary. It’s the same with Mantid 4.2 on a Windows 10 laptop, and also running the numpy code stand-alone under a separate Python 3 installation on the desktop PC.
Is there an option I have to use, to re-enable parallel processing?
There isn’t an option related to enabling parallel processing. Some of our algorithms are parallelized and some, such as simple fitting, are not yet. Do you have an example of a case where you think Mantid is using one core but could be using more?
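For the numpy side of this specifically, parallelism usually comes from the BLAS library numpy was built against (MKL or OpenBLAS), and that can be capped by environment variables. A minimal diagnostic sketch, assuming a standard numpy installation (the variable names below are the common ones for MKL/OpenBLAS/OpenMP, not anything Mantid-specific):

```python
import os
import numpy as np

def thread_env_report():
    """Return the thread-limiting environment variables currently set.

    If any of these is set to "1", numpy's BLAS backend will run
    single-threaded regardless of how many cores the machine has.
    """
    keys = ("OMP_NUM_THREADS", "MKL_NUM_THREADS",
            "OPENBLAS_NUM_THREADS", "NUMEXPR_NUM_THREADS")
    return {k: os.environ.get(k) for k in keys}

if __name__ == "__main__":
    print(thread_env_report())
    # Shows which BLAS/LAPACK numpy was built against (MKL vs OpenBLAS).
    np.show_config()
```

Running this inside the Mantid script window and inside a stand-alone Python session, and comparing the output, would show whether one of the environments is silently limiting the thread pool.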
I am not sure if it is going to be of interest, but I did a lot of testing back then, on the same Windows Server 2008 R2, using the official Mantid builds (3.11, 3.13, 4.0). The findings were:
- The performance drop was not tied to any specific version of Mantid. It seems to have started with a Windows system update: from a certain point in time, it was no longer possible to get back to the original performance by installing an older release of Mantid that had been known to work well in the past.
- I was using a Python-based data loader (LoadEXED) built on numpy. For smaller files (< 50 MB) there was no significant change in performance. Above a certain file size there was a saturation effect: reading the same file many times in a row got progressively slower. This is how long it took:
  - 1st read: 4 seconds
  - 2nd read: 20 seconds
  - 3rd read: over 1 minute
  - 4th read: 1 minute 30 seconds
  - 5th read: 1 minute 32 seconds
  - every subsequent read: under 1 minute 40 seconds
From then on, the entire Mantid operation remained slow.
- After I had loaded a few large files, as in the previous point, the computer in question still had over 40 GB of free RAM, no CPU activity, no intense hard-disk activity, and no significant network traffic. Nevertheless, scripts that would normally run fast and in parallel would, from that moment on, run on only 1 CPU core and take much longer than before. The only way to recover was to restart Mantid.
- I rewrote the data loader so that it read the file in chunks, using many small numpy arrays. This did NOT solve the problem.
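For context, the chunked-loading rewrite looked roughly like the sketch below. This is an illustration only, assuming a hypothetical flat binary file of float64 values; the real EXED format is more structured:

```python
import numpy as np

def load_in_chunks(path, dtype=np.float64, chunk_elems=1_000_000):
    """Read a flat binary file as many small numpy arrays.

    The idea was to avoid one huge allocation in case the slowdown was
    related to large contiguous buffers; per the thread, it did not help.
    """
    chunks = []
    with open(path, "rb") as f:
        while True:
            block = np.fromfile(f, dtype=dtype, count=chunk_elems)
            if block.size == 0:
                break
            chunks.append(block)
    return np.concatenate(chunks) if chunks else np.empty(0, dtype=dtype)
```

Since splitting the allocation into small pieces made no difference, the slowdown was presumably not caused by the size of any single numpy array.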
- For reference, I ran the same scripts on the same data files using Mantid 4.0 under Ubuntu 18.04. The problem did not occur.
I do not know whether anything can be concluded from these observations.