Notes on computations distribution#
Parameters#
The `integrate-slurm` command distributes the azimuthal integration of datasets over multiple machines. This is configured through the `[computations distribution]` section of the configuration file.
There are three parameters:
```ini
[computations distribution]
n_workers = 4
ai_engines_per_worker = 8
cores_per_worker = 4
```
- `n_workers` is the number of SLURM jobs submitted. Each SLURM submission (i.e. worker) is roughly equivalent to half a machine, and each worker uses exactly one GPU. Therefore, `n_workers = 4` roughly corresponds to using two full compute cluster machines.
- `ai_engines_per_worker` is the number of pyFAI processes launched for each worker, i.e. the number of processes per GPU.
- `cores_per_worker` is the number of CPU threads to use for each pyFAI engine (i.e. the number of threads per process). These threads are primarily used for LZ4 decompression of the data files.

The default is `n_workers = 4`, `ai_engines_per_worker = 8`, `cores_per_worker = 4`, which is sensible for most applications. If you want the processing to go faster, the primary parameter to increase is `n_workers`.
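To make the resource arithmetic explicit, here is a minimal Python sketch (not part of the tool) that parses the example section above and prints the totals it implies; with the defaults this gives 4 GPUs, 32 pyFAI engine processes and 128 CPU threads:

```python
from configparser import ConfigParser

# The [computations distribution] section from the example above,
# parsed from an inline string so the sketch is self-contained.
config = ConfigParser()
config.read_string("""
[computations distribution]
n_workers = 4
ai_engines_per_worker = 8
cores_per_worker = 4
""")

dist = config["computations distribution"]
n_workers = dist.getint("n_workers")
ai_engines_per_worker = dist.getint("ai_engines_per_worker")
cores_per_worker = dist.getint("cores_per_worker")

# One GPU per SLURM job (worker), several pyFAI engine processes per GPU,
# and several LZ4-decompression threads per engine process.
print("SLURM jobs / GPUs      :", n_workers)                                           # 4
print("pyFAI engine processes :", n_workers * ai_engines_per_worker)                   # 32
print("CPU threads            :", n_workers * ai_engines_per_worker * cores_per_worker)  # 128
```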
How the work is distributed#
At ESRF, the layout of data files is the following:
{basename}/{proposal}/{beamline}/{subfolders}/{sample}/{dataset}/{scan}/{files}
The HDF5 files to consider (the ones provided via `location` in the `[dataset]` section of the configuration file) are HDF5 Bliss scans, which sit alongside the `scanXXXX` folders. Each of these `scanXXXX` folders contains one or more files (usually `detectorname_YYYY.h5`).
For example:
/data/id15/inhouse4/ihma109/id15/ACL9011001b/ACL9011001b_0037/scan0001/pilatus_????.h5
Here `ACL9011001b` is the sample, `ACL9011001b_0037` the dataset, `scan0001` the scan, and `pilatus_????.h5` the files (125 files of 250 images each).
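As an illustration only (the dataset path is taken from the example above and the layout assumptions come from this description, not from the tool itself), the following Python sketch lists the Bliss scan files of a dataset and the Lima files of each of its `scanXXXX` folders:

```python
import glob
import os

# Example dataset directory from the layout above; substitute your own.
dataset_dir = "/data/id15/inhouse4/ihma109/id15/ACL9011001b/ACL9011001b_0037"

# Bliss scan files (HDF5) sit directly in the dataset directory,
# alongside the scanXXXX folders that hold the Lima detector files.
bliss_scans = sorted(glob.glob(os.path.join(dataset_dir, "*.h5")))

# One entry per scan folder, each listing its detectorname_YYYY.h5 files.
lima_files = {
    os.path.basename(scan_dir): sorted(glob.glob(os.path.join(scan_dir, "*.h5")))
    for scan_dir in sorted(glob.glob(os.path.join(dataset_dir, "scan*")))
}

print(f"{len(bliss_scans)} Bliss scan file(s)")
for scan, files in lima_files.items():
    print(f"{scan}: {len(files)} Lima file(s)")
```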
From the bottom-up:
- Each pyFAI engine processes one or several Lima files. One engine is an OS process, spawning `cores_per_worker` threads for LZ4 decompression.
- There are `ai_engines_per_worker` such processes (pyFAI engines) per SLURM job, and each SLURM job binds to exactly one GPU.
- There are `n_workers` SLURM jobs in total, as sketched below.
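To tie the three levels together, here is a purely illustrative Python sketch of how a set of Lima files could be spread over workers and engines. It uses a simple round-robin assignment as a toy model; it is not the scheduling logic actually used by `integrate-slurm`:

```python
from itertools import cycle

def partition_files(lima_files, n_workers=4, ai_engines_per_worker=8):
    """Toy round-robin assignment of Lima files to pyFAI engines.

    Hierarchy only: n_workers SLURM jobs (one GPU each), each running
    ai_engines_per_worker pyFAI engine processes, each handling one or
    several Lima files. Not the tool's real scheduler.
    """
    # One bucket per (worker, engine) pair.
    buckets = {
        (worker, engine): []
        for worker in range(n_workers)
        for engine in range(ai_engines_per_worker)
    }
    # Deal the files out round-robin over the engines.
    for key, path in zip(cycle(buckets), lima_files):
        buckets[key].append(path)
    return buckets

files = [f"pilatus_{i:04d}.h5" for i in range(125)]  # 125 files, as in the example
assignment = partition_files(files)
print(len(assignment))             # 32 engines in total (4 workers x 8 engines each)
print(assignment[(0, 0)][:2])      # first files handled by worker 0, engine 0
```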