
Notes on computations distribution#

Parameters#

The integrate-slurm command distributes the azimuthal integration of datasets over multiple machines.

This is configured through the [computations distribution] section of the configuration file (see Create a configuration file).

There are three parameters:

[computations distribution]
n_workers = 4
ai_engines_per_worker = 8
cores_per_worker = 4
  • n_workers is the number of SLURM jobs submitted. Each SLURM submission (i.e. worker) is roughly equivalent to half a machine and uses exactly one GPU. Therefore n_workers = 4 roughly corresponds to using two full compute cluster machines.

  • ai_engines_per_worker is the number of pyFAI processes launched for each worker, i.e. the number of processes per GPU.

  • cores_per_worker is the number of CPU threads used by each pyFAI engine (i.e. the number of threads per process). These threads are primarily used for LZ4 decompression of the data files.

The default is n_workers = 4, ai_engines_per_worker = 8, cores_per_worker = 4, which is sensible for most applications. If you need faster processing, the primary parameter to increase is n_workers.
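For reference, here is a minimal Python sketch computing the total resources a given configuration asks for, assuming the configuration file uses standard INI syntax as suggested above (the file name integration.conf is only a placeholder):

import configparser

# Illustrative sketch: total resources requested by a [computations distribution] section.
config = configparser.ConfigParser()
config.read("integration.conf")  # placeholder name for the configuration file

dist = config["computations distribution"]
n_workers = dist.getint("n_workers", fallback=4)
ai_engines_per_worker = dist.getint("ai_engines_per_worker", fallback=8)
cores_per_worker = dist.getint("cores_per_worker", fallback=4)

print("SLURM jobs (one GPU each):", n_workers)
print("pyFAI processes in total :", n_workers * ai_engines_per_worker)
print("CPU threads in total     :", n_workers * ai_engines_per_worker * cores_per_worker)

With the default values this amounts to 4 GPUs, 32 pyFAI processes and 128 CPU threads, spread over roughly two machines.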

How the work is distributed#

At ESRF, the layout of data files is the following:

{basename}/{proposal}/{beamline}/{subfolders}/{sample}/{dataset}/{scan}/{files}

The HDF5 files to consider (the ones provided as the [dataset] location in the configuration file) are HDF5 Bliss scan files, which sit alongside the scanXXXX folders. Each of these scanXXXX folders contains one or many Lima files (usually detectorname_YYYY.h5).

For example:

/data/id15/inhouse4/ihma109/id15/ACL9011001b/ACL9011001b_0037/scan0001/pilatus_????.h5
                                  sample        dataset        scan      files (125 files of 250 images each)
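As a purely illustrative sketch of this layout (not part of the integrate-slurm tool itself), the Bliss scan files of a dataset and the Lima files in its scanXXXX folders can be listed like this, using the dataset from the example above:

from pathlib import Path

# Illustrative sketch: walk the ESRF data layout shown above.
dataset = Path("/data/id15/inhouse4/ihma109/id15/ACL9011001b/ACL9011001b_0037")

# The HDF5 Bliss scan files sit directly in the dataset folder, alongside the scanXXXX folders.
for bliss_file in sorted(dataset.glob("*.h5")):
    print("Bliss scan file:", bliss_file.name)

# Each scanXXXX folder holds one or many Lima files (usually detectorname_YYYY.h5).
for scan_dir in sorted(dataset.glob("scan*")):
    if not scan_dir.is_dir():
        continue
    lima_files = sorted(scan_dir.glob("*.h5"))
    print(scan_dir.name, "->", len(lima_files), "Lima files")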

From the bottom up:

  • Each pyFAI engine will process one or several Lima files. One engine is an OS process, spawning cores_per_worker threads for LZ4 decompression.

  • There are ai_engines_per_worker processes (pyFAI engines) per SLURM job. Each SLURM job will bind to exactly one GPU.

  • There are n_workers SLURM jobs in total.
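As a minimal sketch of this hierarchy (an illustration only, not the actual scheduling code), a list of Lima files could be spread round-robin over the n_workers * ai_engines_per_worker pyFAI engines like this:

from collections import defaultdict

# Illustrative sketch: round-robin Lima files over the pyFAI engines.
n_workers = 4
ai_engines_per_worker = 8

lima_files = [f"pilatus_{i:04d}.h5" for i in range(125)]  # 125 files, as in the example above

assignment = defaultdict(list)  # (worker, engine) -> Lima files handled by that engine
for index, filename in enumerate(lima_files):
    global_engine = index % (n_workers * ai_engines_per_worker)
    worker = global_engine // ai_engines_per_worker  # which SLURM job (one GPU each)
    engine = global_engine % ai_engines_per_worker   # which pyFAI process within that job
    assignment[(worker, engine)].append(filename)

for (worker, engine), files in sorted(assignment.items()):
    print(f"worker {worker}, engine {engine}: {len(files)} file(s)")

Each engine would then decompress its files with cores_per_worker LZ4 threads before integrating the frames with pyFAI.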