Using dask.distributed for distributing tomography reconstruction computations at ESRF
A tomography scan acquires a collection of images, called radios.
In parallel-beam geometry, line k of each radio is used to build sinogram k.
Radios are chunked so that each "compute worker" handles a series of sinograms.
Worker k handles chunk k.
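The chunking described above can be sketched in a few lines of plain Python. This is only an illustration: the function names `split_sinograms` and `build_sinogram` are hypothetical, and the even split of detector rows is an assumption (a real split would more likely be driven by the memory available on each worker).

```python
def split_sinograms(n_rows, n_workers):
    """Split detector rows 0..n_rows-1 into contiguous (start, stop) chunks.

    Assumption: rows are shared as evenly as possible; chunk k goes to worker k.
    """
    base, extra = divmod(n_rows, n_workers)
    chunks = []
    start = 0
    for k in range(n_workers):
        # The first `extra` workers take one additional row.
        size = base + (1 if k < extra else 0)
        chunks.append((start, start + size))
        start += size
    return chunks


def build_sinogram(radios, k):
    """Sinogram k stacks line k of every radio (one row per projection angle)."""
    return [radio[k] for radio in radios]
```

For example, `split_sinograms(2048, 4)` gives four chunks of 512 rows each, and worker k would then build the sinograms of its own chunk only.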
dask.distributed
Project management reasons:
dask.distributed
Technical reasons:
distributed: resilience, smart scheduling, dashboard, Actors/RPC, ...
dask_jobqueue offers an abstraction on top of SLURM/OAR/SGE... ESRF uses two batch schedulers!
We do not use dask itself, but dask.distributed and dask_jobqueue.
The new tomography processing software can do the computations in two modes:
dask.distributed
In both cases, the process is the same. A "master process" will
Benefits
(submit() returns a future) and callbacks
Limitations
submit() (workaround next slide)
SpecCluster instead of LocalCluster. But no dask_jobqueue equivalent (?)
(client, cluster) couples, and one "meta-client"?
In the context of (off-line) tomography reconstruction, the critical resource is GPU memory (more than computing power).
The standard way of using distributed is to submit stateless jobs. However, for some reason CUDA contexts are not properly handled between Nanny and Worker:
push() and pop() the CUDA contexts
Solutions:
Actors
How to have a "full-featured" Actor:
from distributed import get_worker

# Create one stateful actor per worker; fut.result() returns the Actor proxy.
for worker_name, worker_conf in self.workers.items():
    fut = self.client.submit(
        WorkerProcess,
        ...
        actor=True
    )
    self.actors[worker_name] = fut.result()

# Submit ordinary tasks, so that submit() still returns a regular future.
for task in self.tasks:
    f = self.client.submit(
        actor_process_chunk,
        *task
    )
    self.futures.append(f)

def actor_process_chunk(sub_region, chunk_id):
    # Runs on a worker: look up the actor instance living on this worker.
    myself = get_worker()
    my_actors = list(myself.actors.keys())
    chunk_processor = myself.actors[my_actors[0]]
    chunk_processor.process_chunk(sub_region=sub_region)
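The pattern above can be tried end to end on a single machine. The sketch below is not the ESRF code: `ChunkProcessor` is a hypothetical stand-in for `WorkerProcess`, the counter stands in for state such as a GPU context, and the one-worker, thread-based `LocalCluster` is an assumption made to keep the example small. The dask imports are done inside the functions, so the class itself can be used without a cluster.

```python
class ChunkProcessor:
    """Hypothetical stand-in for WorkerProcess: keeps state between calls."""

    def __init__(self, name):
        self.name = name
        self.n_processed = 0  # state that survives across tasks

    def process_chunk(self, sub_region):
        self.n_processed += 1
        return (self.name, sub_region, self.n_processed)


def process_on_local_actor(sub_region):
    """Ordinary task: find the actor living on this worker and call it directly."""
    from distributed import get_worker  # only meaningful inside a dask worker

    worker = get_worker()
    # Assumption: exactly one actor per worker. Depending on the distributed
    # release, this mapping may be exposed as worker.state.actors instead.
    processor = next(iter(worker.actors.values()))
    return processor.process_chunk(sub_region)


def run(sub_regions):
    """Create one actor, then drive it through regular submit() futures."""
    from distributed import Client, LocalCluster

    cluster = LocalCluster(n_workers=1, threads_per_worker=1, processes=False)
    client = Client(cluster)
    try:
        # actor=True keeps the instance alive on the worker; result() is a proxy.
        actor = client.submit(ChunkProcessor, "w0", actor=True).result()
        # pure=False prevents identical calls from being deduplicated.
        futures = [
            client.submit(process_on_local_actor, region, pure=False)
            for region in sub_regions
        ]
        return client.gather(futures)
    finally:
        client.close()
        cluster.close()
```

Calling `run([(0, 512), (512, 1024)])` should return one tuple per sub-region, with the counter increasing from one call to the next. That increase shows the actor keeps its state while the caller still works with regular futures.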