Parallelize Tree Creation With Dask
I need help with a problem that I'm pretty sure Dask can solve, but I don't know how to tackle it. I need to construct a tree recursively. For each node, if a criterion is met, a co
Solution 1:
As I understand it, the way to tackle recursion (or any dynamic computation) in Dask is to launch tasks from within a task.
I was experimenting with something similar, so below is my five-minute illustrative solution. You'd have to optimise it according to the characteristics of your algorithm.
Keep in mind that every task adds scheduling overhead, so you'd want to chunk the computation (do more work per task) for optimal results.
Relevant API reference:
- https://distributed.dask.org/en/latest/api.html#distributed.worker_client
- https://distributed.dask.org/en/latest/api.html#distributed.Client.gather
- https://distributed.dask.org/en/latest/api.html#distributed.Client.submit
import numpy as np
import time
from dask.distributed import Client, worker_client

# Create a Dask client.
# For convenience, this starts a local cluster.
client = Client(threads_per_worker=1, n_workers=8)
client  # in a notebook, this displays the cluster summary

class Node:
    def __init__(self, level):
        self.level = level
        self.val = None
        self.childs = None  # filled in by merge() when the children are kept

def merge(node, childs):
    values = [child.val for child in childs]
    # Collapse the children into this node only if every child has a value
    # and the values sum to less than 0.1; otherwise keep the children.
    if all(v is not None for v in values) and sum(values) < 0.1:
        node.val = np.mean(values)
    else:
        node.childs = childs
    return node

def compute_val():
    time.sleep(0.1)  # stand-in for expensive work; not required
    return np.random.rand(1)

def build(node):
    print(node.level)
    if (np.random.rand(1) < 0.1 and node.level > 1) or node.level > 5:
        node.val = compute_val()
    else:
        # worker_client() lets a running task submit more tasks to the cluster.
        with worker_client() as client:
            # pure=False: build() is random, so identical-looking calls
            # must not be deduplicated into a single result.
            child_futures = [
                client.submit(build, Node(level=node.level + 1), pure=False)
                for _ in range(2)
            ]
            childs = client.gather(child_futures)
        node = merge(node, childs)
    return node

tree_future = client.submit(build, Node(level=0))
tree = tree_future.result()
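To illustrate the earlier point about chunking, here is a rough sketch of one way it could look: only nodes near the root spawn Dask tasks, and each deeper subtree is built sequentially inside a single task. This is not part of the solution above. `CUTOFF_LEVEL`, `build_inline`, `build_chunked`, `depth` and the per-node seeding scheme are hypothetical names and choices made up for the example. It also simplifies things: the `merge` step is left out, nodes above the cutoff always split, and the standard-library `random` module replaces NumPy.

```python
import random

CUTOFF_LEVEL = 3  # hypothetical: nodes above this level spawn tasks, deeper ones run inline
MAX_LEVEL = 5     # same depth limit as in build() above


class Node:
    def __init__(self, level):
        self.level = level
        self.val = None
        self.childs = None


def build_inline(node, rng):
    """Build the whole subtree under `node` sequentially, without Dask tasks."""
    if (rng.random() < 0.1 and node.level > 1) or node.level > MAX_LEVEL:
        node.val = rng.random()
    else:
        node.childs = [build_inline(Node(node.level + 1), rng) for _ in range(2)]
    return node


def build_chunked(node, seed=0):
    """Spawn child tasks near the root; finish deep subtrees inside one task."""
    if node.level >= CUTOFF_LEVEL:
        return build_inline(node, random.Random(seed))
    # Imported here so the inline helper stays usable without a cluster.
    from dask.distributed import worker_client
    with worker_client() as client:
        futures = [
            # Each child gets its own seed (assumed scheme) and pure=False,
            # because the build is random and must not be deduplicated.
            client.submit(build_chunked, Node(node.level + 1),
                          seed=2 * seed + i + 1, pure=False)
            for i in range(2)
        ]
        node.childs = client.gather(futures)
    return node


def depth(node):
    """Return the deepest level reached in the subtree under `node`."""
    if node.childs is None:
        return node.level
    return max(depth(child) for child in node.childs)
```

It would be submitted the same way as before, e.g. `client.submit(build_chunked, Node(0), pure=False).result()`. With a cutoff of 3 that comes to 15 tasks instead of one task per node (up to 127 at this depth limit). The right cutoff depends on how expensive each node is compared with the per-task overhead.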