
Numerical Simulations With Multiprocessing Much Slower Than Hoped: Am I Doing Anything Wrong? Can I Speed It Up?

I am running a set of numerical simulations. I need to run some sensitivity analyses on the results, i.e. calculate and show how much certain outputs change as certain inputs vary.

Solution 1:

The time difference comes from starting the multiple processes. Starting each process takes a noticeable amount of time. In actual processing time you are doing much better than the non-parallel version, but part of the trade-off with multiprocessing is accepting the time it takes to start each process.

In this case, your example functions are relatively fast, so you don't see a time gain on a small number of records. For more intensive operations on each record, you would see much more significant gains from parallelizing.

Keep in mind that parallelization is both costly and time-consuming due to the overhead of the subprocesses that your operating system needs to create. Compared to running two or more tasks linearly, parallelizing them may save between 25 and 30 percent of time per subprocess, depending on your use case. For example, two tasks that consume 5 seconds each need 10 seconds in total if executed in series, and may need about 8 seconds on average on a multi-core machine when parallelized: 3 of those 8 seconds may be lost to overhead, limiting your speed improvement.

From this article.

Edited:

When using a Pool(), you have a few options to assign tasks to the pool.

multiprocessing.Pool.apply_async (docs) is used to submit a single task, and avoids blocking while waiting for that task to complete.

multiprocessing.Pool.map_async (docs) will split an iterable into chunks of chunksize and submit each chunk to the pool to be completed.

In your case, it will depend on your real scenario, but the two aren't interchangeable based on timing; rather, on what function you need to run. I'm not going to say for sure which one you need, since you used a fake example. I'm guessing you could use apply_async if each call to the function is self-contained, and map_async if the function can run in parallel over an iterable.
