Parallelization in Python: an example with joblib
It can be ridiculously easy to parallelize code in Python. Check out the following simple example:
    import time
    from joblib import Parallel, delayed

    # A function that can be called to do work:
    def work(arg):
        print("Function receives the arguments as a list:", arg)
        # Split the list to individual variables:
        i, j = arg
        # All this work function does is wait 1 second...
        time.sleep(1)
        # ... and prints a string containing the inputs:
        print("%s_%s" % (i, j))
        return "%s_%s" % (i, j)

    # List of arguments to pass to work():
    arg_instances = [(1, 1), (1, 2), (1, 3), (1, 4)]

    # Anything returned by work() can be stored.
    # n_jobs=1 runs the jobs one after another; change it to 4 to run them in parallel.
    results = Parallel(n_jobs=1, verbose=1, backend="threading")(map(delayed(work), arg_instances))
    print(results)
Output:
    Function receives the arguments as a list: (1, 1)
    1_1
    Function receives the arguments as a list: (1, 2)
    1_2
    Function receives the arguments as a list: (1, 3)
    1_3
    Function receives the arguments as a list: (1, 4)
    1_4
    ['1_1', '1_2', '1_3', '1_4']
    [Parallel(n_jobs=1)]: Done 4 out of 4 | elapsed: 3.9s finished
As you can see, this simple program processed all four of our argument instances sequentially, because for this run n_jobs was set to 1, i.e. we told joblib to use a single worker, so the jobs run in series. The total time to run is reported as approximately 4 sec (it is actually less than 4 sec, but we won’t concern ourselves with this here!).
Now we run it again, this time in parallel with n_jobs=4:
    Function receives the arguments as a list:Function receives the arguments as a list:
    Function receives the arguments as a list:
    Function receives the arguments as a list:
    (1, 1)(1, 2)
    (1, 4)
    (1, 3)
    1_1
    1_2
    1_41_3
    ['1_1', '1_2', '1_3', '1_4']
    [Parallel(n_jobs=4)]: Done 4 out of 4 | elapsed: 0.9s finished
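As you can see, the internal print calls from all four jobs now write to the screen more or less simultaneously, so their output is interleaved and no longer in the original order. Whichever thread finishes first gets printed first! The returned results list, however, is still in the order of the inputs. The time to finish is now around 1/4 of the original time, as expected, because the four one-second waits overlap.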
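To make the comparison easy to reproduce, here is a minimal sketch that times both settings in one script. It assumes Python 3 and an installed joblib. The helper name timed_run is made up for this illustration, and the wait is shortened to 0.2 seconds so the script finishes quickly. The work function returns its string instead of printing it, so nothing is written from inside the threads.

```python
import time
from joblib import Parallel, delayed


def work(arg):
    """Wait briefly, then return a label built from the two inputs."""
    i, j = arg
    time.sleep(0.2)  # stand-in for real work (assumption: shorter than the 1 s above)
    return "%s_%s" % (i, j)


def timed_run(n_jobs, arg_instances):
    """Hypothetical helper: run work() over all arguments, return (results, seconds)."""
    start = time.perf_counter()
    results = Parallel(n_jobs=n_jobs, backend="threading")(
        delayed(work)(arg) for arg in arg_instances
    )
    return results, time.perf_counter() - start


arg_instances = [(1, 1), (1, 2), (1, 3), (1, 4)]

serial_results, serial_time = timed_run(1, arg_instances)
parallel_results, parallel_time = timed_run(4, arg_instances)

# All printing happens here in the main thread, so lines cannot interleave.
print(serial_results, "%.1fs" % serial_time)
print(parallel_results, "%.1fs" % parallel_time)
```

Both runs should return the same list in input order, and only the elapsed time should differ. Keep in mind that the speedup seen here comes from waiting: time.sleep lets the other threads run, whereas CPU-heavy pure-Python work would typically not speed up this way with the threading backend.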