A really simple Multiprocessing Python Example

Purpose and introduction

A Python program will not be able to take advantage of more than one core or more than one CPU by default.  One way to get the program to take advantage of multiple cores is through the multiprocessing module.  There are lots of excellent references and tutorials available on the web (links are included at the bottom), but one thing I was not able to find back when I first started using multiprocessing was a detailed look at an extremely simple, but still practical, example.  Sometimes, it is useful when dealing with a new technique to see it in a very simple form, but not so simple as some of the completely contrived examples in the library documentation.

So, here is a look at an extremely simple example using an embarrassingly parallel problem: generating the Mandelbrot set.  The algorithm used is basically a direct adaptation of the one presented in pseudo-code on Wikipedia, grouping the pixels into rows to make them easier to hand off to multiprocessing.  Just to be clear, this is far from the fastest, best, or most elegant way to use Python to calculate the Mandelbrot set.  It does, however, provide a fairly good springboard for using multiprocessing while still doing actual work.

Essentially, this is a straightforward, explained example of applying a function to a list of arguments using multiprocessing and then gathering the results into a list.

A look at a single processor version

Here is the code for a single processor version:

import matplotlib.pyplot as plt
from functools import partial

def mandelbrotCalcRow(yPos, h, w, max_iteration = 1000):
    y0 = yPos * (2/float(h)) - 1 #rescale to -1 to 1
    row = []
    for xPos in range(w):
        x0 = xPos * (3.5/float(w)) - 2.5 #rescale to -2.5 to 1
        iteration, z = 0, 0 + 0j
        c = complex(x0, y0)
        while abs(z) < 2 and iteration < max_iteration:
            z = z**2 + c
            iteration += 1
        row.append(iteration) #record the escape count for this pixel

    return row

def mandelbrotCalcSet(h, w, max_iteration = 1000):
    partialCalcRow = partial(mandelbrotCalcRow, h=h, w=w, max_iteration = max_iteration)
    mandelImg = list(map(partialCalcRow, range(h))) #range and list() work on both Python 2 and 3
    return mandelImg

mandelImg = mandelbrotCalcSet(400, 400, 1000)

The modifications needed to use multiprocessing

Obviously, to use multiprocessing, we need to import it, so towards the top, we add:

import multiprocessing

The mandelbrotCalcRow function can remain unchanged.  The main changes are to the mandelbrotCalcSet function, which now looks like:

def mandelbrotCalcSet(h, w, max_iteration = 1000):
    #make a helper function that better supports pool.map by using only 1 var
    #This is necessary since pool.map passes a single argument per call
    partialCalcRow = partial(mandelbrotCalcRow, h=h, w=w, max_iteration = max_iteration)

    pool = multiprocessing.Pool() #creates a pool of worker processes
    #the pool.map only accepts one iterable, so use the partial function
    #so that we only need to deal with one variable.
    mandelImg = pool.map(partialCalcRow, range(h)) #make our results with a map call
    pool.close() #we are not adding any more tasks
    pool.join() #wait until all worker processes are done before going on

    return mandelImg

Here, Pool creates a pool of worker processes and gets the environment ready to run multiple tasks.  One of the easiest ways to use the pool is through its map method, which takes a function and an iterable of parameters.  The function is called once for each parameter in the iterable, with the calls distributed over the available worker processes, and the results are collected into a list in the same order as the inputs.
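To see the same pattern without the Mandelbrot details, here is a minimal sketch of pool.map applied to a toy function.  The names square and square_all are hypothetical and exist only for this illustration, and the code assumes Python 3.

```python
import multiprocessing


def square(n):
    """Toy worker function; it lives at module level so it can be pickled."""
    return n * n


def square_all(values, processes=2):
    """Apply square to every value using a small pool of worker processes."""
    pool = multiprocessing.Pool(processes=processes)
    try:
        # pool.map blocks until every call has finished and returns
        # the results in the same order as the inputs.
        return pool.map(square, values)
    finally:
        pool.close()
        pool.join()


if __name__ == '__main__':
    print(square_all(range(5)))  # [0, 1, 4, 9, 16]
```

The worker function is defined at module level because the pool has to send a reference to it to the worker processes.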

One significant difference between pool.map and the built-in map, other than the fact pool.map can take advantage of multiple processors, is that pool.map will only take a single iterable of arguments for processing.  That is why I created a partial function which freezes the other arguments.
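As a small sketch of what freezing arguments looks like in isolation, the example below uses a hypothetical three-argument function, scale_and_shift, that is not part of the Mandelbrot code.  It assumes Python 3; the starmap call shown as an alternative is only available on Python 3.3 and later.

```python
import multiprocessing
from functools import partial


def scale_and_shift(x, scale, shift):
    """Hypothetical three-argument function used only for illustration."""
    return x * scale + shift


# Freeze scale and shift; the resulting callable takes just x,
# which is the single-argument shape pool.map expects.
double_plus_one = partial(scale_and_shift, scale=2, shift=1)


if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=2)
    via_partial = pool.map(double_plus_one, range(4))
    # Python 3.3+ only: starmap unpacks each tuple into positional arguments.
    via_starmap = pool.starmap(scale_and_shift, [(x, 2, 1) for x in range(4)])
    pool.close()
    pool.join()
    print(via_partial)  # [1, 3, 5, 7]
    print(via_starmap)  # [1, 3, 5, 7]
```

Both approaches give the same list; partial keeps the call site closest to the built-in map version, which is why it is a reasonable fit for this example.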

pool.close() then informs the pool that no new tasks will be submitted to it.  Either pool.close or pool.terminate needs to be called before pool.join can be called.  pool.join waits for the worker processes to finish and exit before proceeding with the rest of the program.  Since pool.map itself does not return until every result is in, this gives a simple way to collect the results into a single list for use later.
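The role of close and join is easier to see with a non-blocking call.  The following is a sketch, not part of the original program, that uses apply_async with a hypothetical cube function; it assumes Python 3.

```python
import multiprocessing


def cube(n):
    """Toy worker function used only for illustration."""
    return n ** 3


def cube_all_async(values, processes=2):
    """Submit one task per value, then wait for all of them to finish."""
    pool = multiprocessing.Pool(processes=processes)
    # apply_async returns immediately with an AsyncResult handle.
    handles = [pool.apply_async(cube, (v,)) for v in values]
    pool.close()  # no further tasks may be submitted
    pool.join()   # block until the workers have finished and exited
    # Every task is complete by now, so get() returns without waiting.
    return [h.get() for h in handles]


if __name__ == '__main__':
    print(cube_all_async([1, 2, 3]))  # [1, 8, 27]
```

Calling join before close (or terminate) raises an error, which is why the two calls always appear in this order.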

The other significant change is that the main portion, the entry point of the script, needs to be wrapped in an if __name__ == '__main__': conditional on Windows.  This is because the main module needs to be safely importable by a new Python interpreter, which is how each worker process is started there.  Not doing this can result in problems such as a RuntimeError or, in some of the tests I tried, completely locking up the system.  This, and a couple of other caveats, are mentioned in the Programming Guidelines section of the multiprocessing documentation.

So, the entry point now looks like:

if __name__=='__main__':
    mandelImg = mandelbrotCalcSet(400, 400, 1000)

In this example, the multiprocessing version has only 8 additional lines of code (it's 15 lines longer, but 7 of those lines are whitespace or comment lines I added to make it easier to read), but it runs in less than a third of the time.
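To check a speed-up like this on your own machine, a small timing helper is enough.  This is a minimal sketch assuming Python 3; time_call and busy_sum are hypothetical names, and busy_sum is only a stand-in workload.

```python
import time


def time_call(func, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call to func."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def busy_sum(n):
    """Hypothetical stand-in workload: sum of squares below n."""
    return sum(i * i for i in range(n))


if __name__ == '__main__':
    result, seconds = time_call(busy_sum, 100000)
    print('busy_sum took %.4f s' % seconds)
```

Passing each version of mandelbrotCalcSet to time_call with the same arguments, for example time_call(mandelbrotCalcSet, 400, 400, 1000), gives a direct comparison.  The ratio will depend on the number of cores and the size of the image.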

Of course, it is worth remembering the saying that “premature optimization is the root of all evil.”  It is normally smart to get the code working first, and then consider bringing in multiprocessing options.

And the results:

Some related links.

  1. Multiprocessing Docs
  2. The examples in the documentation.
  3. Wiki.cython.org has an example of creating the Mandelbrot set using Cython.  For actually generating the set rather than just making examples for multiprocessing, that version is much better.
  4. SciPy.org has a good discussion of parallel programming with numpy and scipy.


{Edit 10 Jan 13 – Corrected a minor spelling error.}
{Edit 20 May 14 – Corrected typos.}


11 thoughts on “A really simple Multiprocessing Python Example”

  1. I’ve been looking around for simple coding patterns in Python for multiprocessing and the search led me to this blog article. While your example code is certainly simple, and even reflects my interest in Mandelbrot calculations, I’m still looking for something slightly different and wondering if you have seen anything like this.

    I want to create a network daemon that listens for new incoming connections. For each connection that arrives, I want to run a specific handler for that connection (a worker process). This will need to be sure the connection resources (and underlying system socket) are closed in the parent process, and everything else is closed in the child process. In the C language this is a cumbersome chore. I’m hoping the multiprocessing package has a means to handle this.

    Once the child/worker is running, I have no further need to communicate or share with it. The parent should resume listening for more connections. So calling join() would break the purpose, which is to allow many processes for many connections, without the time taken by one to delay any others. Fortunately for my case these are not computational, so I could run many worker processes per CPU core.

    But I would like to keep track of how many are running so the parent can stop listening if the number of workers is at the maximum (for example 100 workers might be the maximum). This would mean the multiprocessing package would be handling the child process exits somehow behind the scenes. The parent would just wait until the number of children gets below 100 then resume the listen loop.

    I can write this in C. But I want to do this in Python so the worker logic can be implemented in Python.

    Any ideas about this? BTW, the worker logic is going to do a lot of DB accesses and output some summary results as a very simple HTML page.

    • I’m afraid I haven’t seen an example like what you are describing, but it does not seem to be that difficult. As you comment, you probably do not want to join the threads as that will cause them to be somewhat dependent on completion of the other joined threads. It works well for tightly related threads on a parallelizable and especially on an embarrassingly parallel algorithm.

      You want to spawn more independent threads that do not need much interaction. If you write the code to handle the connections after they come in as a separate program you could look at the subprocess module to manage them. If you want them to be more conventionally part of your main program you may want to look at threading, which permits more control over independent threads, or even the lower-level thread module instead of using multiprocessing.

      • Thanks for the reply. I’m focusing on multiprocessing, but multithreading may be possible. The latter shares memory and I don’t know what that means in the context of Python. If I did this in C, I’d still prefer multiprocessing as that keeps things separate. It looks as if the Python packages are trying to keep the API semantics alike as much as possible, so that is good.

        Yes, these tasks are completely independent once started. And processes are easier to clean up (done by the OS at exit). But the multiprocessing package seems more focused on all the communication and sharing support that I don’t need.

        I am thinking, if socket passing works well in this, of having a child do the listen loop and pass everything to the parent, which then makes new child workers as needed. But even this runs into a classic programming issue of how to wait for 2 things at the same time when each provides its own function call to wait for one thing.

  2. Thank you very much, this was extremely helpful. All other documentation/examples I have found for multiprocessing were nowhere near as useful as yours.

    Two minor typos in your code: “Import multiprocessing” should be “import multiprocessing” (lower case “i”) and you didn’t tab over “return mandelImg” in your second version of “mandelbrotCalcSet”.

  3. Greetings
    I would appreciate if you could let me know how to deal with my problem when using joblib to do parallel computing. In fact, I tried to run the following code on windows 10 using pycharm (or Spyder on anaconda with python 3.5).


    Therefore, I used the following command in order to do parallel computing on windows:

    if __name__ == '__main__':

    However, it will result in this error:

    NameError: name 'num_cores' is not defined

    Best regards,

  4. something is looking for a python object (variable) named ‘num_cores’. does this only happen in windows 10? sorry i know nothing of joblib or windows. your case is out of my range of knowledge.
