Python Multiprocessing Module and Closures

At work, I wrote a Python script which uses the multiprocessing module to process many servers in parallel. The code looks something like:

import multiprocessing

def processServer(server):
    # Do work...
    pass

numParallelTasks = ...
servers = [...]
pool = multiprocessing.Pool(processes=numParallelTasks)
results = pool.map(processServer, servers)

I wanted to pass some extra state to processServer without using a global variable. My first attempt was to use a closure, so I wrote the following:

def processServer(extraState):
    def processServerWorker(server):
        # Do work, using extraState as needed
        pass
    return processServerWorker

numParallelTasks = ...
servers = [...]
extraState = ...
pool = multiprocessing.Pool(processes=numParallelTasks)
results = pool.map(processServer(extraState), servers)

This failed with the following error:

Exception in thread Thread-2:
Traceback (most recent call last):
  File "C:\Program Files\Python 2.7.1\lib\threading.py", line 530, in __bootstrap_inner
    self.run()
  File "C:\Program Files\Python 2.7.1\lib\threading.py", line 483, in run
    self.__target(*self.__args, **self.__kwargs)
  File "C:\Program Files\Python 2.7.1\lib\multiprocessing\pool.py", line 285, in _handle_tasks
    put(task)
PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed
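The traceback ends in put(task), which is where the pool pickles the function and its argument to send them to a worker process. pickle stores a function as a reference to its module-level name, and a function defined inside another function has no such name to look up. The following is a minimal sketch that reproduces the problem without a pool. The helper canPickle and the sample values are made up for illustration, and the exact exception type varies between Python versions, so the helper catches several:

```python
import pickle


def processServer(extraState):
    # Closure version: processServerWorker is defined inside another
    # function, so pickle cannot find it by a module-level name.
    def processServerWorker(server):
        return (server, extraState)
    return processServerWorker


def canPickle(obj):
    # Hypothetical helper: True if pickle.dumps succeeds, False otherwise.
    # Python 2 raises PicklingError here; some Python 3 versions raise
    # AttributeError instead, so several exception types are caught.
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    worker = processServer("some state")
    print(worker("server1"))   # calling the closure directly works
    print(canPickle(worker))   # pickling it does not
```

So the closure itself is fine; it only breaks once it has to cross a process boundary.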

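The solution was to change processServer from a closure into a class with a __call__ method. An instance of a class defined at module level can be pickled, because pickle records the class by name and carries the extra state along as an instance attribute: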

class processServer:
    def __init__(self, extraState):
        self.extraState = extraState

    def __call__(self, server):
        # Do work, using self.extraState as needed
        pass
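For completeness, here is a small end-to-end sketch of the class-based version running under a pool. The server names, the string formatting used as the "work", the run helper, and the default of two processes are all placeholders rather than what my real script does:

```python
import multiprocessing


class processServer(object):
    # Callable class: the extra state lives in an instance attribute,
    # so the instance can be pickled and sent to worker processes.
    def __init__(self, extraState):
        self.extraState = extraState

    def __call__(self, server):
        # Placeholder work: combine the server name with the extra state.
        return "%s:%s" % (server, self.extraState)


def run(servers, extraState, numParallelTasks=2):
    # Hypothetical helper that creates the pool, maps, and cleans up.
    pool = multiprocessing.Pool(processes=numParallelTasks)
    try:
        return pool.map(processServer(extraState), servers)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    # The guard matters on Windows, where each worker process
    # re-imports this module when it starts.
    print(run(["web01", "web02", "db01"], "checked"))
```

pool.map returns results in the same order as the input list, so each result lines up with its server.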