I have seen multiple issues in the past about using IPython in multiple processes, but I'm not sure that this exact issue has been addressed. Perhaps I'm doing something fundamentally wrong, in which case it would also be great to know!
I'm starting a controller on a machine using:
ipcontroller --ip=10.21.21.188 --port=1024
Then connecting a couple of engines on different machines using:
They are operating under an NFS share, so the home directories are shared (hence the engines can find the JSON settings from the default profile). The connections seem to succeed.
I get the following output:
There are 2 engines.
Assertion failed: ok (mailbox.cpp:84)
…followed by this stack trace (after the process has already spat me back to bash):
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
File "test_multiproc.py", line 9, in run
my_client = Client()
File "/usr/lib/python2.7/dist-packages/IPython/parallel/client/client.py", line 387, in __init__
self._connect(sshserver, ssh_kwargs, timeout)
File "/usr/lib/python2.7/dist-packages/IPython/parallel/client/client.py", line 491, in _connect
raise error.TimeoutError("Hub connection request timed out")
TimeoutError: Hub connection request timed out
What am I doing wrong? Note that there are no problems with the connection if I do everything in subprocesses, or everything in the main process (as in, I can execute code on the engines from that script).
Some other details: I'm running Ubuntu 12.04 (precise) amd64, Python 2.7.3, IPython 0.12.1, 0MQ 2.2.0 (also tried 2.1.11).
libzmq contexts/sockets are not fork-safe, so you mustn't use them
across the fork implied by multiprocessing.Process.
What is happening is that the zmq objects created as part of the
initial Client in your parent process (the one you use for
print "There are %d engines." % len(client.ids))
are being passed across the fork and cleaned up by garbage collection
in the subprocess, which crashes. I believe this has been fixed in
pyzmq master, but you can work around it by making sure there are no
zmq objects alive in the parent process when you fork, either by
calling client.close() prior to calling proc.start(), or simply not
creating the Client in the parent in the first place, e.g.:
print "There are %d engines." % len(parallel.Client().ids)
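The close-before-fork workaround can be sketched like this. FakeClient is a hypothetical stand-in for IPython.parallel.Client (invented here so the pattern runs without a cluster); the real Client holds the zmq sockets that must not be alive when the process forks:

```python
import multiprocessing

# Hypothetical stand-in for IPython.parallel.Client: the real Client
# wraps zmq sockets, which must not survive across a fork.
class FakeClient(object):
    def __init__(self):
        self.ids = [0, 1]  # pretend the hub reports two engines

    def close(self):
        # The real client.close() tears down its zmq sockets.
        self.ids = None

def run(queue):
    # Create the client *after* the fork, inside the subprocess,
    # so no zmq state is inherited from the parent.
    client = FakeClient()
    queue.put(len(client.ids))

if __name__ == "__main__":
    client = FakeClient()
    print("There are %d engines." % len(client.ids))
    client.close()  # workaround: no zmq objects alive at fork time

    queue = multiprocessing.Queue()
    proc = multiprocessing.Process(target=run, args=(queue,))
    proc.start()
    proc.join()
    print("subprocess saw %d engines" % queue.get())
```

The same shape applies with the real Client: close (or never create) the parent's client before proc.start(), and build a fresh Client inside each subprocess.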
Yes, that helps. However, is it enough to set the 0MQ context the way I described, or will there be other hiccups we might encounter? At this stage we require the use of Client in the main process as well as in subprocesses (something that we could change in the future).
On Mon, Jul 23, 2012 at 4:51 PM, mgi <[hidden email]> wrote:
> Yes that helps. However, is it enough to set the 0MQ context the way I
> described, or will there be other hiccups we might encounter?
I don't think it is, but if it works for you, that's fine.
> At this stage
> we require the use of Client in the main process as well as subprocesses
> (something that we could change in the future).
If you require using the Client in the main process, I would recommend
that you use pyzmq master, and set your minimum required version to
2.2.0-1 (not yet released), which should hopefully handle this.