As a quick follow-up to my previous post, here’s a look at the performance of passing messages between two Python processes using the Queue class versus 0mq push/pull connections. As a simple test, we will pass 10 million messages between two processes, first using Queue, then using 0mq.
Multiprocessing test with Queue
import sys
import time
from multiprocessing import Process, Queue

def worker(q):
    for task_nbr in range(10000000):
        message = q.get()
    sys.exit(1)

def main():
    send_q = Queue()
    Process(target=worker, args=(send_q,)).start()
    for num in range(10000000):
        send_q.put("MESSAGE")

if __name__ == "__main__":
    start_time = time.time()
    main()
    end_time = time.time()
    duration = end_time - start_time
    msg_per_sec = 10000000 / duration

    print "Duration: %s" % duration
    print "Messages Per Second: %s" % msg_per_sec
Multiprocessing test with 0mq
import sys
import zmq
from multiprocessing import Process
import time

def worker():
    context = zmq.Context()
    work_receiver = context.socket(zmq.PULL)
    work_receiver.connect("tcp://127.0.0.1:5557")

    for task_nbr in range(10000000):
        message = work_receiver.recv()

    sys.exit(1)

def main():
    Process(target=worker, args=()).start()
    context = zmq.Context()
    ventilator_send = context.socket(zmq.PUSH)
    ventilator_send.bind("tcp://127.0.0.1:5557")
    for num in range(10000000):
        ventilator_send.send("MESSAGE")

if __name__ == "__main__":
    start_time = time.time()
    main()
    end_time = time.time()
    duration = end_time - start_time
    msg_per_sec = 10000000 / duration

    print "Duration: %s" % duration
    print "Messages Per Second: %s" % msg_per_sec
Queue Results
python2 ./multiproc_with_queue.py
Duration: 164.182257891
Messages Per Second: 60907.9210414
0mq Results
python2 ./multiproc_with_zeromq.py
Duration: 23.3490710258
Messages Per Second: 428282.563744
The numbers speak for themselves: in this test, 0mq moved roughly seven times as many messages per second as Queue.
To be fair you should use a serializer on the zmq side as well. Python multiprocessing.Queues pickle all objects automatically.
I would say firstly that the test is already fair. I was testing message passing, and not python object passing. However, in the case that one might wish to pass serialized objects, pyzmq using send_pyobj() and recv_pyobj() still outperforms Queue:
Another issue is load balancing when there are multiple workers. If you use zmq like this then the messages get allocated to workers in a round robin fashion. Since some tasks may take longer than others, workers may be left idle while one still has lots of tasks left to do.
I also find that if I start adding tasks to the queue without first waiting for all the workers to connect then all tasks can get allocated to the first worker which manages to connect! (I guess that is why you need time.sleep(1) at line 10 of your ventilator example.)
multiprocessing.Queue does not have these issues because tasks are not allocated till a worker requests a task. (And if all you want is to send messages between two processes, and you don’t mind that sending might block, then you can use multiprocessing.Pipe which is faster.)
Absolutely true. If you have tasks that take varying amounts of time to perform, then slow workers will seriously gum up the works. The task sink pattern can be good for a narrow range of use cases, but for many cases zmq REP and REQ sockets are much more appropriate. Workers request a task, execute the task, and request their next task when they’re ready.
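The load-balancing point can be illustrated without any sockets at all. The following is a minimal sketch, written for Python 3, that compares the total finish time when tasks are handed out in fixed round-robin order against the finish time when each task goes to whichever worker becomes free first. The function names and the task durations are made up for illustration, and the model ignores network latency and buffering, so it only shows the shape of the problem rather than how 0mq or Queue behave in detail.

```python
import heapq


def round_robin_makespan(durations, n_workers):
    """Finish time when task i is pre-assigned to worker i % n_workers."""
    busy = [0] * n_workers
    for i, duration in enumerate(durations):
        busy[i % n_workers] += duration
    return max(busy)


def on_request_makespan(durations, n_workers):
    """Finish time when each task goes to whichever worker frees up first."""
    free_at = [0] * n_workers  # time at which each worker is next idle
    heapq.heapify(free_at)
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        heapq.heappush(free_at, start + duration)
    return max(free_at)


if __name__ == "__main__":
    # Hypothetical workload: two slow tasks among several fast ones.
    durations = [10, 1, 1, 1, 10, 1, 1, 1]
    print(round_robin_makespan(durations, 4))  # 20
    print(on_request_makespan(durations, 4))   # 11
```

With this workload, round robin sends both slow tasks to the same worker while the other three sit idle, which is the situation described in the comment above.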
I downloaded the sample code and repeated the benchmark with a more complex message:
doc = {
    'something': "More",
    'another': "thing",
    'what?': range(200),
    'ok': ['asdf', 'asdf', 'asdf']
}
The results were very different (this is 1M iterations):
python queue:
Duration: 18.5572431087
Messages Per Second: 53887.3147342
zmq (simplejson encode):
Duration: 88.1423079967
Messages Per Second: 11345.2894839
zmq with send_pyobj/recv_pyobj:
Duration: 49.649574995
Messages Per Second: 20141.1593171
The performance is much worse than python Queue with a more realistic message.
I used python 2.6.6, and zmq 2.2.0
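With a message like this, a good part of the time may go into serialization rather than transport. One way to check is to time the encode/decode step on its own. The sketch below is written for Python 3 and uses the standard library `json` module as a stand-in for simplejson; the helper names are hypothetical, and `range(200)` is wrapped in `list()` because a Python 3 `range` object is not JSON serializable. It does not reproduce the benchmark above, it only isolates the serialization cost.

```python
import json
import pickle
import timeit

doc = {
    'something': "More",
    'another': "thing",
    'what?': list(range(200)),
    'ok': ['asdf', 'asdf', 'asdf'],
}


def pickle_round_trip(obj):
    """Serialize and deserialize with pickle, as Queue does for each item."""
    return pickle.loads(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


def json_round_trip(obj):
    """Serialize and deserialize with the stdlib json module."""
    return json.loads(json.dumps(obj))


def time_round_trips(obj, repeats=10000):
    """Return seconds spent on `repeats` round trips for each serializer."""
    return {
        "pickle": timeit.timeit(lambda: pickle_round_trip(obj), number=repeats),
        "json": timeit.timeit(lambda: json_round_trip(obj), number=repeats),
    }


if __name__ == "__main__":
    for name, seconds in time_round_trips(doc).items():
        print("%s: %.3f s for 10000 round trips" % (name, seconds))
```

Comparing these figures with the per-message time from the full benchmark gives a rough idea of how much of the cost comes from the serializer and how much from the transport.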
Nice. It’s been a while since I’ve touched my blog – I’ve been doing quite a bit with zeromq since I first wrote these articles and both I and zeromq have grown quite a bit. Apologies for posting your comment so late, I’ve neglected my blog for far too long!
Brian
Use multiprocessing Pipes, then you’ll see Python results are 3X faster.
Queue is a high-level abstraction written on top of pipes.
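For anyone who wants to try this, here is a minimal sketch of the same one-way message test using `multiprocessing.Pipe`, written for Python 3. The function names and message count are illustrative, and no speedup figure is implied; measure on your own machine. As an assumption made for portability of the sketch, it prefers the "fork" start method where the platform offers it.

```python
import multiprocessing as mp
import time

# Assumption: use "fork" where available so the sketch also runs when the
# code is not importable as a module; otherwise use the platform default.
if "fork" in mp.get_all_start_methods():
    ctx = mp.get_context("fork")
else:
    ctx = mp.get_context()


def worker(conn, count):
    """Receive `count` messages, then acknowledge so the sender can stop timing."""
    for _ in range(count):
        conn.recv()
    conn.send("done")
    conn.close()


def pipe_throughput(count=100000):
    """Send `count` messages over a Pipe and return messages per second."""
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=worker, args=(child_conn, count))
    proc.start()

    start = time.perf_counter()
    for _ in range(count):
        parent_conn.send("MESSAGE")
    parent_conn.recv()  # wait until the worker has read everything
    elapsed = time.perf_counter() - start

    proc.join()
    return count / elapsed


if __name__ == "__main__":
    print("Messages Per Second: %s" % pipe_throughput())
```

Unlike the Queue test in the post, this version waits for the worker's acknowledgement before stopping the clock, so the figure covers delivery as well as sending.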