I need to write an app that can receive, process, and then send events at ~15k Events Per Second (EPS). I've been learning Twisted and have been using it to run some benchmarks:

Twisted RX only = ~90K EPS
Twisted RX and TX = ~45K EPS (basically half of RX only)
Twisted RX, processing, and TX = ~6K EPS (close, but not 15K EPS)

The processing portion is mostly a single regex match, which is a CPU-bound task. I tried using threads.deferToThread and callbacks but, as expected, that didn't improve the CPU-bound processing.

My server has 256 cores and I'd love to be able to put them to use while using twisted. Can I wrap multiprocessing in with twisted? Each process needs to share a dict, so I'd have to use multiprocessing.Manager().

Can multiprocessing be done within Twisted? Is there a faster way to run CPU-heavy tasks (regular expressions) in parallel within Twisted?

Jared
  • See this question: http://stackoverflow.com/questions/5715217/mix-python-twisted-with-multiprocessing. The accepted answer is from the founder of Twisted, so he's a fairly authoritative source. – dano May 09 '14 at 00:26
  • related: "[Powerhose](http://powerhose.readthedocs.org/en/latest/) turns your CPU-bound tasks into I/O-bound tasks so your Python applications are easier to scale." (I like the quote: it applies to twisted too) – jfs May 09 '14 at 02:03
  • [`regex` module (drop in `re` replacement)](https://pypi.python.org/pypi/regex/) can release GIL. You could try it with `deferToThread` and see whether it improves things. – jfs May 09 '14 at 02:05
  • @J.F.Sebastian - good idea with the `regex` replacement - my results show that it is actually slower than standard `re`: 3484 EPS using `regex` and concurrent=True, 6414 EPS using `re`. Thanks for the suggestion though, I'll play with it a little more to see if I can squeeze any more efficiency out of my regular expression. – Jared May 09 '14 at 14:05

1 Answer

Like the commenter, I would point at Glyph's answer for the multiprocessing question.

With that you could spawn off a fleet of blocking regex-matching processes and communicate with them over childFDs, using the IProcessProtocol.childDataReceived and IProcessTransport.writeToChild methods.
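The worker side of such a fleet can be plain blocking Python with no Twisted in it. Here is a minimal sketch, assuming a line-oriented protocol that I made up for illustration (one event per line in, `1` or `0` per line out). `PATTERN`, `process_stream`, and `main` are hypothetical names, and the pattern is a placeholder for your real regex.

```python
import re
import sys

# Hypothetical placeholder pattern; substitute the real event regex.
PATTERN = re.compile(r"ERROR\s+\d+")


def process_stream(infile, outfile, pattern=PATTERN):
    """Read one event per line and write "1" (match) or "0" (no match)."""
    for line in infile:
        matched = pattern.search(line) is not None
        outfile.write("1\n" if matched else "0\n")
        # Flush after every result so the parent is not left waiting on
        # output stuck in this process's buffer.
        outfile.flush()


def main():
    # Blocks on stdin until the parent closes the pipe, then returns.
    process_stream(sys.stdin, sys.stdout)
```

In the actual worker script you would call `main()` under the usual `if __name__ == "__main__":` guard. Blocking here is fine because each worker is its own process, so only that worker waits.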

This would let your Twisted reactor continue to run at full speed and should get you a lot closer to your non-processing numbers, minus the CPU time for managing the extra file descriptors. That overhead should be tiny compared to letting the regex block the reactor.

Mike Lutz