
How do I scale my Twisted server to handle tens of thousands of concurrent SSL socket connections?

The first few hundred clients connect relatively quickly, but as the count approaches 3000 the rate crawls to about 2 new connections per second.

I am load testing using the loop below:

import socket
import ssl

connections = 10000  # target number of test clients; value shown here for illustration
clients = []

for i in xrange(connections):
    print i
    # Open a plain TCP socket and wrap it in SSL, verifying the server certificate.
    clients.append(
        ssl.wrap_socket(
            socket.socket(socket.AF_INET, socket.SOCK_STREAM),
            ca_certs="server.crt",
            cert_reqs=ssl.CERT_REQUIRED
        )
    )
    clients[i].connect(('localhost', 9999))

cProfile:

         296644049 function calls (296407530 primitive calls) in 3070.656 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001 3070.656 3070.656 server.py:7(<module>)
        1    0.000    0.000 3070.408 3070.408 server.py:148(main)
        1    0.000    0.000 3070.406 3070.406 server.py:106(run)
        1    0.000    0.000 3070.405 3070.405 base.py:1190(run)
        1    0.047    0.047 3070.404 3070.404 base.py:1195(mainLoop)
    34383    0.090    0.000 3070.263    0.089 epollreactor.py:367(doPoll)
    38696    0.064    0.000 3066.883    0.079 log.py:75(callWithLogger)
    38696    0.077    0.000 3066.797    0.079 log.py:70(callWithContext)
    38696    0.035    0.000 3066.598    0.079 context.py:117(callWithContext)
    38696    0.056    0.000 3066.556    0.079 context.py:61(callWithContext)
    38695    0.093    0.000 3066.486    0.079 posixbase.py:572(_doReadOrWrite)
     8599 1249.585    0.145 3019.333    0.351 protocol.py:114(getClientsDict)
 37582010 1681.445    0.000 1681.445    0.000 {method 'items' of 'dict' objects}
    21496    0.114    0.000 1535.798    0.071 tls.py:346(_flushReceiveBIO)
    21496    0.026    0.000 1535.793    0.071 tcp.py:199(doRead)
    21496    0.017    0.000 1535.718    0.071 tcp.py:218(_dataReceived)
    17197    0.033    0.000 1535.701    0.089 tls.py:400(dataReceived)
     8597    0.009    0.000 1531.480    0.178 policies.py:119(dataReceived)
     8597    0.078    0.000 1531.471    0.178 protocol.py:65(dataReceived)
     4300    0.029    0.000 1525.117    0.355 posixbase.py:242(_disconnectSelectable)
     4300    0.030    0.000 1524.922    0.355 tcp.py:283(connectionLost)
     4300    0.024    0.000 1524.659    0.355 tls.py:463(connectionLost)
     4300    0.010    0.000 1524.492    0.355 policies.py:123(connectionLost)
     4300    0.119    0.000 1524.471    0.355 protocol.py:50(connectionLost)
     4299    0.027    0.000 1523.698    0.354 tcp.py:270(readConnectionLost)
     4299    0.135    0.000 1520.228    0.354 protocol.py:88(handleInitialState)
 74840519   31.487    0.000   44.916    0.000 __init__.py:348(__getattr__)

Reactor run code:

def run(self):
    contextFactory = ssl.DefaultOpenSSLContextFactory(self._key, self._cert)
    reactor.listenSSL(self._port, BrakersFactory(), contextFactory)
    reactor.run()
Christopher Markieta
  • [Related, worth reviewing](http://stackoverflow.com/questions/2350071/maximum-number-of-concurrent-connections-on-a-single-port-socket-of-server) – jedwards Mar 09 '15 at 20:29
  • How about providing some of the code that you're having the slowdown problem with (at least the snippets that deal with incoming connections)? Without background on your Twisted code, I have no idea what to suggest. BTW, your testing code looks like it's running on the same box; if you really want to test up to high port counts, you'll likely need to move the tests to a second machine. – Mike Lutz Mar 09 '15 at 22:13
  • I am getting the same result using a remote client on the LAN. I have pasted the top of the cProfile output. I'm not sure where it is bottle-necking or what optimizations I'm missing, but I will try to supply some of the server code. – Christopher Markieta Mar 09 '15 at 23:28
  • Please attach the code you are having issues with; without the ability to reproduce the performance results it's hard to know what to suggest. – Glyph Mar 10 '15 at 00:11
  • I cannot attach all of the code due to privacy concerns, but I will try to isolate the issue and share my findings. – Christopher Markieta Mar 10 '15 at 01:14

2 Answers


Given the lack of code in the question, I've tossed some together to see if I could reproduce the effect you're talking about. Based on that experiment, the first thing I would say is to check what is happening with memory utilization on your machine while your script runs.

I spun up a standard Google Cloud Compute instance (1 vCPU, 3.8 GB RAM; Debian Wheezy with backports, apt-get update; apt-get install python-twisted) and ran the following (awful hack) code:

(Note: to run this I needed to do ulimit -n 4096 in both the client and server shells, or I would start getting 'Too many open files' errors, i.e. socket accept failing with "Too many open files".)
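
As an aside, if you'd rather raise the descriptor limit from inside the process instead of the shell, a minimal sketch using the standard resource module might look like the following; it assumes the hard limit is already at or above 4096, since an unprivileged process can only raise its soft limit up to the hard limit:

import resource

# Query the current soft/hard limits on open file descriptors.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

# Raise the soft limit toward 4096, but never above the hard limit
# (an unprivileged process cannot exceed its hard limit).
resource.setrlimit(resource.RLIMIT_NOFILE, (min(4096, hard), hard))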

serv.py

#!/usr/bin/python

from twisted.internet import ssl, reactor
from twisted.internet.protocol import ServerFactory, Protocol

class Echo(Protocol):
    def connectionMade(self):
        self.factory.clients.append(self)
        print "Currently %d open connections.\n" % len(self.factory.clients)

    def connectionLost(self, reason):
        self.factory.clients.remove(self)
        print "Lost connection"

    def dataReceived(self, data):
        """As soon as any data is received, write it back."""
        self.transport.write(data)

class MyServerFactory(ServerFactory):
    protocol = Echo

    def __init__(self):
        self.clients = []



if __name__ == '__main__':
    factory = MyServerFactory()
    reactor.listenSSL(8000, factory,
                      ssl.DefaultOpenSSLContextFactory(
                          'keys/server.key', 'keys/server.crt'))
    reactor.run()

cli.py

#!/usr/bin/python

from twisted.internet import ssl, reactor
from twisted.internet.protocol import ClientFactory, Protocol

class EchoClient(Protocol):
    def connectionMade(self):
        print "hello, world"
        # The following delay is there because as soon as the write
        # happens the server will close the connection
        reactor.callLater(60, self.transport.write, "hello, world!")

    def dataReceived(self, data):
        print "Server said:", data
        self.transport.loseConnection()

class EchoClientFactory(ClientFactory):
    protocol = EchoClient

    def __init__(self):
        self.stopping = False

    def clientConnectionFailed(self, connector, reason):
        print "Connection failed - reason ", reason
        if not self.stopping:
            self.stopping = True
            reactor.callLater(10, reactor.stop)

    def clientConnectionLost(self, connector, reason):
        print "Connection lost - goodbye!"
        if not self.stopping:
            self.stopping = True
            reactor.callLater(10, reactor.stop)

if __name__ == '__main__':
    connections = 4000
    factory = EchoClientFactory()
    for i in xrange(connections):
        # The following could certainly be done more elegantly, but I believe
        # it's a legit use, and given the list is finite, it shouldn't be too
        # resource-intensive... ?
        # Schedule the connection attempts at roughly 400 per second.
        reactor.callLater(i / float(400), reactor.connectSSL, 'xx.xx.xx.xx',
                          8000, factory, ssl.ClientContextFactory())
    reactor.run()

Upon running this and crossing roughly 2544 connections, my machine seriously jammed up, enough so that it was hard to collect data from. Given that new SSH sessions were coming back with '/bin/bash: Cannot allocate memory', and that when I did get on, serv.py had 2 GB of resident memory and the client had 1.4 GB, I think it's safe to say that I blew through the RAM.

Given that the above code was just a fast hack, I likely have outstanding bugs that caused the memory problem. Still, I thought I would offer the idea, because pushing your machine into swap is certainly a good way to make your app crawl (and perhaps you have the same bugs as me).
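
If it helps, one quick way to watch for this while the test runs is to have the server log its own resident memory periodically. Here's a minimal sketch (my addition, not part of the original test code) using the standard resource module and Twisted's LoopingCall; note that on Linux ru_maxrss is reported in kilobytes:

import resource

from twisted.internet import task

def report_memory():
    # ru_maxrss is the peak resident set size; on Linux it is in kilobytes.
    print "peak RSS: %d kB" % resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

# Start this before reactor.run(): log memory usage every 5 seconds.
memory_logger = task.LoopingCall(report_memory)
memory_logger.start(5)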

(BTW, for the smarter Twisted people out there, I welcome comments on what I'm doing wrong that's burning so much RAM.)

Mike Lutz
  • I don't think it is a problem with memory as I still have 10 GB free. But thanks for this example. Now I know it's not a limitation of Twisted but something blocking in my own code. – Christopher Markieta Mar 10 '15 at 01:07
  • Cool, glad the code can help! Did you try this code on your (higher ram) machine? If so how many connections could you get it up to? – Mike Lutz Mar 10 '15 at 01:53
  • It gets up to about 23,900 until it starts losing connections. Not sure why. Could it have something to do with the ephemeral port range? – Christopher Markieta Mar 10 '15 at 01:58

I managed to determine the cause of the slowdown in my protocol.

As you can see from the cProfile output above, the majority of the time was spent in the getClientsDict() method and in the dict.items() calls it makes:

         296644049 function calls (296407530 primitive calls) in 3070.656 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     8599 1249.585    0.145 3019.333    0.351 protocol.py:114(getClientsDict)
 37582010 1681.445    0.000 1681.445    0.000 {method 'items' of 'dict' objects}

The following code was causing this issue:

def getClientsDict(self):
    rc = {1: {}, 2: {}}

    for r in self.factory._clients[1]:
        rc[1] = dict(rc[1].items() +
                     {r.getDict[1]['id']: r.getDict[1]['address']}.items())
    for m in self.factory._clients[2]:
        rc[2] = dict(rc[2].items() +
                     {m.getDict[2]['id']: m.getDict[2]['address']}.items())
    return rc
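
The problem is that each pass through the loops converts the accumulating dict back into a list with items(), concatenates it with a one-entry list, and rebuilds the whole dict, so the work grows quadratically with the number of connected clients; that matches the tens of millions of calls to dict.items() in the profile. A rewrite along these lines (a sketch based only on the structure shown above, keeping getDict as-is) builds each dict in a single pass instead:

def getClientsDict(self):
    rc = {1: {}, 2: {}}

    # Insert each client's entry directly rather than rebuilding the
    # whole dict from items() concatenation on every iteration.
    for r in self.factory._clients[1]:
        rc[1][r.getDict[1]['id']] = r.getDict[1]['address']
    for m in self.factory._clients[2]:
        rc[2][m.getDict[2]['id']] = m.getDict[2]['address']
    return rc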
Christopher Markieta