I'm building a high-performance web framework using Bjoern as the WSGI server.

I'm now wondering: if you need to handle, say, 200,000 req/s, how would you spread Bjoern across multiple servers, i.e. load-balance it? What would be your preferred approach?

Is there a helper or built-in in Bjoern for that? Or should I use a separate load balancer written in Python?

For example, take the following simple WSGI server:

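import bjoern

# Map each known path to its response body (bytes, as WSGI requires on Python 3).
urls = {'/': b"hello world", '/greet': b"hi user!"}

def application(environ, start_response):
    # Look up the body for the requested path; unknown paths get a 404.
    response_body = urls.get(environ['PATH_INFO'])
    if response_body is None:
        status, response_body = '404 Not Found', b"not found"
    else:
        status = '200 OK'

    response_headers = [('Content-Type', 'text/plain'),
                        ('Content-Length', str(len(response_body)))]
    start_response(status, response_headers)

    return [response_body]

# Bind the socket, then enter bjoern's event loop.
bjoern.listen(application, "localhost", 8000)
bjoern.run()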

To scale it across multiple processors, it can be modified as follows:

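import bjoern, os, signal, sys

# We use two workers in this case, meaning 2 processors.
NUM_WORKERS = 2
worker_pids = []

# Map each known path to its response body (bytes, as WSGI requires on Python 3).
urls = {'/': b"hello world", '/greet': b"hi user!"}

def application(environ, start_response):
    # Look up the body for the requested path; unknown paths get a 404.
    response_body = urls.get(environ['PATH_INFO'])
    if response_body is None:
        status, response_body = '404 Not Found', b"not found"
    else:
        status = '200 OK'

    response_headers = [('Content-Type', 'text/plain'),
                        ('Content-Length', str(len(response_body)))]
    start_response(status, response_headers)

    return [response_body]

# Bind the listening socket once; the forked workers inherit it.
bjoern.listen(application, "localhost", 8000)
for _ in range(NUM_WORKERS):
    pid = os.fork()
    if pid > 0:
        # Parent: remember the child so it can be signalled later.
        worker_pids.append(pid)
    elif pid == 0:
        # Child: serve requests until interrupted, then exit.
        try:
            bjoern.run()
        except KeyboardInterrupt:
            pass
        sys.exit()

try:
    # Parent: block until every worker has exited.
    for _ in range(NUM_WORKERS):
        os.wait()
except KeyboardInterrupt:
    # Forward Ctrl-C to the workers so they shut down too.
    for pid in worker_pids:
        os.kill(pid, signal.SIGINT)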

But what if I want to scale it to multiple servers instead (thus using their resources as well)?

Employing other web servers such as Nginx, Lighttpd or Monkey-http seems like overkill, especially since the project's philosophy is to keep everything compact and free of unnecessary fluff.
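
For illustration, the "separate load balancer in Python" option could look roughly like the sketch below: a plain TCP proxy built on the standard library's asyncio that hands each incoming connection to the next backend in turn. The backend addresses, the listening port 8080, and the helper names (round_robin, pipe, make_handler, main) are all assumptions made up for this example, not anything Bjoern provides, and nothing here has been measured at anything close to 200,000 req/s.

```python
import asyncio
import itertools

# Hypothetical backend addresses: each would be a machine running bjoern.
BACKENDS = [("10.0.0.1", 8000), ("10.0.0.2", 8000)]


def round_robin(backends):
    """Return a function that gives the next backend on every call."""
    rotation = itertools.cycle(backends)
    return lambda: next(rotation)


async def pipe(reader, writer):
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while True:
            data = await reader.read(65536)
            if not data:
                break
            writer.write(data)
            await writer.drain()
    except ConnectionError:
        pass  # the peer went away mid-transfer; just stop relaying
    finally:
        writer.close()


def make_handler(pick_backend):
    """Build a connection handler that forwards each client to one backend."""
    async def handle(client_reader, client_writer):
        host, port = pick_backend()
        try:
            backend_reader, backend_writer = await asyncio.open_connection(host, port)
        except OSError:
            client_writer.close()  # backend unreachable: drop the client
            return
        # Relay both directions concurrently until either side closes.
        await asyncio.gather(
            pipe(client_reader, backend_writer),
            pipe(backend_reader, client_writer),
        )
    return handle


async def main():
    """Listen on port 8080 (an arbitrary choice) and proxy forever."""
    handler = make_handler(round_robin(BACKENDS))
    server = await asyncio.start_server(handler, "0.0.0.0", 8080)
    async with server:
        await server.serve_forever()
```

Calling asyncio.run(main()) would start the proxy. Every byte passes through this one process in both directions, so it can become a bottleneck itself, and it does no health checking: an unreachable backend simply costs that client its connection.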

Bashar
  • Python Director is more than 10 years old. Don't think it's actively maintained. – Daniel Aug 17 '14 at 08:14
  • @Daniel You're right, I saw that now (**Copyright (c) 2002-2003**). Removing it from the question then. – Bashar Aug 17 '14 at 08:17
  • First, find out what your bottleneck is. With a software solution, every request produces twice the traffic: into the load balancer, on to the server, and the same way back. Do you have two separate network connections? Maybe a simple round robin DNS solution is possible. – Daniel Aug 17 '14 at 08:34
  • @Daniel The question assumes that every bottleneck is properly ironed out :) Yes, potentially every server has its own IP, making it suitable for *round robin DNS* as you say. I'm reading this article in that regard: [Load Balancing With Round Robin DNS](http://www.atrixnet.com/load-balancing-with-round-robin-dns/). – Bashar Aug 17 '14 at 09:11
  • I can see that *round robin DNS* has a caching problem, so a disproportionate share of the load can end up on any one of the servers. – Bashar Aug 17 '14 at 15:44
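
The caching concern in the last comment can be made concrete with a toy calculation. The sketch below assumes a simplified model in which each caching resolver asks the authoritative server once, receives a single address in rotation, and hands that same address to all of its clients until the TTL expires. The server addresses and the number of clients behind each resolver are invented for illustration.

```python
from collections import Counter
from itertools import cycle

# Hypothetical server addresses (documentation range) and resolver sizes.
SERVERS = ["192.0.2.1", "192.0.2.2"]
CLIENTS_PER_RESOLVER = [1000, 50, 30, 20]


def load_per_server(servers, clients_per_resolver):
    """Count the clients each server receives while the caches are warm."""
    rotation = cycle(servers)        # the authoritative server rotates its answers
    load = Counter()
    for clients in clients_per_resolver:
        cached_ip = next(rotation)   # one lookup, then cached for the TTL
        load[cached_ip] += clients   # every client behind the resolver follows it
    return dict(load)


print(load_per_server(SERVERS, CLIENTS_PER_RESOLVER))
# → {'192.0.2.1': 1030, '192.0.2.2': 70}
```

Although the answers alternate perfectly, the server that happened to be handed to the large resolver receives 1030 of the 1100 clients in this model. A shorter TTL shortens how long such a skew lasts but does not prevent it.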
