
I'm building a REST API in Python, using Python 2.6 and Flask on 64-bit Linux.

I've been asked to determine how the GIL could affect the performance of this service. What if something caused the interpreter to lock up for a few seconds? Intuitively, it would hinder performance, but I need to be able to demonstrate the impact.

The specific concern is, if somebody introduces code (e.g. a C extension) that causes a large amount of additional locking, could it make the whole API completely useless?

What I'd like is something like time.sleep(), except that it holds the interpreter lock for a period of time instead of releasing it. I could build a model API in which queries trigger locks of various lengths, and then demonstrate the extent to which concurrency is reduced as a function of the amount of time spent in the lock.
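To make the idea concrete, here is a minimal sketch of such a "sleep that keeps the GIL". It assumes CPython on Linux with glibc, and relies on `ctypes.PyDLL`, which (unlike `ctypes.CDLL`) does not release the GIL around foreign calls. The names `hold_gil` and `max_stall` are hypothetical helpers made up for this illustration, not part of any library.

```python
import ctypes
import threading
import time

# Assumption: Linux/glibc, where PyDLL(None) exposes the libc symbols of the
# running process. PyDLL keeps the GIL held during the foreign call.
_libc = ctypes.PyDLL(None)
_libc.usleep.argtypes = [ctypes.c_uint]
_libc.usleep.restype = ctypes.c_int


def hold_gil(seconds):
    """Block inside C for roughly `seconds` without releasing the GIL."""
    _libc.usleep(int(seconds * 1e6))


def max_stall(blocker, seconds):
    """Run blocker(seconds) in this thread and return the longest gap
    (in seconds) that a background thread saw between two of its ticks."""
    ticks = []
    stop = threading.Event()

    def ticker():
        # Stands in for other request-handling threads in the API.
        while not stop.is_set():
            ticks.append(time.time())
            time.sleep(0.001)

    worker = threading.Thread(target=ticker)
    worker.start()
    time.sleep(0.05)      # let the ticker get going
    blocker(seconds)      # the call under test
    time.sleep(0.05)      # let the ticker recover
    stop.set()
    worker.join()
    return max(b - a for a, b in zip(ticks, ticks[1:]))


if __name__ == "__main__":
    for duration in (0.1, 0.2, 0.4):
        print("hold %.1fs: GIL-holding stall %.3fs, time.sleep stall %.3fs" % (
            duration,
            max_stall(hold_gil, duration),
            max_stall(time.sleep, duration),
        ))
```

With `time.sleep` the background thread should keep ticking, while with `hold_gil` its longest stall should be about as long as the hold. Calling `hold_gil` from a Flask view with a duration taken from the request would give the kind of model API described above. Exact numbers will vary with the machine and the Python version.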

Alejandro
Salim Fadhley
  • The concern is valid, but if you're allowing people to introduce C code you're pretty much at their mercy anyway. A problematic C extension could also contain a bug that crashes (or worse, silently corrupts) the process, which would also render your API useless. The only real thing that can be done, I think, is to allow C code to run only if you trust it to be problem-free (and that is true with or without a GIL). – Jeremy Friesner Jun 17 '15 at 21:56
  • I work for an organization that has a huge amount of legacy C code. It's of reasonably high quality (no more buggy than my Python code), but suppose the feature we needed was in C. Could I use it? If so, what's the cost to concurrency in my API? It's not acceptable to reject it because it *might* contain bugs. I need to give the managers real data about what the likely impact is, even if it doesn't contain bugs. – Salim Fadhley Jun 17 '15 at 21:59
  • Good question. You could write your own C extension that sleeps, or otherwise performs a long loop (i.e. busy waiting). – Pynchia Jun 17 '15 at 22:03
  • Fair enough, but the likely impact is what you would expect: when a lock is held for a long period, any other thread that wants to acquire the lock will be held off until the lock is released. The magnitude of the performance impact will depend on how long the C extension holds the lock. – Jeremy Friesner Jun 17 '15 at 22:05
  • BTW: I don't know Flask. Does it spawn threads or processes? The GIL affects the former, not the latter ([see here](http://stackoverflow.com/questions/992136/does-running-separate-python-processes-avoid-the-gil)) – Pynchia Jun 17 '15 at 22:06
  • I'm not a C developer. Is there something off the shelf that might have the right kind of properties? For example, what if I just churned a big matrix in numpy? Exact timing is not important; relative timing is, for example if I made the external code twice as complicated. – Salim Fadhley Jun 17 '15 at 22:06
  • Flask is threaded by default. The application requires a large amount of shared memory (specifically a big table, possibly implemented in something like numpy), hence my concern about C extensions. – Salim Fadhley Jun 17 '15 at 22:06
  • Couldn't you mitigate the potential risks by load balancing across multiple processes (*Use an Async I/O web framework*) and even across multiple nodes/hosts? – James Mills Jun 17 '15 at 22:13
  • @James Mills - we'd use a load balancer anyway; however, it doesn't really answer the question of how GIL locking would affect the server. It might be comforting to know (later) that some good caching and load balancing can reduce the scale of the problem. – Salim Fadhley Jun 17 '15 at 22:19
  • Well, as @Jeremy Friesner said, you have *bigger* problems if you allow C extensions into your system than the GIL locking the interpreter. You *should* design for scalability anyway and not worry about what happens to *one* process/thread, IMHO. – James Mills Jun 17 '15 at 22:22
