0

I'm using Python's multiprocessing package to set up worker threads to do work and update the result to a MySQL database. What's the right way to set this up so that a database connection is not re-established each time a worker thread is initialized?

ensnare
  • 40,069
  • 64
  • 158
  • 224

1 Answers1

3

Each thread must use its own separate connection.

MySQL protocol is not stateless (like for example http). If you try to use a single MySQL connection among multiple threads, the server gets confused about which request it's responding to, and the client threads get confused because the wrong thread might read a response.

The same is true for any other stateful protocol, like for example ftp.

A better way to reduce overhead is to use a connection pool. Each thread requests a connection in the thread's initialization, and the pool manager assigns the thread exclusive use of one of the connections from the pool, until the thread is done with it. Then it returns the connection to the pool, where it will be allocated to another thread requesting a connection.

Even better is to have threads request a connection not in the thread initialization, but later on, when it actually needs to do some database work. Then release the connection when that db work is done. If it's very low-overhead to request a connection from the pool, there's no reason to hold on to the connection for the life of the thread.

It's better to share!

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828