
Let's say that I have a way to send an HTTP request to a server. How is it possible to send two (or more) of these requests to the server at the same time? For example, maybe by forking a process? How can I do it? (Also, I'm using Django.)

# This example is not tested...
import requests
import simplejson

def tester(request):
    server_url = 'http://localhost:9000/receive'

    payload = {
        'd_test1': '1234',
        'd_test2': 'demo',
        }
    json_payload = simplejson.dumps(payload)
    content_length = len(json_payload)

    headers = {'Content-Type': 'application/json', 'Content-Length': str(content_length)}
    response = requests.post(server_url, data=json_payload, headers=headers, allow_redirects=True)

    if response.status_code == requests.codes.ok:
        print 'Headers: {}\nResponse: {}'.format(response.headers, response.text)

Thanks!

CodeArtist
2 Answers


I think you want to use threads here rather than forking off new processes. While threads are bad in some cases, that isn't true here. Also, I think you want to use concurrent.futures instead of using threads (or processes) directly.

For example, let's say you have 10 URLs, and you're currently requesting them one at a time, like this:

results = map(tester, urls)

But now, you want to send them 2 at a time. Just change it to this:

import concurrent.futures

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
    results = pool.map(tester, urls)

If you want to try 4 at a time instead of 2, just change the max_workers. In fact, you should probably experiment with different values to see what works best for your program.

If you want to do something a little fancier, see the documentation—the main ThreadPoolExecutor Example is almost exactly what you're looking for.
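
To give a concrete idea, here is a rough, untested sketch in that style. It assumes a variant of the question's tester that takes the target URL as a parameter; the names tester, urls, and the payload keys come from the question, and everything else is illustrative:

import concurrent.futures
import json
import requests

def tester(url):
    # assumed variant of the question's function, taking the URL as a parameter
    payload = {'d_test1': '1234', 'd_test2': 'demo'}
    return requests.post(url, data=json.dumps(payload),
                         headers={'Content-Type': 'application/json'})

urls = ['http://localhost:9000/receive', 'http://localhost:9000/receive']

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns a Future per request; as_completed() yields each one
    # as soon as it finishes, regardless of submission order
    futures = {pool.submit(tester, url): url for url in urls}
    for future in concurrent.futures.as_completed(futures):
        url = futures[future]
        try:
            response = future.result()
            print('{}: {}'.format(url, response.status_code))
        except Exception as exc:
            print('{} raised {!r}'.format(url, exc))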

Unfortunately, in Python 2.7 this module isn't part of the standard library, so you will have to install the backport from PyPI.

If you have pip installed, this should be as simple as:

pip install futures

… or maybe sudo pip install futures, on Unix.

And if you don't have pip, go get it first (follow the link above).


The main reason you sometimes want to use processes instead of threads is that you've got heavy CPU-bound computation, and you want to take advantage of multiple CPU cores. In Python, threads can't effectively use multiple cores for CPU-bound work (because of the Global Interpreter Lock). So, if the Task Manager/Activity Monitor/whatever shows that your program is using 100% of one core while the others all sit at 0%, processes are the answer. With futures, all you have to do is change ThreadPoolExecutor to ProcessPoolExecutor.
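
As a minimal sketch (assuming the same tester function and urls list as in the sketch above), the switch really is just the class name:

import concurrent.futures

# Each call to tester now runs in a separate worker process, so CPU-bound
# work can use multiple cores. The function and its arguments must be
# picklable, and tester must be defined at module level.
with concurrent.futures.ProcessPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(tester, urls))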


Meanwhile, sometimes you need more than just "give me a magic pool of workers to run my tasks". Sometimes you want to run a handful of very long jobs instead of a bunch of little ones, or load-balance the jobs yourself, or pass data between jobs, or whatever. For that, you want to use multiprocessing or threading instead of futures.
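
For example, here is a rough, untested sketch that manages the threads by hand with threading and passes results back through a Queue (again assuming the tester function and urls list from the earlier sketch):

import threading
import Queue  # the module is named 'queue' on Python 3

results = Queue.Queue()

def worker(url):
    # run one request and hand the response back to the main thread
    results.put(tester(url))

threads = [threading.Thread(target=worker, args=(url,)) for url in urls]
for t in threads:
    t.start()
for t in threads:
    t.join()

responses = [results.get() for _ in threads]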

Very rarely, even that is too high-level, and you need to tell Python directly to create a new child process or thread. For that, you go all the way down to os.fork (on Unix only) or thread.
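
For completeness, here is a minimal, Unix-only, untested sketch of the fork approach the question asks about (again assuming the tester function and urls list from the earlier sketch); each child makes one request and exits, and the parent waits for all of them:

import os

children = []
for url in urls:
    pid = os.fork()
    if pid == 0:            # child process
        tester(url)         # make one request in the child
        os._exit(0)         # exit the child immediately, skipping cleanup
    children.append(pid)    # parent remembers each child's pid

for pid in children:
    os.waitpid(pid, 0)      # wait for every child to finish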

abarnert
  • Thanks, your answer is very interesting, but since I'm not so familiar with multiprocessing/threads, I need to ask whether you are mentioning 5 different things in your post. Right? Unfortunately I'm using 2.7, so I'll have to use futures. – CodeArtist Apr 02 '13 at 00:12
  • @JorgeCode: I'm not sure exactly what you meant by "mentioning at your post 5 different things". But I'll try to edit the answer to make it clearer. – abarnert Apr 02 '13 at 00:26
  • I mean `os.fork`, `multiprocessing`, `threads`, `concurrent.futures` and `futures`. – CodeArtist Apr 02 '13 at 00:28
  • @JorgeCode: I see. You specifically asked about fork, so I assumed you knew what that meant. I've rewritten the answer to give you my suggested answer first, and then talk about alternatives later. That should be a lot easier to understand than the original version. Sorry for any confusion. – abarnert Apr 02 '13 at 01:01
  • Thanks for your answer. – Lead Developer Sep 15 '20 at 23:04

I would use gevent, which can launch these all in so-called green-threads:

# This will make requests compatible
from gevent import monkey; monkey.patch_all()
import requests

# Make a pool of greenlets to make your requests
from gevent.pool import Pool
p = Pool(10)

urls = [..., ..., ...]
p.map(requests.get, urls)

Of course, this example issues GETs, but the pool generalizes to map inputs over any function, including, say, your own function that makes the POST request (see the sketch below). These greenlets will run nearly as simultaneously as if you had forked, but they are much faster and much lighter-weight.
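
For instance, here is a rough sketch that maps the pool over your own function instead of requests.get; the post_payload wrapper is hypothetical and just mirrors the question's POST logic:

from gevent import monkey; monkey.patch_all()
from gevent.pool import Pool
import json
import requests

def post_payload(url):
    # hypothetical wrapper around the question's POST logic
    payload = {'d_test1': '1234', 'd_test2': 'demo'}
    return requests.post(url, data=json.dumps(payload),
                         headers={'Content-Type': 'application/json'})

p = Pool(10)
urls = ['http://localhost:9000/receive'] * 2
responses = p.map(post_payload, urls)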

Dan Lecocq
  • If you really want to use `gevent` plus `requests`, you're probably better off using [`grequests`](https://github.com/kennethreitz/grequests) than doing it yourself. But really, when you're only doing a handful of things concurrently (the OP asked about 2…), there's no advantage to using greenlets. And the disadvantage is that if you ever add any CPU-bound code, your whole system slams to a halt, and you have to start over and rewrite things in a different way. – abarnert Apr 01 '13 at 22:44
  • Fair point about `grequests`. I have to imagine that any CPU-bound code in the above example is on the server end, in which case I'm not sure how I see how using `gevent` is detrimental. – Dan Lecocq Apr 01 '13 at 22:46
  • It's not really detrimental, it just adds something extra to learn, and makes it harder to adapt with in the (unlikely, but not impossible) event that you need to add CPU-bound client code, and requires something outside the stdlib—all of which are pretty minor negatives, but why incur even a few minor negatives if there's no advantage? – abarnert Apr 01 '13 at 22:49
  • I relate most to your point about including things outside of the standard libraries. While in this case there's no reason for me to be, I am often prejudiced against python threads :-) – Dan Lecocq Apr 01 '13 at 22:51
  • Well, Python threads are almost as bad as greenlets for doing CPU-parallelism, and almost as bad as processes for doing thousands of concurrent I/O-bound tasks, so often they're the wrong choice. But sometimes they're the right choice, and avoiding them for no reason is silly. (I know, you called it "prejudiced", so you already understand this; I'm just explaining this for the benefit of other readers.) – abarnert Apr 01 '13 at 23:03