Tornado POST handler blocks other API calls until the Python program finishes

Question

This handler takes multiple URLs as input and is called at localhost:8888/api/v1/crawler.

It takes more than an hour to run, which is fine, but it blocks the other APIs: while it is running, no other API responds until it finishes. I want to run it asynchronously. How can I achieve that with the same program?

    @tornado.web.asynchronous
    @gen.coroutine
    @use_args(OrgTypeSchema)
    def post(self, args):
        print "Enter In Crawler Match Script POST"
        print "Argsssss........"
        print args
        data = tornado.escape.json_decode(self.request.body)
        print "Data................"
        import json
        print json.dumps(data.get('urls'))
        from urllib import urlopen
        from bs4 import BeautifulSoup
        try:
                urls = json.dumps(data.get('urls'));
                urls  = urls.split()

                import sys

                list = [];

                # orig_stdout = sys.stdout
                # f = open('out.txt', 'w')
                # sys.stdout = f
                for url in urls:
                    # print "FOFOFOFOFFOFO"
                    # print url
                    url = url.replace('"'," ")
                    url = url.replace('[', " ")

                    url = url.replace(']', " ")
                    url = url.replace(',', " ")
                    print "Final Url "
                    print url
                    try:
                        site = urlopen(url) ..............

score 0 · Answer 1 · answered Nov 08 '17 at 09:39

Your post method is 100% synchronous. The blocking call is site = urlopen(url), so that is the part you should make async. Tornado has an async HTTP client for that (tornado.httpclient.AsyncHTTPClient). There is also a good example here.
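
To see why a synchronous call inside the handler stalls every other request, here is a minimal sketch. It assumes Python 3 and uses only the standard-library asyncio loop (which Tornado 5 and later run on) rather than Tornado itself. time.sleep stands in for urlopen(url), asyncio.sleep stands in for an awaited non-blocking fetch, and the names crawl_blocking, crawl_async and other_api are hypothetical.

```python
import asyncio
import time


async def crawl_blocking(log):
    """Hypothetical handler body that makes a synchronous call."""
    log.append("crawl:start")
    time.sleep(0.2)  # stand-in for urlopen(url): the event loop is frozen here
    log.append("crawl:end")


async def crawl_async(log):
    """Same handler body, but the wait is awaited instead of blocking."""
    log.append("crawl:start")
    await asyncio.sleep(0.2)  # stand-in for an awaited non-blocking fetch
    log.append("crawl:end")


async def other_api(log):
    """Hypothetical second request arriving while the crawl is in progress."""
    log.append("other:done")


async def run(crawl):
    log = []
    # Both coroutines are scheduled on the same single-threaded event loop.
    await asyncio.gather(crawl(log), other_api(log))
    return log


def demo(crawl):
    return asyncio.run(run(crawl))


if __name__ == "__main__":
    print(demo(crawl_blocking))  # ['crawl:start', 'crawl:end', 'other:done']
    print(demo(crawl_async))     # ['crawl:start', 'other:done', 'crawl:end']
```

With the blocking version the second coroutine cannot run until the crawl has finished. With the awaited version it completes while the crawl is still waiting.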

answered by Fine

Sir, it is still blocking other API calls while the program is running. – rohit Nov 08 '17 at 10:51
If it's still blocking could you please add your new code with the use of async http client? – Fine Nov 08 '17 at 12:02

score 0 · Answer 2 · answered Nov 08 '17 at 13:26

You are using urllib, which is the reason for the blocking.

Tornado provides a non-blocking client called AsyncHTTPClient, which is what you should be using.

Use it like this:

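from tornado import gen
from tornado.httpclient import AsyncHTTPClient

@gen.coroutine
@use_args(OrgTypeSchema)
def post(self, args):
    ...
    http_client = AsyncHTTPClient()
    # yield hands control back to the IOLoop until the response arrives
    site = yield http_client.fetch(url)
    # site is an HTTPResponse; the page content is in site.body
    ...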

Another thing I'd like to point out: don't import modules inside a function. It's not the reason for the blocking, but it is still slower than putting all your imports at the top of the file. Read this question.
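
One rough way to check the import claim is to time both styles with timeit. This sketch assumes Python 3. The function names are hypothetical, json is just a convenient module to import, and the actual numbers depend on the machine, so none are shown.

```python
import json  # module-level import: the name is bound once, when the file loads
import timeit


def dumps_top_level(obj):
    """Uses the json name bound at the top of the file."""
    return json.dumps(obj)


def dumps_local_import(obj):
    """Re-runs the import statement on every call."""
    import json  # module is already cached, but the lookup and binding repeat
    return json.dumps(obj)


if __name__ == "__main__":
    payload = {"urls": ["http://example.com"]}
    for fn in (dumps_top_level, dumps_local_import):
        seconds = timeit.timeit(lambda: fn(payload), number=100000)
        print(fn.__name__, round(seconds, 3))
```

After the first call the module is already cached in sys.modules, so the repeated import only pays for a cache lookup and a name binding. The cost is small per call, and it has no bearing on the blocking problem.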
