Python: using queues with threads

Question

I was told that threads can easily be combined with queues, but I have run into problems. This code should take the URLs of several websites, one after the other, and print the first 512 bytes of each page.

from queue import Queue
from threading import Thread
import urllib.request

hosts = ["http://yahoo.com", "http://google.com", "http://amazon.com", "http://ibm.com", "http://apple.com"]

queue = Queue()

class ThreadUrl(Thread):
    def __init__(self, queue):
        Thread.__init__(self)
        self.queue = queue

    def run(self):
        while True:
            host = self.queue.get()
            url=urllib.request.urlopen(host)
            print(url.read(512))
            self.queue.task_done()

def main():
    for i in range(5):
        t = ThreadUrl(queue)
        t.setDaemon(True)
        t.start()

    for host in hosts:
        queue.put(host)

    queue.join()

main()

I got this output, with a problem in the last thread:

b'<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="sr"><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop="image"><title>Google</title><script>(function(){window.google={kEI:\'hD3FWZiRJ8G2a8GfqdAF\',kEXPI:\'18168,1352613,1352960,1353383,1353747,1354276,1354401,1354625,1354749,1354875,1355174,1355205,1355217,3700315,3700476,4017608,4029815,4031109,4043492,4045841,4048347,4061945,'
b'\n<!DOCTYPE html>\n<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US" prefix="og: http://ogp.me/ns#" class="no-js">\n\n<head>\n\t\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n<meta charset="utf-8" />\n<link rel="canonical" href="https://www.apple.com/" />\n\n\n\t\n\t<link rel="alternate" href="https://www.apple.com/" hreflang="en-US" /><link rel="alternate" href="https://www.apple.com/ae-ar/" hreflang="ar-AE" /><link rel="alternate" href="https://www.apple.com/ae/" hreflang="en-AE" /><link rel="alternate" href="https://'
b'<!DOCTYPE html>\n<html id="atomic" lang="en-US" class="atomic my3columns  l-out Pos-r https fp fp-v2 rc1 fp-default mini-uh-on viewer-right two-col ntk-wide ltr desktop Desktop bkt201">\n<head>\n    \n    <title>Yahoo</title><meta http-equiv="x-dns-prefetch-control" content="on"><link rel="dns-prefetch" href="//s.yimg.com"><link rel="preconnect" href="//s.yimg.com"><link rel="dns-prefetch" href="//search.yahoo.com"><link rel="preconnect" href="//search.yahoo.com"><link rel="dns-prefetch" href="//y.analytics.yah'
b'<!DOCTYPE html>\n<html lang="en-US">\n<head>\n\t<meta charset="UTF-8">\n\t<meta name="viewport" content="width=device-width, initial-scale=1">\n\t<title>IBM - United States</title>\n\t<link rel="canonical" href="https://www.ibm.com/us-en/"/>\n\t<meta name="robots" content="index,follow">\n\t<meta name="description" content="For more than a century IBM has been dedicated to every client&#x27;s success and to creating innovations that matter for the world">\n\t<meta name="keywords" content="IBM">\n\t<meta name="dcterms.date" c'
Exception in thread Thread-5:
Traceback (most recent call last):
  File "/home/milenko/anaconda3/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "f1.py", line 17, in run
    url=urllib.request.urlopen(host)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 564, in error
    result = self._call_chain(*args)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 756, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 570, in error
    return self._call_chain(*args)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 503: Service Unavailable

Why?

Because the server returned status code `503`, which was raised as an exception that you don't handle in your thread. This problem has nothing to do with threads. — myaut, Sep 22 '17 at 16:51

score 2 · Accepted Answer · answered Sep 22 '17 at 16:52

Like the error says, you're getting an HTTP error; it has nothing to do with threads. The URL you're calling is returning a 503 Service Unavailable error response.

503 SERVICE UNAVAILABLE

The server is currently unable to handle the request due to a temporary overload or scheduled maintenance, which will likely be alleviated after some delay.

The server MAY send a Retry-After header field to suggest an appropriate amount of time for the client to wait before retrying the request.

Note: The existence of the 503 status code does not imply that a server has to use it when becoming overloaded. Some servers might simply refuse the connection.

Most likely, you're hitting the URL too quickly and have exceeded the server's throttle limit. You can confirm this by checking the response to see if it has a Retry-After header. The message body may also explain what the throttle limit is.
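One way to do that check, as a minimal sketch: catch the `urllib.error.HTTPError` and look at its status code, its `Retry-After` header, and the start of its body. The helper names `inspect_http_error` and `fetch_or_report` are made up for this illustration, and the 512-byte limit and 10-second timeout are assumptions (the former simply mirrors the question's code).

```python
import urllib.error
import urllib.request


def inspect_http_error(err, body_limit=512):
    """Collect the details of an HTTPError that help diagnose throttling."""
    return {
        "code": err.code,                               # e.g. 503
        "reason": err.reason,                           # e.g. "Service Unavailable"
        "retry_after": err.headers.get("Retry-After"),  # None if the header is absent
        "body": err.read(body_limit),                   # the error page may explain the limit
    }


def fetch_or_report(host):
    """Return the first 512 bytes of the page, or a dict describing the HTTP error."""
    try:
        with urllib.request.urlopen(host, timeout=10) as response:
            return response.read(512)
    except urllib.error.HTTPError as err:
        return inspect_http_error(err)
```

Calling `fetch_or_report` on the failing host and printing the result shows whether the server sent a `Retry-After` value. That header may hold either a number of seconds or an HTTP date.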

The solution is to slow down your requests to the service. Read their documentation and find out what their throttle limits are, then update your code to stay within those limits.
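A rough sketch of what slowing down could look like with the question's queue and threads. The one-second pause, the two worker threads, the 10-second timeout, and the names `fetch`, `worker` and `run_all` are assumptions for illustration; the real limits depend on each service. The `try`/`finally` also makes sure `task_done()` is called when a request fails, so `queue.join()` can still return.

```python
import time
import urllib.error
import urllib.request
from queue import Queue
from threading import Thread

DELAY_SECONDS = 1.0  # assumed pause between requests; not a documented limit


def fetch(host):
    """Return the first 512 bytes of the page at host."""
    with urllib.request.urlopen(host, timeout=10) as response:
        return response.read(512)


def worker(queue, results, fetch_fn, delay):
    """Take hosts off the queue forever, pausing after each request."""
    while True:
        host = queue.get()
        try:
            results[host] = fetch_fn(host)
        except urllib.error.URLError as err:  # HTTPError is a subclass of URLError
            results[host] = err               # record the failure instead of crashing
        finally:
            queue.task_done()                 # always mark the item as processed
        time.sleep(delay)


def run_all(hosts, num_threads=2, fetch_fn=fetch, delay=DELAY_SECONDS):
    """Fetch every host with a small pool of daemon threads and return the results."""
    queue = Queue()
    results = {}
    for _ in range(num_threads):
        Thread(target=worker, args=(queue, results, fetch_fn, delay), daemon=True).start()
    for host in hosts:
        queue.put(host)
    queue.join()  # returns once task_done() has been called for every host
    return results
```

`run_all(hosts)` returns a dictionary that maps each host either to its first 512 bytes or to the exception that was raised, so a 503 from one site no longer kills the thread that handled it.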

If I get rid of the last one, then I get an exception in thread Thread-1. — MishaVacic, Sep 22 '17 at 16:56
You are still probably making requests too fast to each of the hosts and they're throttling you. Make sure your requests are going slowly enough that you're not hammering their services. — Soviut, Sep 22 '17 at 22:16
