Just to add to bashrc's answer: you can also do this with requests; you don't have to use urllib.request. It would look something like this:
import requests
from concurrent import futures

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

with futures.ThreadPoolExecutor(max_workers=5) as executor:  # raising max_workers increases the number of threads created
    res = executor.map(requests.get, URLS)
    responses = list(res)  # map returns a generator, so you may want to turn it into a list
What I like to do, however, is create a function that returns the JSON from the response directly (or the text if you want to scrape), and use that function in the thread pool:
import requests
from concurrent import futures

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

def getData(url):
    res = requests.get(url)
    try:
        return res.json()
    except ValueError:  # the body is not JSON, fall back to the raw text
        return res.text

with futures.ThreadPoolExecutor(max_workers=5) as executor:
    res = executor.map(getData, URLS)
    responses = list(res)  # your list is already pre-formatted
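One thing to keep in mind: executor.map re-raises the first exception when you iterate the results, so a single unreachable URL (like the made-up domain in the list) will abort building the whole list. If you want each URL to succeed or fail independently, a sketch along these lines should work (future_to_url is just an illustrative name for the bookkeeping dict):

import requests
from concurrent import futures

URLS = ['http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

def getData(url):
    res = requests.get(url)
    try:
        return res.json()
    except ValueError:
        return res.text

with futures.ThreadPoolExecutor(max_workers=5) as executor:
    # submit keeps one future per URL, so a failed URL doesn't take the others down
    future_to_url = {executor.submit(getData, url): url for url in URLS}
    for future in futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except requests.RequestException as exc:
            print(url, 'failed:', exc)
        else:
            print(url, 'returned', len(data), 'items/characters')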