I wrote a program that scrapes information from a given list of websites (100 links). Currently, my program visits them sequentially, i.e. one at a time. The skeleton of my program is as follows:
for j in range(num_of_links):  # num_of_links is the number of sites to visit (100)
    try:  # if an error occurs, skip to the next website in the list
        site_exist(j)         # check whether site j exists
        get_url_with_info(j)  # get the links inside the website
    except Exception as e:
        print(e)

filter_result_info(links_with_info)  # filter the collected results
Needless to say, this process is very slow. Is it possible to use threading so that my program handles the job faster, e.g. 4 concurrent workers scraping 25 links each? Can you point me to a reference on how to do this?
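For reference, here is a minimal sketch of what I have in mind, using ThreadPoolExecutor from the standard library's concurrent.futures module. It assumes site_exist and get_url_with_info keep the signatures from my skeleton (taking an index) and that get_url_with_info appends to the shared links_with_info collection as before (appending to a list is thread-safe in CPython). With max_workers=4, the pool keeps 4 threads busy until all 100 links are processed, which works out to roughly 25 links per thread without splitting the list manually:

from concurrent.futures import ThreadPoolExecutor

def scrape(j):
    try:  # an error on one link must not stop the others
        site_exist(j)         # check whether site j exists
        get_url_with_info(j)  # get the links inside the website
    except Exception as e:
        print(e)

# 4 worker threads share the 100 links (about 25 each on average)
with ThreadPoolExecutor(max_workers=4) as executor:
    executor.map(scrape, range(num_of_links))
# the with-block waits until every worker has finished

filter_result_info(links_with_info)  # filter once all workers are done

Since scraping is I/O-bound (waiting on network responses), threads should help here despite Python's GIL. The official documentation for concurrent.futures (https://docs.python.org/3/library/concurrent.futures.html) covers ThreadPoolExecutor in detail. Is this the right approach, or is there a better pattern for this kind of job?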