I have created a web bot that iterates over pages of a website, e.g. example.com/?id=int, where int is some integer. The function fetches each page as raw HTML using the requests library, then hands it to parseAndWrite, which extracts a div and saves its value in a SQLite database:
import requests

def archive(initial_index, final_index):
    while True:
        try:
            for i in range(initial_index, final_index):
                res = requests.get('https://www.example.com/?id=' + str(i))
                parseAndWrite(res.text)
                print(i, ' archived')
        except requests.exceptions.ConnectionError:
            print("[-] Connection lost. ")
            continue
        except:
            exit(1)
        break

archive(1, 10000)
My problem is that, after some time, the loop doesn't continue to 10000 but starts over from an earlier value, resulting in many duplicate records in the database. What is causing this inconsistency?
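
One likely explanation, going only by the code above: when a ConnectionError is raised, `continue` restarts the `while True` loop, which re-enters the `for` loop at initial_index, so ids that were already archived get fetched and written again. Below is a minimal sketch of one way to retry only the failing id. The `fetch` and `handle` parameters, `retry_on`, `max_retries`, and the `flaky_fetch` stand-in are hypothetical names introduced for illustration so that the example runs without a network. With requests, `fetch` would presumably wrap `requests.get(...).text`, `handle` would be parseAndWrite, and `retry_on` would be `requests.exceptions.ConnectionError`.

```python
def archive(initial_index, final_index, fetch, handle,
            retry_on=(ConnectionError,), max_retries=5):
    """Process each index once, retrying only the index that failed."""
    for i in range(initial_index, final_index):
        for attempt in range(1, max_retries + 1):
            try:
                body = fetch(i)
            except retry_on:
                print("[-] Connection lost on id", i, "attempt", attempt)
                continue  # retry the same i, not the whole range
            handle(body)
            print(i, "archived")
            break  # this id is done, move on to the next one
        else:
            # Inner loop ran out of attempts without a successful fetch.
            raise RuntimeError(f"giving up on id {i} after {max_retries} attempts")


# Hypothetical stand-ins (assumptions, not part of the original bot).
calls = {}

def flaky_fetch(i):
    """Simulate a single dropped connection on the first request for id 3."""
    calls[i] = calls.get(i, 0) + 1
    if i == 3 and calls[i] == 1:
        raise ConnectionError("simulated drop")
    return f"<div>{i}</div>"

saved = []
archive(1, 6, flaky_fetch, saved.append)
```

Because the retry loop sits inside the `for` loop, a dropped connection repeats only the current id, and each id is passed to `handle` once. In a real run it may also be worth sleeping briefly between attempts.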