I have a data pipeline that pulls data with a Python script and pushes it to a database server. The script is scheduled to run every 4 hours. After about five runs (roughly 20 hours) it breaks with the exception requests.exceptions.ConnectionError: HTTPSConnectionPool(host='****', port=443): Max retries exceeded with url
How do I resolve it?
The relevant code snippet follows:
import io
import time

import pandas as pd
import requests
import schedule

# Gets the data as CSV from the website and converts it to a DataFrame.
# push_data() is defined elsewhere in the script.
def getdata():
    url = ("*****")
    urlData = requests.get(url, verify=False).content
    df = pd.read_csv(io.StringIO(urlData.decode('utf-8')))
    push_data(df)

# A scheduler which runs getdata every 4 hours
schedule.every(4).hours.do(getdata)

while True:
    # Checks whether a scheduled task is pending to run or not
    schedule.run_pending()
    time.sleep(10)
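
For context, this is one direction I am considering, as a rough sketch rather than a confirmed fix. "Max retries exceeded" generally means the connection could not be established at all, and because the exception is not caught inside the scheduled job, a single failed request ends the whole loop. The sketch assumes the failure is transient. The helper names `build_session`, `fetch_csv_text`, and `run_safely` are hypothetical, the retry count, backoff, and timeout values are guesses, and `verify=False` from my original call is left out here.

```python
import logging

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

logger = logging.getLogger(__name__)


def build_session(total_retries=3, backoff_factor=1.0):
    """Return a Session that retries transient failures with backoff."""
    retry = Retry(
        total=total_retries,
        backoff_factor=backoff_factor,
        status_forcelist=[502, 503, 504],  # also retry on these HTTP statuses
    )
    session = requests.Session()
    adapter = HTTPAdapter(max_retries=retry)
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


def fetch_csv_text(session, url, timeout=30):
    """Download the CSV body as text; raises on HTTP errors or timeouts."""
    response = session.get(url, timeout=timeout)
    response.raise_for_status()
    return response.content.decode("utf-8")


def run_safely(job):
    """Run job(); log and swallow request errors so the scheduler loop survives."""
    try:
        job()
        return True
    except requests.exceptions.RequestException:
        logger.exception("Scheduled fetch failed; will try again on the next run")
        return False
```

With this, `getdata` would call `fetch_csv_text(session, url)` instead of `requests.get(...)`, and the job would be registered as `schedule.every(4).hours.do(run_safely, getdata)`, so a failed fetch is logged and the next run still happens 4 hours later.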