I'm downloading about 800 files of trading data using the requests library in Python. The file names of interest follow the pattern "icecleared_power_YYYY_mm_dd.dat", but for some dates the file is empty or doesn't exist, and my script writes a useless file to disk anyway. My question is: how can I skip files that are, or would be, below a certain size?
My current code downloads all files and at the end deletes those that surely have no content:
import os
from glob import glob

import requests

path = 'Data/Futures/ICE/'
for d in dates:
    file_name: str = 'icecleared_power_' + str(d.date()).replace('-', '_') + '.dat'
    url: str = 'https://downloads.theice.com/Settlement_Reports_CSV/Power/' + file_name
    resp = requests.get(url, auth=('username', 'password'))
    with open(path + file_name, 'wb') as temp:
        temp.write(resp.content)

# afterwards, delete files too small to hold real data (under 100 KiB)
for x in glob(path + '*.dat'):
    if os.path.getsize(x) < 100 * 1024:
        os.remove(x)