
New to Python, mostly used R before. I am trying to download multiple files (climate data) from a website's API; see the link https://opendata-download-grid-archive.smhi.se/data/6/201604/MESAN_201604050000+000H00M

I want to download a file for each hour of every day of every month of every year (from 2008 to 2019), that is, changing the "201604050000" part to "201604050100", "201604050200", "201604050300", etc. Everything else in the URL stays the same; I only need to change the hour/day/month/year to download a different file.

This is my Python code for generating a name for every file, but I am lost as to how to actually download them.

for a in [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019]:
    for b in [1,2,3,4,5,6,7,8,9,10,11,12]:
        for c in [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,
                    28,29,30,31]:
            for d in ["00", "01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11",
                      "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23"]:
                s = "grib_%d_%d_%d_%s" %(a,b,c,d) 
                print(s)

How can I download all these files? I understand that they will take up quite a lot of memory, so I am happy to start by downloading just 24 files, which is one day. Has anyone else run into this problem?

In the end, once I have processed my files, I also want to delete them (so I don't use up too much memory).

paula456
  • To start, see https://stackoverflow.com/questions/22676/how-do-i-download-a-file-over-http-using-python – AcK Oct 14 '20 at 12:52
  • I would start by looking at the [requests](https://requests.readthedocs.io/en/master/) library. – Jose A. García Oct 14 '20 at 12:55
  • Thanks for your help. It was still too difficult for me to follow the https://stackoverflow.com/questions/22676/how-do-i-download-a-file-over-http-using-python page. I managed to use the requests package to download a single file, but not multiple files. – paula456 Oct 16 '20 at 08:13

1 Answer


Not a full answer, but some suggestions:

Use nicer variable names

Use range() instead of rather long constant lists (where applicable)

for year in range(2010, 2019 + 1):
    for month in range(1, 12 + 1):
        for day in range(1, 31 + 1):  # beware, not every month has 31 days
            for hour in ["00", "01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11",
                      "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23"]:
                s = "grib_%d_%d_%d_%s" %(year, month, day, hour) 
                print(s)
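
To go one step further, here is a minimal sketch of how the loop could be turned into actual downloads. It uses datetime to step through the hours, so impossible dates such as 31 February never come up, and the standard-library urllib.request to fetch each file. The URL pattern is copied from the link in the question; that the "+000H00M" suffix stays the same for every file is an assumption based on the question. The function names, the grib_files folder and the local file names are made up for illustration.

```python
import os
import urllib.request
from datetime import datetime, timedelta

# Base part of the URL, taken from the link in the question.
BASE_URL = "https://opendata-download-grid-archive.smhi.se/data/6"


def hourly_timestamps(start, end):
    """Yield one datetime per hour from start (inclusive) to end (exclusive)."""
    current = start
    while current < end:
        yield current
        current += timedelta(hours=1)


def build_url(ts):
    """Build the URL for one timestamp.

    Assumes the "+000H00M" suffix is identical for every file.
    """
    return "%s/%s/MESAN_%s+000H00M" % (
        BASE_URL,
        ts.strftime("%Y%m"),        # folder, e.g. 201604
        ts.strftime("%Y%m%d%H%M"),  # timestamp, e.g. 201604050000
    )


def download_range(start, end, target_dir="grib_files"):
    """Download every hourly file in [start, end) into target_dir.

    Returns the list of local paths that were saved successfully.
    """
    os.makedirs(target_dir, exist_ok=True)
    paths = []
    for ts in hourly_timestamps(start, end):
        path = os.path.join(target_dir, "grib_" + ts.strftime("%Y_%m_%d_%H"))
        try:
            urllib.request.urlretrieve(build_url(ts), path)
        except OSError as err:  # URLError and HTTPError are subclasses of OSError
            print("skipped %s: %s" % (ts, err))
            continue
        paths.append(path)
    return paths
```

To start with a single day (24 files), something like paths = download_range(datetime(2016, 4, 5), datetime(2016, 4, 6)) should do. The files are written to disk, so once you have processed them you can free the space again by calling os.remove(path) for each entry in paths.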
AcK