The error tells you exactly what went wrong: you don't have permission to write to the path D:/Thesis/test_http_dl.
There are four possible reasons for that:
- You already have a file with that name, which you don't have write access to.
- You don't have access to create new files in D:\Thesis.
- You don't have write access to the D: drive at all (e.g., because it's a CD-ROM).
- Some other process has the file open for exclusive access.
You need to look at the ACLs for D:\Thesis\test_http_dl if it exists, or for D:\Thesis\ otherwise, and see whether your user (the one you're running the script as) has write access. Also check whether that path or the D: drive itself has the "read-only" flag set, and whether any other process has the file open. (I don't know of any built-in tool for that last one, but handle or Process Explorer from Sysinternals can do it for you easily.)
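If you'd rather probe from Python before digging into the ACLs, something like the following rough sketch can narrow things down. The helper name diagnose_write is made up for illustration, and it only reports what the OS tells it; it can't distinguish an ACL problem from a read-only flag or an exclusive lock.

```python
import errno
import os


def diagnose_write(path):
    """Try to open `path` for writing and describe what happened.

    Hypothetical helper for illustration only.
    """
    parent = os.path.dirname(path) or "."
    if not os.path.isdir(parent):
        return "parent directory %r does not exist" % parent

    existed = os.path.exists(path)
    try:
        # Append mode, so an existing file is not truncated by the probe.
        with open(path, "a"):
            pass
    except (IOError, OSError) as exc:
        code = errno.errorcode.get(exc.errno, "unknown")
        target = "existing file" if existed else "new file in %r" % parent
        return "cannot open %s: %s (%s)" % (target, exc.strerror, code)

    if not existed:
        os.remove(path)  # clean up the empty probe file we just created
    return "ok"
```

Calling diagnose_write("D:/Thesis/test_http_dl") should tell you at least whether the problem is with an existing file or with creating a new one in the directory.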
Meanwhile, none of the stuff with urllib2 is at all relevant here. You can verify that by just doing this:
open("D:/Thesis/test_http_dl", "w")
You will get the exact same exception.
It's worth knowing how to figure that out the "hard" way, for cases where the exception doesn't tell you exactly what's wrong. You get an exception in a line like this:
open("D:/Thesis/test_http_dl", "w").write(ul)
Something is wrong, and if you don't have enough information to tell what it is, what do you do? Well, first, break it into pieces, so each line has exactly one operation:
f = open("D:/Thesis/test_http_dl", "w")
f.write(ul)
Now you know which one of those two gets an exception.
While you're at it, since the only thing this code depends on is ul, you can create a simpler program to test this:
ul = 'junk'
f = open("D:/Thesis/test_http_dl", "w")
f.write(ul)
Even if that doesn't help you directly, it means you don't need to wait for the download every time through the test loop, you've got something simpler to post to SO (see SSCCE for more), and you've got something you can just type into the interactive interpreter. Instead of trying to guess what might be useful to print out to see why the write is raising an exception, you can start with help(f) or dir(f) and play with it live. (In this case, I'm guessing it's actually the open that fails, not the write, but you shouldn't have to guess.)
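If you want the script itself to say which step failed, a minimal sketch is to wrap each operation in its own try block. The function name find_failing_step is hypothetical; the traceback gives you the same information once the code is split across lines.

```python
def find_failing_step(path, data):
    """Run open and write separately and report which one raised.

    Returns (step_name, exception), or (None, None) if both succeeded.
    """
    try:
        f = open(path, "w")
    except (IOError, OSError) as exc:
        return ("open", exc)

    try:
        # Leaving the with block closes the file, which flushes buffered
        # data, so errors that only surface at flush time land here too.
        with f:
            f.write(data)
    except (IOError, OSError) as exc:
        return ("write", exc)

    return (None, None)
```

For example, find_failing_step("D:/Thesis/test_http_dl", "junk") returns a pair whose first element names the failing step.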
On to your second problem:
urllib.urlretrieve() just creates a 1 kb file in the folder, which obviously is not the downloaded file.
Actually, I think it is the downloaded file. You're not asking for AF_eMTH_NDVI.2012.047-056.QKM.COMPRES.005.2012059143841.zip, you're asking for AF_eMTH_NDVI.2012.047-056.QKM.COMPRES.005.2012059143841.zip.sum, which is probably a checksum file: a quasi-standard type of file containing metadata that helps you make sure the file you're downloading wasn't damaged in transit or tampered with by a hacker. A typical checksum file has one or more lines, each mapping a downloadable file to a checksum or cryptographic hash digest in some format. Sometimes they have three columns: the type of checksum/hash, the value of the checksum/hash in some stringified format, and the filename or full URL of the file. Sometimes the first column is omitted, and you have to know from elsewhere what type of checksum/hash is being used (often MD5 as a hex string). Sometimes the columns are in different orders. Sometimes they're separated by commas or tabs, or in fixed-width fields, or some other variation.
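To make that concrete, here is a rough sketch of reading one such line and checking a file against it. It assumes the simplest two-column layout (hex digest, whitespace, filename) and assumes the digest is MD5; both are guesses you'd have to confirm for your server. The function names are made up for illustration.

```python
import hashlib


def parse_checksum_line(line):
    """Split a '<hex digest> <filename>' line into (digest, filename)."""
    digest, filename = line.strip().split(None, 1)
    return digest.lower(), filename


def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file without loading it all at once."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_checksum(path, checksum_line):
    """True if the file at `path` has the digest named in `checksum_line`."""
    expected, _filename = parse_checksum_line(checksum_line)
    return md5_of_file(path) == expected
```

You would download both the .zip and the .zip.sum, then call matches_checksum on the zip with the contents of the .sum file.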
At any rate, you'd expect a .sum file to be around 80 bytes long. If you look at it in Explorer or with the dir command, the size will usually be rounded up to the nearest 1K. So, you should see a 1K file if you download this successfully.
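The rounding is just a ceiling division, so anything from 1 to 1024 bytes displays as 1K. A tiny sketch (displayed_kb is a made-up name; os.path.getsize is the real call for the exact byte count):

```python
import math
import os


def displayed_kb(size_bytes):
    """Round a byte count up to whole kilobytes, as a file listing might."""
    return int(math.ceil(size_bytes / 1024.0))


def exact_and_displayed(path):
    """Return (exact size in bytes, size rounded up to whole KB)."""
    size = os.path.getsize(path)
    return size, displayed_kb(size)
```

Checking the exact size this way tells you whether your "1 kb file" is really an 80-byte-ish checksum file or something else.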
Meanwhile:
print(repr(ul[:60])) is '\n
You should try printing out the rest of this, because it's probably some kind of document explaining, in human terms, what you're doing wrong. This could be because you need to pass a User-Agent, a preferred encoding, a referer, or some other header.
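If it does turn out that the server wants extra headers, a minimal sketch of attaching them looks like this. The header values and the build_request name are placeholders, not something the server is known to require; the import falls back to urllib2 for the Python 2 setup in the question.

```python
try:
    from urllib.request import Request  # Python 3
except ImportError:
    from urllib2 import Request  # Python 2, as in the question


def build_request(url, user_agent="Mozilla/5.0", referer=None):
    """Build a Request carrying a User-Agent and, optionally, a Referer.

    The default header value is an illustrative guess.
    """
    headers = {"User-Agent": user_agent}
    if referer:
        headers["Referer"] = referer
    return Request(url, headers=headers)
```

You would then pass the result to urlopen in place of the bare URL, for example ul = urlopen(build_request(url)).read().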
However, I tested the exact same line of code you used repeatedly, and ul is always:
1ba6437044bfa9259fa2d3da8f95aebd AF_eMTH_NDVI.2012.047-056.QKM.COMPRES.005.2012059143841.zip
In other words, it's a perfectly valid checksum file, not an HTML page. So, I suspect what's really going on is that you aren't testing the same code you're showing us.
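One quick way to tell the two cases apart in your own run is to test whether the response looks like checksum lines at all. This sketch assumes the 32-hex-digit MD5 layout seen in my result; looks_like_md5_sum is a hypothetical helper, and a different hash type would need a different pattern.

```python
import re

# 32 hex digits, whitespace, then a non-empty filename.
MD5_LINE = re.compile(r"^[0-9a-fA-F]{32}\s+\S.*$")


def looks_like_md5_sum(text):
    """True if every non-blank line looks like '<md5 hex> <filename>'."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    return bool(lines) and all(MD5_LINE.match(ln) for ln in lines)
```

If looks_like_md5_sum(ul) is False, print the whole of ul and read what the server actually sent back.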