
I am using python 2.7 with the wget module. https://pypi.python.org/pypi/wget

The URL I download from is sometimes unresponsive. The download can take ages, and when this happens, wget times out. How can I get wget to never time out, or at least catch the timeout?

The code to download is simple.

wget.download(url_download)
    Any reason you can't just catch the timeout error and loop to try again (ideally with a sleep to back off, so your code can't accidentally DoS the server)? – ShadowRanger Oct 15 '16 at 01:59
  • Oops. Pardon my ignorance. I didn't know I could do that. How do I catch the timeout? I have edited the question accordingly. – guagay_wk Oct 15 '16 at 02:00
  • 2
    I'm not actually familiar with the `wget` PyPI package, so I don't know what it does in your timeout scenario. Presumably you've observed it happening; what did it do? Raise an exception? Return silently while leaving the local file empty, or setting a status code? Whatever it is, detect it, try again. – ShadowRanger Oct 15 '16 at 02:08
  • http://stackoverflow.com/questions/12624133/wget-with-python-time-limit – Simon Oct 15 '16 at 02:11
  • 1
    On further checking, it looks like you can pass `download` a callback that is updated with the status as you go. Otherwise, it's mostly a simple wrapper around [`urllib.urlretrieve`](https://docs.python.org/2/library/urllib.html#urllib.urlretrieve), so you only get the exception it raises (it will raise if there is a `Content-Length` header and the data received is shorter for instance). I see no real indication that it will do anything to timeout, which likely means it's just "whatever the socket library decides". `wget` is a really simple package, it's not designed for complex use cases. – ShadowRanger Oct 15 '16 at 02:15
  • 1
    @Simon: That's for the `wget` command line tool, which shares nothing but a name (and a few superficial display similarities) with the `wget` package, AFAICT. – ShadowRanger Oct 15 '16 at 02:16
  • @ShadowRanger, seems like wget is not flexible enough for the job. – guagay_wk Oct 15 '16 at 02:20
  • The accepted answer there is, but the solution furthest down seems more suitable than wget. I don't know, maybe wget is a good solution, but natively supported modules work better. – Simon Oct 15 '16 at 02:21
  • @downshift: Don't encourage use of `shell=True`, particularly if string formatting is involved. It's unnecessarily slow, unsafe, and a potential security hole. You should _always_ use `list` based invocation with the default `shell=False` unless there is a _very_ good reason not to (Hint: You're wrong, 99.999% of the time, there isn't a good reason to do so). `subprocess.call(['wget', '--timeout=0', url_download])` is actually shorter, safer, and faster. – ShadowRanger Oct 15 '16 at 03:37
  • Got it, thanks for the correction @ShadowRanger – chickity china chinese chicken Oct 15 '16 at 04:56
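The catch-and-retry approach suggested in the comments can be sketched like this. Two assumptions to verify for your setup: `wget.download` is (per the comments) a thin wrapper over `urllib.urlretrieve`, whose socket-level failures surface as `IOError` in Python 2, and `url_download` is the variable from the question. The exact exception raised may differ depending on the failure mode.

```python
import time


def retry_with_backoff(func, attempts=5, backoff=1.0, sleep=time.sleep):
    """Call func(); on IOError, wait with exponential backoff and retry.

    Re-raises the last IOError once all attempts are exhausted.
    """
    for attempt in range(attempts):
        try:
            return func()
        except IOError:
            if attempt == attempts - 1:
                raise
            # Wait longer after each failure so we don't hammer the server.
            sleep(backoff * (2 ** attempt))


# Hypothetical usage with the wget package:
# import wget
# retry_with_backoff(lambda: wget.download(url_download))
```

Injecting the `sleep` function keeps the retry logic testable without real waits; in production code you would leave it at the default.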

1 Answer


You could use requests instead:

import requests
requests.get(url_download)

If you don't specify a `timeout` argument, it never times out on its own.
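If you would rather bound how long a hung server can stall you, a minimal sketch (the function name is my own; note that `requests`' `timeout` applies to each socket read, not to the whole transfer):

```python
import requests


def fetch(url, timeout=60):
    """GET url, failing fast if the server stops sending data.

    timeout bounds each connect/read on the socket; a server that
    trickles data slowly can still take longer than `timeout` overall.
    """
    response = requests.get(url, timeout=timeout, stream=True)
    response.raise_for_status()
    chunks = []
    for chunk in response.iter_content(chunk_size=8192):
        chunks.append(chunk)
    return b"".join(chunks)
```

A `requests.exceptions.Timeout` is raised when the limit is hit, which you can catch and combine with the retry loop suggested in the comments above.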
