I am trying to download a file from the web in Python. I have seen solutions that use urllib2 (for Python 2) and solutions that use wget. If my only goal is to download the file, is there any advantage to using urllib2 instead of wget? To me, the wget package seems simpler. However, most of the solutions I find online use urllib2, or urllib with Python 3. I am leaning towards wget because it works with both Python 2 and Python 3.

This question differs from the one it has been marked as a duplicate of: I am asking how urllib compares with wget, whereas the other question does not address the wget package at all.

Gaurav Srivastava
  • `wget` requires you to shell out to an external application. Python programs should always prefer to call Python libraries rather than shelling out. – Jonathon Reinhart Oct 23 '18 at 19:36
  • the `requests` library is great as well, available for Python 2 & 3 – Sam Mason Oct 23 '18 at 19:39
  • [This answer](https://stackoverflow.com/a/17510727/119527) shows exactly how to use `urllib` from Python 2 or 3. – Jonathon Reinhart Oct 23 '18 at 19:41
  • @JonathonReinhart What do you mean by shelling out? Does it mean that wget runs a shell command behind the scenes? wget is also a standard Python library, right? If so, how does one tell a Python library from a non-Python one? In any case, from what I have read so far, I gather that wget is not a good option. – Gaurav Srivastava Oct 23 '18 at 22:55
  • My mistake. I didn't realize this existed: [`wget`](https://pypi.org/project/wget/), the "pure python library". I thought you were referring to [`wget`](http://man7.org/linux/man-pages/man1/wget.1.html), the command-line utility. Regardless, `requests` is much more common. – Jonathon Reinhart Oct 23 '18 at 23:12
  • @wim The linked duplicate question says nothing about the usefulness of urllib compared with wget, which is what I am asking. So why has the question been marked as a duplicate? – Gaurav Srivastava Oct 23 '18 at 23:32
  • I did not mark it as a duplicate; another user did. It's a bug in stackexchange that I was listed :( – wim Oct 28 '18 at 19:33

1 Answer

If you use wget, you will end up writing far more code to decode the errors it reports than you would need with a Python library.

As for urllib: first of all, are you sure you really need to support Python 2? Python 2 is obsolete.

If you really believe that you do, then you can use a compatibility library such as six or future.
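For a plain download, a compatibility library may not even be necessary. The following is a minimal sketch, assuming the standard-library `urlretrieve` function is enough for your case; the `download` helper name is my own and purely illustrative. The function lives in `urllib.request` on Python 3 and in `urllib` on Python 2, so a guarded import covers both. With six installed, the same function should also be importable from `six.moves.urllib.request`.

```python
try:
    # Python 3 location
    from urllib.request import urlretrieve
except ImportError:
    # Python 2 location
    from urllib import urlretrieve


def download(url, dest_path):
    """Save the resource at `url` to `dest_path` and return the written path.

    `download` is a hypothetical helper name used only for this example.
    """
    saved_path, _headers = urlretrieve(url, dest_path)
    return saved_path
```

A call would look like `download("https://example.com/file.zip", "file.zip")`, where the URL is a placeholder.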

You should also consider alternatives: the requests library is superior to urllib and provides the same interface on Python 2 and 3.
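To make the requests suggestion concrete, here is a rough sketch of a streaming download. It assumes requests is installed (`pip install requests`); the `download_file` name, the chunk size, and the timeout are illustrative choices of mine, not anything prescribed by the library. Calling `raise_for_status()` turns an HTTP error status into a Python exception, so there is no error output to parse by hand.

```python
import requests


def download_file(url, dest_path, chunk_size=8192, timeout=30):
    """Stream `url` to `dest_path`; raise requests.HTTPError on a 4xx/5xx status.

    `download_file`, `chunk_size` and `timeout` are hypothetical names and
    defaults chosen for this example.
    """
    with requests.get(url, stream=True, timeout=timeout) as response:
        # Fail loudly instead of silently saving an error page.
        response.raise_for_status()
        with open(dest_path, "wb") as out_file:
            # Write the body in chunks so large files are not held in memory.
            for chunk in response.iter_content(chunk_size=chunk_size):
                out_file.write(chunk)
    return dest_path
```

Usage would be along the lines of `download_file("https://example.com/file.zip", "file.zip")`, with a placeholder URL, wrapped in `try`/`except requests.RequestException` if you want to handle network failures yourself.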

  • @Anitti After some more reading, I am convinced that I should use requests. However, I am curious what kinds of errors you are talking about, and why they would be difficult to debug with wget when it is available on PyPI and should be up to date. Could you give an example? Also, I saw a few answers suggesting not to use urllib because it is less mature than wget (https://stackoverflow.com/questions/2777116/difference-between-python-urllib-urlretrieve-and-wget). Could you comment on this point? – Gaurav Srivastava Oct 23 '18 at 22:52