How do I download and read the CSV data on this website using Python?
- http://stackoverflow.com/questions/419260/grabbing-text-from-a-webpage – dsgriffin Mar 21 '13 at 00:19
- Isn't it simple CSV? What have you tried? – shinkou Mar 21 '13 at 00:19
- Read about the [urllib2](http://docs.python.org/2/library/urllib2.html?highlight=urllib2#urllib2) module for downloading the page and the [csv](http://docs.python.org/2/library/csv.html) module for parsing the data. – isedev Mar 21 '13 at 00:20
- Since you're on Python 3.x, make that the [urllib.request](http://docs.python.org/3/library/urllib.request.html) module and the [csv](http://docs.python.org/3/library/csv.html) module. – abarnert Mar 21 '13 at 00:37
1 Answer
It depends on what you want to do with the data. If you simply want to download the data, you can use urllib2 (Python 2 only).
    import urllib2
    downloaded_data = urllib2.urlopen('http://...')
    for line in downloaded_data.readlines():
        print line
If you need to parse the CSV, you can combine urllib2 (urllib.request on Python 3) with the csv module.
Python 2.X
    import csv
    import urllib2
    downloaded_data = urllib2.urlopen('http://...')
    csv_data = csv.reader(downloaded_data)
    for row in csv_data:
        print row
Python 3.X
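    import csv
    import io
    import urllib.request
    downloaded_data = urllib.request.urlopen('http://...')
    # urlopen() yields bytes in Python 3, but csv.reader expects text,
    # so decode first (UTF-8 is an assumption about the file's encoding)
    text_data = io.TextIOWrapper(downloaded_data, encoding='utf-8')
    csv_data = csv.reader(text_data)
    for row in csv_data:
        print(row)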
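If you would rather keep a local copy of the file and parse it afterwards, something along these lines might work on Python 3. This is only a sketch: the helper names `download_csv` and `read_csv`, the target path, and the UTF-8 default are assumptions for illustration, not part of the answer above.

```python
import csv
import shutil
import urllib.request


def download_csv(url, path):
    """Save the raw bytes served at `url` to the local file `path`."""
    with urllib.request.urlopen(url) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)


def read_csv(path, encoding="utf-8"):
    """Return every row of the CSV file at `path` as a list of strings."""
    # newline="" lets the csv module handle line endings itself.
    with open(path, newline="", encoding=encoding) as handle:
        return list(csv.reader(handle))
```

You would call `download_csv('http://...', 'data.csv')` once and then `read_csv('data.csv')` as often as needed, which avoids hitting the site again on every run.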

eandersson
- And you should really use urllib2 ;) -- and I know the feeling, I can't help myself either sometimes... – isedev Mar 21 '13 at 00:23
- I get an error saying that urllib2 doesn't exist as a module, which is really strange. – user2117875 Mar 21 '13 at 00:27
- Hah. Yeah, bad tendency when writing code directly in my browser. :p – eandersson Mar 21 '13 at 00:29
- ...and parse the date then bring it up on google maps?! Actually, that would be kinda cool. – tdelaney Mar 21 '13 at 00:29
- @user2117875 Try replacing `urllib2` with `urllib`, but I would recommend that you install the `urllib2` library, but that is a different topic. – eandersson Mar 21 '13 at 00:30
- That's exactly the plan, @tdelaney. As I said, urllib2 is giving me trouble for some reason; I get an import error. – user2117875 Mar 21 '13 at 00:31
- I suspect the problem is that the OP is on Python 3, not that he's on Python 1. So just replacing `urllib2` with `urllib` will not solve the problem. You need to use the new renamed version in `urllib.request`. – abarnert Mar 21 '13 at 00:35
- Take a look at this answer http://stackoverflow.com/questions/3969726/attributeerror-module-object-has-no-attribute-urlopen and then make sure to read http://stackoverflow.com/about – eandersson Mar 21 '13 at 00:35
- @user2117875: Meanwhile, if you had followed the link to `urllib2`, or searched for the docs yourself, you would have seen the big note at the top saying "The urllib2 module has been split across several modules in Python 3 named urllib.request and urllib.error" and figured this out, rather than making Fuji guess at your problem. – abarnert Mar 21 '13 at 00:36
- Yep, like @abarnert said. Make sure to read the http://stackoverflow.com/about and http://stackoverflow.com/faq pages. And no worries. – eandersson Mar 21 '13 at 00:38
- It reads in the data OK from the website, but it has issues with this line: `for row in csv_data:`. The error is 'the iterator should return strings not bytes'. I've tried it with DictReader too, and it gives the same error. – user2117875 Mar 21 '13 at 00:52
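The "strings not bytes" error reported in the last comment comes up when `csv.reader` or `csv.DictReader` is handed the raw `urlopen()` response, because in Python 3 that response yields `bytes` while the csv module expects `str` lines. One possible fix is to wrap the binary stream in `io.TextIOWrapper` first. The sketch below uses an in-memory `io.BytesIO` as a stand-in for the response so it runs offline; the helper name `rows_from_binary`, the sample columns, and the UTF-8 encoding are all assumptions.

```python
import csv
import io


def rows_from_binary(stream, encoding="utf-8"):
    """Parse CSV rows from a binary file-like object such as a urlopen() response."""
    # Decode bytes to text on the fly; newline="" is what the csv module expects.
    text = io.TextIOWrapper(stream, encoding=encoding, newline="")
    return list(csv.DictReader(text))


# Stand-in for the downloaded response; the column names are made up.
fake_response = io.BytesIO(b"place,magnitude\nA,4.2\nB,5.1\n")
rows = rows_from_binary(fake_response)
print(rows[0]["place"])
```

With a real download, `rows_from_binary(urllib.request.urlopen('http://...'))` should behave the same way, provided the site really serves UTF-8; otherwise pass the correct `encoding`.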