
My current professor uses Python 2.7 for examples in class, but professors I will be taking classes from in the future have suggested I use Python 3.5. I am trying to convert my current professor's examples from 2.7 to 3.5. Right now I'm having an issue with the urllib2 module, which I understand has been split into several modules in Python 3.

The original code in the IPython notebook looks like this:

import csv
import urllib2

data_url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data'
response = urllib2.urlopen(data_url)

myreader = csv.reader(response)
for i in range(5):
    row = next(myreader)
    print ','.join(row)

Which I have converted to:

import csv
import urllib.request

data_url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data'
response = urllib.request.urlopen(data_url)
myreader = csv.reader(response)
for i in range(5):
    row = next(myreader)
    print(','.join(row))

But that leaves me with the error:

Error                                     Traceback (most recent call last)
<ipython-input-19-20da479e256f> in <module>()
      7 myreader = csv.reader(response)
      8 for i in range(5):
----> 9     row = next(myreader)
     10     print(','.join(row))

Error: iterator should return strings, not bytes (did you open the file in text mode?)

I'm unsure how to proceed from here. Any ideas?

Mohammad Yusuf
JMac
  • Would you consider using [`requests`](http://docs.python-requests.org/en/master/) package? – Mohammad Yusuf Jan 18 '17 at 15:03
  • You have to convert the `bytes` returned by the `urlopen` to `str`. `.decode()` does this. Take a look at: http://stackoverflow.com/questions/6224052/what-is-the-difference-between-a-string-and-a-byte-string. – Ma0 Jan 18 '17 at 15:10
  • Possible duplicate of [Read .csv file from URL into Python 3.x - \_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)](http://stackoverflow.com/questions/18897029/read-csv-file-from-url-into-python-3-x-csv-error-iterator-should-return-str) – fedorshishi Jan 18 '17 at 15:13

1 Answer

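In Python 3, urlopen returns a response that yields bytes, while csv.reader expects strings. Wrap the response in another iterator that decodes each line from bytes to a string and yields it: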

import csv
import urllib.request

def decode_iter(it):
    # iterate line by line
    for line in it:
        # convert bytes to string using `bytes.decode` (UTF-8 by default)
        yield line.decode()

data_url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data'
response = urllib.request.urlopen(data_url)
myreader = csv.reader(decode_iter(response))
for i in range(5):
    row = next(myreader)
    print(','.join(row))
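
To see what the wrapper does without touching the network, here is a minimal sketch that feeds decode_iter from an in-memory io.BytesIO instead of the real response. The sample bytes and the name fake_response are made up for illustration; the assumption is only that, like the response object, a BytesIO yields bytes lines when iterated.

```python
import csv
import io

def decode_iter(it):
    # Decode each bytes line to str (UTF-8 by default) before csv sees it
    for line in it:
        yield line.decode()

# Hypothetical stand-in for urllib.request.urlopen(data_url); not real data
fake_response = io.BytesIO(b"39, State-gov, 77516\n50, Private, 83311\n")

rows = list(csv.reader(decode_iter(fake_response)))
for row in rows:
    print(','.join(row))
```

Passing fake_response straight to csv.reader, without decode_iter, should reproduce the "iterator should return strings, not bytes" error from the question.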

UPDATE

Instead of decode_iter, you can use codecs.iterdecode from the standard library:

import csv
import codecs
import urllib.request

data_url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data'
response = urllib.request.urlopen(data_url)
myreader = csv.reader(codecs.iterdecode(response, 'utf-8'))
for i in range(5):
    row = next(myreader)
    print(','.join(row))
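
Another option, not part of the two approaches above, is to wrap the byte stream in io.TextIOWrapper so that it behaves like a file opened in text mode. This is a minimal sketch under the assumption that an in-memory io.BytesIO (here called fake_response, with made-up contents) stands in for the object returned by urllib.request.urlopen; the same wrapping should apply to the real response, since it is also a binary stream.

```python
import csv
import io

# Hypothetical stand-in for urllib.request.urlopen(data_url); not real data
fake_response = io.BytesIO(b"a,b\nc,d\n")

# newline='' lets the csv module handle line endings itself
text_stream = io.TextIOWrapper(fake_response, encoding='utf-8', newline='')

rows = list(csv.reader(text_stream))
for row in rows:
    print(','.join(row))
```

The encoding is stated explicitly here rather than relying on a platform default; 'utf-8' is an assumption about the data, matching the encoding used with codecs.iterdecode above.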
falsetru