I am using urllib.urlopen with Python 2.7 to read csv files located on an external webserver:

# Try & Except statements removed for clarity
import urllib
import csv
url = ...
csv_file = urllib.urlopen(url)
for row in csv.reader(csv_file):
    do_something()

All 100+ files can be read fine, except one that has been updated recently and that returns:

Error: new-line character seen in unquoted field - do you need to open the file in universal-newline mode?

The file is accessible here. According to my text editor, its mode is Mac (CR), as opposed to Windows (CRLF) for the other files.
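One way to check the newline style without relying on a text editor is to count the line-ending sequences in the raw bytes of the download. The following is only a minimal sketch: `count_newlines` is a hypothetical helper (not part of any library), and the sample bytes are made up for illustration.

```python
def count_newlines(data):
    """Count CRLF, bare CR and bare LF sequences in a byte string."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair contains one CR and one LF, so subtract the pairs
    # to get the number of bare occurrences.
    bare_cr = data.count(b"\r") - crlf
    bare_lf = data.count(b"\n") - crlf
    return {"CRLF": crlf, "CR": bare_cr, "LF": bare_lf}


# Made-up sample with Mac-style (CR only) line endings
print(count_newlines(b"a,b\rc,d\r"))
```

A file whose counts are dominated by bare CR is one situation in which the csv error above can appear, because the reader may receive the whole file as a single "line".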

Based on this thread, I understood that Python's urlopen correctly handles all newline formats, so the problem likely comes from somewhere else, though I have no clue where. The file opens fine in all my text editors and spreadsheet editors.

Does anyone have any idea how to diagnose the problem?

* EDIT *

The creator of the file informed me by email that I was not the only one to experience such issues, so he decided to generate it again. The code above now works fine again. Unfortunately, the new file also means that the issue can no longer be reproduced and the proposed solutions cannot be tested properly.

Before closing the question, I want to thank all the stackers who dedicated some of their time to figure out a solution and post it here.

Mathieu
  • This sounds like an error from the `csv` module, which handles things like delimiters and quoting. The `urllib` module probably works fine; try `for row in csv_file:` instead to confirm. It sounds like your csv file is corrupted, or that you need to configure your `csv` reader to handle the type of quoting you need. – Anders Johansson Jan 19 '13 at 10:50
  • @AndersJohansson: Based on the email I received from the owner of the file, you guessed right; the file was corrupted. As explained above, I didn't have time to test your solution, though. – Mathieu Jan 19 '13 at 16:05
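
The check suggested in the first comment, separating "the lines arrive" from "the csv module can parse them", could be sketched roughly as follows. `locate_csv_error` is a hypothetical helper, and the sample lines are made up; in practice the `lines` argument would be whatever iterable is passed to `csv.reader`.

```python
import csv


def locate_csv_error(lines):
    """Return (line_num, message) for the first csv.Error, or None if parsing succeeds."""
    reader = csv.reader(lines)
    try:
        for _row in reader:
            pass
    except csv.Error as exc:
        # line_num counts the lines pulled from the source iterator so far
        return reader.line_num, str(exc)
    return None


# A bare CR inside what the iterator hands over as one "line" should trigger the error
print(locate_csv_error(["a,b\rc,d"]))
# Properly split lines should parse without complaint
print(locate_csv_error(["a,b\n", "c,d\n"]))
```

If iterating over the raw lines works but this helper reports an error, the problem lies in the content or line endings of the file rather than in the download itself.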

3 Answers

The following code runs without any error:

#!/usr/bin/env python
import csv
import urllib2

r = urllib2.urlopen('http://www.football-data.co.uk/mmz4281/1213/I1.csv')
for row in csv.reader(r):
    print row
jfs

It might be a corrupt .csv file? Otherwise, this code runs perfectly.

#!/usr/bin/python

import urllib
import csv

url = "http://www.football-data.co.uk/mmz4281/1213/I1.csv"
csv_file = urllib.urlopen(url)

for row in csv.reader(csv_file):
  print row

Credits to J.F. Sebastian for the .csv file.

Also, you might want to consider sharing the specific .csv file with us, so we can try to re-create the error.

Stian Olsen
  • As Anders, you were right: the file was corrupted. "Unfortunately" (well...), this corrupted file has been replaced and everything now works fine with the same old code as before. – Mathieu Jan 19 '13 at 16:07
  • Glad you figured it out. Also, you should use urllib2 as Sebastian pointed out earlier. – Stian Olsen Jan 19 '13 at 20:59

I was having the same problem with a downloaded csv.

I know the fix would be to use open with 'rU'. But I would rather not have to save the file to disk just to open it back up into a variable. That seems unnecessary.

file = open(filepath,'rU')
mydata = csv.reader(file)

So if someone has a better solution, that would be nice. Stack Overflow links that got me this far:

CSV new-line character seen in unquoted field error

Open the file in universal-newline mode using the CSV Django module



I found what I actually wanted with StringIO, cStringIO, or io:

Using Python, how do I to read/write data in memory like I would with a file?

I ended up getting io working,

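import csv
import urllib2
import io
# warning: it's a 20MB csv
url = 'http://poweredgec.com/latest_poweredge-11g.csv'
urlRead = urllib2.urlopen(url).read()
# normalize CRLF and bare CR to LF, similar to what 'rU' does for a file on disk
urlRead = urlRead.replace('\r\n', '\n').replace('\r', '\n')
# wrap the downloaded bytes in an in-memory file object instead of saving to disk
ramFile = io.BytesIO(urlRead)
csvCurrent = csv.reader(ramFile)
csvTuple = map(tuple, csvCurrent)

print csvTuple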
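
For comparison, here is a minimal sketch of the same in-memory idea written for Python 3, where `urllib2` no longer exists. It is an illustration under assumptions, not a tested replacement for the code above: `rows_from_bytes` is a hypothetical helper, the UTF-8 default encoding is a guess, and the sample bytes stand in for the result of `urllib.request.urlopen(url).read()`.

```python
import csv
import io


def rows_from_bytes(raw, encoding="utf-8"):
    """Parse CSV bytes held in memory, treating CR, LF and CRLF as line ends."""
    # newline=None turns on universal-newline translation while reading,
    # so nothing has to be written to disk and reopened with 'rU'.
    text = io.TextIOWrapper(io.BytesIO(raw), encoding=encoding, newline=None)
    return [tuple(row) for row in csv.reader(text)]


# Made-up stand-in for downloaded data with CR-only line endings
sample = b"a,b\rc,d\r"
print(rows_from_bytes(sample))
```

One trade-off of translating newlines this way is that a line break inside a quoted field is normalized to LF as well, which may or may not matter for a given file.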
TaylorSanchez