5

I'm trying to run the code below, but for some reason I get the following error: "TypeError: limit must be an integer".

Reading csv data file

import sys
import csv

maxInt = sys.maxsize
decrement = True

while decrement:
    decrement = False
    try:
        **csv.field_size_limit(maxInt)**
    except OverflowError:
        maxInt = int(maxInt/10)
        decrement = True

with open("Data.csv", 'rb') as textfile:
    text = csv.reader(textfile, delimiter=" ", quotechar='|')
    for line in text:
        print ' '.join(line)

The error occurs in the starred line. I have only added the extra bit above the csv read statement because the file was too large to read normally. Alternatively, I could change the file from CSV to a text file, but I'm not sure whether this would corrupt the data further. I can't actually see any of the data, as the file is >2 GB and hence costly to open.

Any ideas? I'm fairly new to Python but I'd really like to learn a lot more.

Black
  • Cannot reproduce. Code compiles and returns no error here. – lejlot Sep 14 '13 at 06:42
  • I'm not having a problem calling `csv.field_size_limit(sys.maxsize)` – Greg Sep 14 '13 at 06:43
  • What operating system does this happen on? – Mark Roberts Sep 14 '13 at 07:00
  • Windows 7, using a Python IDE called 'Enthought Canopy' which has a bunch of packages included with it. Currently, `csv.field_size_limit(maxInt)` flags the limit error referred to previously, even when I run it by itself. Is there a way to check that the current install is correct? – Black Sep 14 '13 at 07:40

2 Answers

4

I'm not sure whether this qualifies as an answer or not, but here are a few things:

First, the csv reader buffers the file one line at a time rather than loading it all at once, so the file size shouldn't matter too much, 2 KB or 2 GB, whatever.

What might matter is the number of columns or amount of data inside the fields themselves. If this CSV contains War and Peace in each column, then yeah, you're going to have an issue reading it.

Some ways to potentially debug are to run `print sys.maxsize`, and to just open up a Python interpreter, `import sys, csv`, and then run `csv.field_size_limit(sys.maxsize)`. If you get some terribly small number or an exception, you may have a bad install of Python. Otherwise, try a simpler version of your file: maybe just the first line, or the first several lines and a single column. See if you can reproduce the smallest possible case and remove the variability of your system and the file size.
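
For example, a quick sanity check in a fresh interpreter session might look something like this (just a sketch; the exact numbers depend on your Python build):

import sys
import csv

print sys.maxsize                   # typically 9223372036854775807 on a 64-bit build
print csv.field_size_limit()        # the current limit; 131072 by default
csv.field_size_limit(sys.maxsize)   # should return the old limit rather than raise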

Jordan
    +1. I'd say this qualifies as a pretty good answer since the bug is un-reproducible. – Ayush Sep 14 '13 at 06:53
  • Hi Jordan, thank you for the help. I get an exception for the following code: `import csv; csv.field_size_limit(sys.maxsize)`. Other than a corrupt install, could there be another reason why the above is returning the same type error? – Black Sep 14 '13 at 07:29
  • @Blackholify, what is your exception, and on which line, and what is the value of your sys.maxsize? – Jordan Sep 15 '13 at 20:43
0

On Windows 7 64-bit with Python 2.6, `maxInt = sys.maxsize` returns `9223372036854775807L`, which consequently results in a `TypeError: limit must be an integer` when calling `csv.field_size_limit(maxInt)`. Interestingly, using `maxInt = int(sys.maxsize)` does not change this. A crude workaround is to simply use `csv.field_size_limit(2147483647)`, which of course causes issues on other platforms. In my case this was adequate to identify the broken value in the CSV, fix the export options in the other application, and remove the need for `csv.field_size_limit()`.

-- originally posted by user roskakori on this related question
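
For what it's worth, a minimal sketch of a more portable variant of the question's decrement loop (an untested suggestion, not part of the original answer) would be to catch the TypeError described above alongside the OverflowError, so the loop keeps shrinking the value until the csv module accepts it:

import sys
import csv

# Shrink sys.maxsize until csv.field_size_limit() accepts it. As described
# above, Windows builds of Python 2 report TypeError rather than OverflowError
# when the value does not fit in a C long, so catch both.
maxInt = sys.maxsize
while True:
    try:
        csv.field_size_limit(maxInt)
        break
    except (OverflowError, TypeError):
        maxInt = int(maxInt / 10)

That said, if you only ever run on Windows, hard-coding `csv.field_size_limit(2147483647)` as above is simpler.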

MrMcPlad