
The usual fix for _csv.Error: field larger than field limit (131072) is not solving my problem.

I have a script that processes CSV files into Excel reports. The script worked just fine until one particular CSV file grew quite large (currently > 12 MB).

The script usually runs on 64-bit Windows 7, since the team uses Windows clients. Python versions range from 3.6 to 3.7.2, all 64-bit. All versions produce the error.

The error I'm getting in the first place is

_csv.Error: field larger than field limit (131072)

which, judging by the search function, seems easy to fix. But when I add

csv.field_size_limit(sys.maxsize)

it only makes things worse:

Traceback (most recent call last):
  File "CSV-to-Excel.py", line 123, in <module>
    report = process_csv_report(infile)
  File "CSV-to-Excel.py", line 30, in process_csv_report
    csv.field_size_limit(sys.maxsize)
OverflowError: Python int too large to convert to C long

According to my research, that bug should have long since been fixed.

My current workaround is to run the script on Linux, where it simply works. However, the team that needs to run the script can't use Linux and is locked to Windows.
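The platform difference can be seen directly: even on 64-bit Windows, a C long is only 32 bits, so `sys.maxsize` does not fit into the C integer that `csv.field_size_limit()` uses internally. A minimal diagnostic sketch (not part of the original script):

```python
import ctypes
import sys

# On 64-bit Python builds sys.maxsize is 2**63 - 1, but the C long that
# the compiled _csv extension uses is only 4 bytes on Windows (8 on most
# 64-bit Unix systems) - hence the OverflowError on Windows only.
print(sys.maxsize)                   # 9223372036854775807 on 64-bit builds
print(ctypes.sizeof(ctypes.c_long))  # 4 on Windows, 8 on most 64-bit Unix
```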

The code of the script is

#!c:\python37\python.exe

import csv
import sys


def process_csv_report(CSV_report_file):
    files = []
    files.append(CSV_report_file+"_low.csv")
    files.append(CSV_report_file+"_med.csv")
    files.append(CSV_report_file+"_high.csv")
    first = True
    try:
        report = []
        for f in files:
            if first:
                with open(f, "r", newline='', encoding='utf-8') as csvfile:
                    original = csv.reader(csvfile, delimiter=',', quotechar='"')
                    for row in original:
                        report.append(row)
                first = False
            else:
                with open(f, "r", newline='', encoding='utf-8') as csvfile:
                    original = csv.reader(csvfile, delimiter=',', quotechar='"')
                    # for the second and third file skip the header line
                    next(original, None)
                    for row in original:
                        report.append(row)
    except Exception as e:
        print("File I/O error! File: {}; Error: {}".format(f, str(e)))
        sys.exit(1)
    return report


if __name__ == "__main__":
    # base name of the CSV report, passed on the command line
    infile = sys.argv[1]
    report = process_csv_report(infile)

As simple as it seems, I am lost: the solution that works for others fails here for no reason I can see.

Has anybody seen this happen lately with a recent version of Python?

Michael
  • https://stackoverflow.com/questions/15063936/csv-error-field-larger-than-field-limit-131072 solution does not work, as described here. – Michael Feb 04 '19 at 11:29
  • Have you tried **all** answers in that question. Some contributors suggest it's because you have quotes in your text. They also suggest ways to workaround the `OverflowError`. – Alastair McCormack Feb 04 '19 at 11:32
  • 2
    Just added an answer: https://stackoverflow.com/questions/15063936/csv-error-field-larger-than-field-limit-131072/54517228#54517228. – CristiFati Feb 04 '19 at 13:34
  • I voted to close as a duplicate, because also the answer that is provided here is already included in the original question/answers – user1251007 Apr 03 '19 at 18:40

1 Answer


You could replace sys.maxsize with the C integer max value, which is 2147483647.

I know sys.maxsize should take care of it, but using a value below that ceiling, like 1,000,000, should resolve your issue.

A nicer way to do it would be min(sys.maxsize, 2147483646).

The _csv module is a compiled C extension, so it uses C integer types; on Windows a C long is 32 bits even on 64-bit Python, which is why sys.maxsize overflows there.

olinox14