I am using Ubuntu 18.04 with pandas==1.2.1.

My Excel file looks something like this:

seq           userid      point     .....
2.01e^+12       A        231231.15
2.012e^+12      B          123
2.0131e^+12     C           3
2.41e^+12       D         2312
2.41e^+12       E         31.15

max(seq) = 2.41e^+12, max(point) = 231231.15

When I run pd.read_excel("file_name.xlsx"), it raises the error message in the title.

From resources:

  1. OverflowError: Python int too large to convert to C long torchtext.datasets.text_classification.DATASETS['AG_NEWS']() -> tells me I need to set csv.field_size_limit to sys.maxsize; however, I couldn't find out how to change an equivalent excel.field_size_limit.

  2. "OverflowError: Python int too large to convert to C long" on windows but not mac -> tells me I need to set the seq and point columns to a float datatype, which I did using pd.read_excel("file_name.xlsx", converters={'seq': float, 'point': float}); however, that didn't fix it.

When I remove the two float columns, I can read the Excel file. How can I fix this error?
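Since the file reads once the two float columns are removed, one way to narrow this down further is to read each column on its own and record which ones raise. The following is only a minimal sketch: find_failing_columns and read_excel_column are hypothetical helper names, and the column names and file name are taken from the example above.

```python
from typing import Callable, Dict, Iterable, Optional


def find_failing_columns(
    read_column: Callable[[str], object],
    columns: Iterable[str],
) -> Dict[str, Optional[Exception]]:
    """Try to read each column separately; map column name -> error (or None)."""
    results: Dict[str, Optional[Exception]] = {}
    for name in columns:
        try:
            read_column(name)
            results[name] = None
        except (OverflowError, ValueError) as exc:
            # Keep the exception so the message can be inspected afterwards.
            results[name] = exc
    return results


def read_excel_column(path: str, name: str):
    """Hypothetical helper: read a single column of the sheet by its header."""
    import pandas as pd  # imported lazily; assumes pandas and an Excel engine are installed

    return pd.read_excel(path, usecols=[name])
```

It could be called as find_failing_columns(lambda c: read_excel_column("file_name.xlsx", c), ["seq", "userid", "point"]). Any column mapped to an OverflowError is the one worth inspecting cell by cell.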

haneulkim

1 Answer

If you check the options for importing data in the pandas documentation on pydata.org, you'll find options like convert_dates=True (for read_excel, the date-related option is parse_dates). You can toggle these on or off until your file reads, assuming the problematic data is a date.
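If toggling options does not help, another way to look for the problematic data is to scan the raw cell values for numbers that do not fit in a signed 64-bit integer, which is the size of a C long on 64-bit Linux. This is a rough sketch under that assumption, not a confirmed diagnosis of the file in the question; find_out_of_range and iter_sheet_values are hypothetical names, and the second helper assumes openpyxl is installed.

```python
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def find_out_of_range(rows):
    """Yield (row, column, value) for numeric cells outside the int64 range.

    Row and column indices are 1-based. Non-numeric cells (text, dates,
    booleans, empty cells) are skipped. inf and NaN floats are also reported,
    because they fail the range comparison.
    """
    for r, row in enumerate(rows, start=1):
        for c, value in enumerate(row, start=1):
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                continue
            if not (INT64_MIN <= value <= INT64_MAX):
                yield r, c, value


def iter_sheet_values(path):
    """Hypothetical helper: yield the rows of the active sheet as tuples of values."""
    from openpyxl import load_workbook  # imported lazily; assumes openpyxl is available

    workbook = load_workbook(path, read_only=True, data_only=True)
    yield from workbook.active.iter_rows(values_only=True)
```

For example, list(find_out_of_range(iter_sheet_values("file_name.xlsx"))) would list the suspicious cells, if there are any.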