11

What are all the exceptions that can be thrown by pd.read_csv()?

In the example below I am capturing some exception types explicitly and using a generic Exception to catch the others, but what are the others exactly?

Reviewing the documentation for pandas read_csv() I can't see a complete list of exceptions thrown.

In a more general case, what is the recommended practice to determine all of the types of exceptions that can be thrown by any call/library?

import pandas as pd

try:
    df = pd.read_csv("myfile.csv")
except FileNotFoundError:
    print("File not found.")
except pd.errors.EmptyDataError:
    print("No data")
except pd.errors.ParserError:
    print("Parse error")
except Exception:
    print("Some other exception")
MattG
  • I think it's important to ask yourself why you want to do this and whether it adds value to whatever you're doing. – cs95 Oct 11 '20 at 09:28
  • You can look here: https://github.com/pandas-dev/pandas/blob/master/pandas/io/parsers.py#L549-L717 , but I'm not sure why you would want to do this. – David Erickson Oct 11 '20 at 09:29
  • I see some highly ranked answers to other questions, such as this one, which recommend against capturing generic exceptions: https://stackoverflow.com/a/9824050/833960. They seem to imply that I should try to determine what exceptions could be thrown; I guess that is what I am asking. – MattG Oct 11 '20 at 09:44
  • My use case – a pipeline where you can import a CSV file, select columns, modify values, create visualisations, and export to another format like xlsx. Let's say a user sends an Excel file but with a `.csv` extension. I would like to check whether this is an eligible file to process, considering content, encodings, separators, etc. Without knowing which error occurred, I would be forced to show the user a very generic message. – pdaawr Jun 01 '21 at 10:14
  • @DavidErickson you should use `Copy permalink` when linking to files from VCS repositories. The link is dead now. Here is the current link that will show the `to_csv` version at the time of posting unless the repo is cleaned: https://github.com/pandas-dev/pandas/blob/00af20a22c6c64ad2d8f48345beaede0e946d630/pandas/io/formats/format.py#L1056 – int_ua Jun 21 '21 at 15:36
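For the use case in the comments (turning a failure into a specific user-facing message), a minimal sketch might look like the one below. The `MESSAGES` table, `friendly_message` and `read_text` are hypothetical names made up for illustration, and the sketch uses only built-in exception types so it runs without pandas. pandas-specific classes such as `pd.errors.ParserError` or `pd.errors.EmptyDataError` could be added to the table in the same way, ahead of `ValueError`.

```python
# Hypothetical table mapping exception types to user-facing messages.
# Order matters: a subclass must come before its base class
# (UnicodeDecodeError is a subclass of ValueError).
MESSAGES = [
    (FileNotFoundError, "The file could not be found."),
    (PermissionError, "The file cannot be read (permission denied)."),
    (UnicodeDecodeError, "The file is not valid text; is it really a CSV?"),
    (ValueError, "The file content could not be interpreted."),
]


def friendly_message(exc):
    """Return the first matching message, or a generic one naming the type."""
    for exc_type, message in MESSAGES:
        if isinstance(exc, exc_type):
            return message
    return f"Unexpected error ({type(exc).__name__}): {exc}"


def read_text(path):
    """Stand-in for the real loader (e.g. pd.read_csv) in this sketch."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


try:
    read_text("no_such_file.csv")
except Exception as exc:
    print(friendly_message(exc))
```

The fallback branch still reports the exception's class name, so an unexpected type shows up in the message and can be added to the table later.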

4 Answers

3

You can see the pandas-specific exceptions in the following file:

Python > 3.8 > lib > python > site-packages > pandas > errors > __init__.py

The exceptions and warnings defined there are:

  • IntCastingNaNError
  • NullFrequencyError
  • PerformanceWarning
  • UnsupportedFunctionCall
  • ParserError
  • DtypeWarning
  • EmptyDataError
  • ParserWarning
  • MergeError
  • AccessorRegistrationWarning
  • AbstractMethodError
yesidays
1

A complete list and explanation can be found in the pandas source code.

Desi Pilla
1

If you import pandas and use the dir function, you can view all of the exceptions that pandas itself defines. They are contained in the pandas.errors submodule.

In [1]: import pandas as pd

In [2]: ([e for e in dir(pd.errors) if "__" not in e])
Out[2]: 
['AbstractMethodError',
'AccessorRegistrationWarning',
'DtypeWarning',
'DuplicateLabelError',
'EmptyDataError',
'IntCastingNaNError',
'InvalidIndexError',
'MergeError',
'NullFrequencyError',
'NumbaUtilError',
'OptionError',
'OutOfBoundsDatetime',
'OutOfBoundsTimedelta',
'ParserError',
'ParserWarning',
'PerformanceWarning',
'UnsortedIndexError',
'UnsupportedFunctionCall']

You can use these for exception handling, for example:

import pandas as pd
import datetime

bad_date = datetime.date(22, 10, 4)

try:
    pd.to_datetime(bad_date)
except pd.errors.OutOfBoundsDatetime:
    # do some error handling here
    # or re-raise the error
    print("there was a bad date")
    raise
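The `dir` listing above mixes warnings with errors and does not show what each class inherits from, which matters when deciding what to catch. In the pandas source, for instance, `ParserError` and `EmptyDataError` both derive from `ValueError`, so a plain `except ValueError` catches them too. Below is a rough sketch of a helper that separates the two groups and records the direct base classes. `exception_classes` is a made-up name, and the demo runs on the standard-library `json` module only so that it works without pandas installed; passing `pd.errors` instead should give the pandas list.

```python
import inspect
import json  # stand-in module for the demo; pass pd.errors for pandas


def exception_classes(module):
    """Split a module's public exception classes into errors and warnings.

    Returns two dicts mapping class name -> list of direct base class names.
    """
    errors, warnings_ = {}, {}
    for name, obj in inspect.getmembers(module, inspect.isclass):
        if name.startswith("_") or not issubclass(obj, BaseException):
            continue
        bases = [base.__name__ for base in obj.__bases__]
        target = warnings_ if issubclass(obj, Warning) else errors
        target[name] = bases
    return errors, warnings_


errors, warns = exception_classes(json)
for name, bases in errors.items():
    print(f"{name} (subclass of {', '.join(bases)})")
```

Note that this only finds classes the module defines or re-exports. It says nothing about built-in exceptions (such as `FileNotFoundError` in the question) that a function may also raise.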
Jon
-8

This is a way to catch all exceptions:

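try:
    int("test")  # raises a ValueError
except BaseException as e:
    # BaseException also catches KeyboardInterrupt and SystemExit;
    # prefer `except Exception` unless you really need those too.
    print('The exception: {}'.format(e))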

If you really want to find out the possible exceptions of read_csv, you can look at the source code.
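The catch-all above prints only the message, which does not tell you which class was raised. A small variant that also reports the fully qualified class name and its base classes can help you discover, while testing, which exception types a call actually produces. `describe_exception` is a hypothetical helper name, and this is a sketch rather than a recommended production pattern.

```python
def describe_exception(exc):
    """Return (qualified class name, list of base class names) for exc."""
    cls = type(exc)
    name = f"{cls.__module__}.{cls.__qualname__}"
    # Walk the method resolution order, skipping the class itself and object.
    bases = [base.__name__ for base in cls.__mro__[1:] if base is not object]
    return name, bases


try:
    int("test")  # raises a ValueError
except Exception as exc:
    name, bases = describe_exception(exc)
    print(f"{name}: {exc}")
    print("bases:", " -> ".join(bases))
```

Once the concrete type is known, it can be given its own `except` clause, leaving the generic handler for genuinely unexpected cases.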

Rick