I am writing a Python script to import a large number of files, manipulate them, and then output them to .csv. That is a piece of cake in pandas; however, I have no control over the incoming files, so I am trying to make the script handle files that come in the "wrong" way.

Anyway, I am using a try/except to show the user that there is a KeyError in one of the files (basically, there is a " in a cell when the datatype is int).

My question is: is there a way to have the except block report the name of the file that caused the error?

for csv in csvList:
        df = pd.read_csv(csv, header=0, skip_blank_lines=True, skipinitialspace=True)\
            .dropna(how='all')

        try:
            df[0] = df[0].astype(int)
            df[1] = df[1].astype(int)
            df[2] = df[2].astype(int)
            df[3] = df[3].astype(int)

            report_path = 'UPC_Ready_for_Import'
            if not os.path.exists(report_path):
                os.makedirs(report_path)

            df.to_csv(os.path.join(report_path, csv + '_import.csv'), index=False)

        except KeyError:
            print('Error within file, please review files')
Sam
  • Share the code, please. Don't leave it all to the imagination of others. – vahdet Jul 02 '18 at 11:44
  • Portion shared, not sure how much that will help. – Sam Jul 02 '18 at 11:46
  • Possible duplicate of [Extract traceback info from an exception object](https://stackoverflow.com/questions/11414894/extract-traceback-info-from-an-exception-object) – ElmoVanKielmo Jul 02 '18 at 11:46
  • Your `csv` variable should hold the filename. – hellow Jul 02 '18 at 11:48
  • Why not use a print statement with the file name (`csv`, I assume)? Or write it to a .txt file or whatever. When an exception is raised, the whole `except` block is executed. – Mathieu Jul 02 '18 at 11:48
  • As it is in a for loop, you just need the current file in the `except` block, like `print(csv)` here. – vahdet Jul 02 '18 at 11:48
  • @hellow, yes, but how do I get it to return just the name of the csv throwing the error? – Sam Jul 02 '18 at 11:50
  • @vahdet I tried that under the `except KeyError` and it returned the file list, not the name of the file where the error occurred. – Sam Jul 02 '18 at 11:51
  • Are you sure you used `print(csv)` and not `print(csvList)` inside the except block? – jedwards Jul 02 '18 at 11:53
  • Can you give an example of ``csv`` and ``csvList``? If you add ``print(csv)``, does it print all paths at once or one per line? – MisterMiyagi Jul 02 '18 at 12:03

2 Answers

Assuming csvList contains a list of input file paths:

for csv in csvList:
    ....
    try:
        ...
    except KeyError:
        print('Error within file {}, please review files'.format(csv))
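
For reference, here is a minimal, self-contained sketch of the same idea with the elided parts filled in. The function name convert_files, the loop variable path, and the returned list of failed paths are hypothetical, not from the question. It also assumes the first four columns are meant to be selected by position, and it catches ValueError as well, since that is what astype(int) raises for a cell that cannot be parsed as an integer.

```python
import os

import pandas as pd


def convert_files(paths, report_path='UPC_Ready_for_Import'):
    """Convert each CSV in `paths`; return the paths of the files that failed.

    Hypothetical helper for illustration only.
    """
    os.makedirs(report_path, exist_ok=True)
    failed = []
    for path in paths:  # `path` holds exactly one file name per iteration
        try:
            df = pd.read_csv(path, header=0, skip_blank_lines=True,
                             skipinitialspace=True).dropna(how='all')
            # Assumption: convert the first four columns, chosen by position.
            for col in df.columns[:4]:
                df[col] = df[col].astype(int)
            out_name = os.path.basename(path) + '_import.csv'
            df.to_csv(os.path.join(report_path, out_name), index=False)
        except (KeyError, ValueError) as exc:
            # Only the file currently being processed is reported here.
            print('Error within file {}: {!r}'.format(path, exc))
            failed.append(path)
    return failed
```

Returning the failed paths lets the caller print one summary at the end instead of scanning the output for individual messages.
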
running.t
  • Works, but it prints all of the CSV file names, not just the one causing the error. – Sam Jul 02 '18 at 11:53
  • That probably means all of your CSV files are botched. – rst-2cv Jul 02 '18 at 11:54
  • How can that be? The `csv` variable always contains just a single file. This is what I suggested printing :) – running.t Jul 02 '18 at 11:55
  • Thanks @ResetACK, but I did test that theory before coming to this site. – Sam Jul 02 '18 at 11:56
  • @Sam, like running.t said, `csv` will only ever contain a single value - that's how `for` loops work in Python (or any language for that matter). If your whole list of files is being printed, then it means that each and every file in your list is throwing the exception. – rst-2cv Jul 02 '18 at 11:57

You could write something like this, I guess:

for csv in csvList:
        df = pd.read_csv(csv, header=0, skip_blank_lines=True, skipinitialspace=True)\
            .dropna(how='all')

        # Build the output path before the try block so that file_name
        # is always defined when the except block runs.
        report_path = 'UPC_Ready_for_Import'
        file_name = os.path.join(report_path, csv + '_import.csv')

        try:
            df[0] = df[0].astype(int)
            df[1] = df[1].astype(int)
            df[2] = df[2].astype(int)
            df[3] = df[3].astype(int)

            if not os.path.exists(report_path):
                os.makedirs(report_path)

            df.to_csv(file_name, index=False)

        except KeyError:
            print('Error within file', file_name, ', please review files')

The main idea is to store the file name in a variable file_name and use it in the except block.
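
A stripped-down sketch of that idea, without pandas, may make the scoping point clearer: a name used in the except block has to be bound before the try block starts, otherwise an early failure leaves it undefined or stale from the previous iteration. The helpers output_name and run_all and the process callback are hypothetical stand-ins for the real per-file work.

```python
import os


def output_name(csv_path, report_path='UPC_Ready_for_Import'):
    """Build the output path for one input file (hypothetical helper)."""
    return os.path.join(report_path, os.path.basename(csv_path) + '_import.csv')


def run_all(csv_list, process):
    """Call process(src, dst) for each file; collect a message per KeyError."""
    messages = []
    for src in csv_list:
        dst = output_name(src)  # bound before `try`, so `except` can rely on it
        try:
            process(src, dst)
        except KeyError as exc:
            messages.append(
                'Error within file {} (output {}): {!r}'.format(src, dst, exc))
    return messages
```

Passing the per-file work in as a function keeps the error reporting in one place, whatever the processing step does.
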

mgross