
I have three different sets of files to import into data frames, which will ultimately be inserted into MS SQL tables. The files may or may not have header records. Before inserting, I can hardcode the data frame columns to match the table columns and then load the data into the tables.

My files have already been transferred via SFTP and are available in a Windows directory, with three different naming conventions.

I tried several of the options posted here (Import multiple csv files into pandas and concatenate into one DataFrame), but none of them worked for my needs.

import glob
import pandas as pd

path = WorkDir
mypattern = "\\" + "*Category.csv"
print(WorkDir + mypattern)
allFiles = glob.glob(WorkDir + mypattern)
np_arr_list = []
for file_ in allFiles:
    print(file_)
    df = pd.read_csv(file_, index_col=None, header=0)
    np_arr_list.append(df)

big_frame = pd.concat(np_arr_list, ignore_index=True)

I will repeat the same for the other two file types, *CategoryRelations.csv and *CategorProdRelations.

How can I put this in a function to which I pass the path and the file naming pattern, and which returns the concatenated data frame?
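For illustration, a minimal sketch of what such a function might look like. The name `load_csv_pattern` and its signature are hypothetical, not something from my script; it assumes every matching file can be read by `pd.read_csv` with the same options, and it raises a clearer error than `pd.concat` when nothing matches.

```python
import glob
import os

import pandas as pd


def load_csv_pattern(path, pattern, **read_csv_kwargs):
    """Read every file in `path` matching `pattern` into one DataFrame.

    Hypothetical helper: the name and signature are illustrative only.
    Extra keyword arguments are passed through to pandas.read_csv.
    """
    # os.path.join supplies the separator instead of hand-building "\\".
    full_pattern = os.path.join(path, pattern)
    matches = sorted(glob.glob(full_pattern))
    if not matches:
        # Fail early with the pattern that was tried, rather than letting
        # pd.concat raise "No objects to concatenate" on an empty list.
        raise FileNotFoundError(f"no files match {full_pattern!r}")
    frames = [pd.read_csv(name, **read_csv_kwargs) for name in matches]
    return pd.concat(frames, ignore_index=True)
```

Under these assumptions, a call such as `load_csv_pattern(WorkDir, "*Category.csv", header=0)` would replace the loop, and files without a header record could be read by passing `header=None` together with `names=[...]`.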

I will then add columns to the data frame for each file type separately before inserting it into the table.

With the code above, I get the following error:

E:\ETL\Python\Client\WORKCAT\*Category.csv
    Traceback (most recent call last):
      File "c:/Users/marunachalam/Downloads/FTPGetFiles.py", line 84, in <module>
        big_frame = pd.concat(np_array_list, ignore_index=True)
      File "C:\Users\marunachalam\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\reshape\concat.py", line 255, in concat
        sort=sort,
      File "C:\Users\marunachalam\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\reshape\concat.py", line 304, in __init__
        raise ValueError("No objects to concatenate")
    ValueError: No objects to concatenate
    PS C:\Users\marunachalam>
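`pd.concat` raises "No objects to concatenate" when the list it receives is empty, which here would mean `glob.glob` matched no files. One way to narrow that down is to list the directory and see which names the pattern does and does not match. The helper below is a hypothetical debugging sketch, not part of the script above. It uses `fnmatch`, which applies the same wildcard rules as `glob`, except that `glob` skips names starting with a dot.

```python
import fnmatch
import os


def explain_glob(path, pattern):
    """Return (matched, unmatched) file names in `path` for `pattern`.

    Hypothetical debugging helper; the name is illustrative only.
    """
    if not os.path.isdir(path):
        # A wrong or mistyped directory also makes glob return nothing.
        raise NotADirectoryError(f"{path!r} is not an existing directory")
    names = sorted(os.listdir(path))
    # fnmatch.fnmatch follows the platform's case rules for file names.
    matched = [n for n in names if fnmatch.fnmatch(n, pattern)]
    unmatched = [n for n in names if n not in matched]
    return matched, unmatched
```

If the first list comes back empty while the second contains the expected files, the pattern (for example a different suffix or extension) is the likely culprit rather than the directory.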

I am not sure how to achieve this, and the number of records can go up to 1 or 2 million for each type.
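If holding all rows in memory at once turns out to be a problem (this depends on how wide the rows are and how much memory is available), one possible approach is to read the files in chunks and process each chunk separately. The generator below is a sketch under that assumption; the name `iter_csv_chunks` and the default chunk size are arbitrary choices for illustration.

```python
import pandas as pd


def iter_csv_chunks(file_names, chunksize=100_000, **read_csv_kwargs):
    """Yield DataFrames of at most `chunksize` rows from each file in turn.

    Hypothetical helper; the name and default chunk size are illustrative.
    """
    for name in file_names:
        # With chunksize set, read_csv returns an iterator of DataFrames
        # instead of loading the whole file at once.
        for chunk in pd.read_csv(name, chunksize=chunksize, **read_csv_kwargs):
            yield chunk
```

Each chunk could then have its extra columns added and be appended to the table, for example with `DataFrame.to_sql(..., if_exists="append")`, instead of building one large frame first.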

Thank you

Mani A
  • The error says there are no objects to concatenate. Check that np_arr_list is not empty. You could also check the output of print(file_): does it match what you want? When you run pd.read_csv(file_), do you get anything? – sammywemmy Feb 06 '20 at 01:56
  • Thank you! np_arr_list is defined only in this step and is not used before it. Also, print(file_) does not print anything, although I can see the files with valid data in the specified directory – Mani A Feb 06 '20 at 02:13
  • _the np_arr_list is defined in this step only and not used before._ Can you check the contents of `np_arr_list` just before the `concat()`? Also, variable and function names should follow the `lower_case_with_underscores` style. – AMC Feb 06 '20 at 02:19
  • If that is the case, check your glob setup for the allFiles variable and ensure you can read in all the files – sammywemmy Feb 06 '20 at 02:19
