
I am trying to concatenate two Excel files that have the same column names, but new empty columns and blank cells are being added to the resulting Excel file, and I don't know why.

I used the pd.concat() function, which was supposed to combine the two files into a single sheet in a new file. However, when the table from the second file is appended to the table from the first, extra columns and blank cells appear in the merged file.

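import glob
import pandas as pd

# Collect every .xlsx file in the folder
file_list = glob.glob(path + "/*.xlsx")

# Read each workbook into its own DataFrame
dfs = [pd.read_excel(p) for p in file_list]
print(dfs[0].shape)

# Stack the frames; columns are aligned by name
res = pd.concat(dfs)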

That is a snippet of my code.

I also added a picture of the result I am getting for the new file now, which is wrong.

Ynjxsjmh
joe1234

1 Answer


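pd.concat aligns the data on column names, so it is not a plain positional concatenation. Check whether the column names are exactly the same across all your source files; even a trailing space or a difference in capitalization makes pandas treat them as different columns and fill the gaps with NaN. If they differ, you can normalize or rename them, or switch to a position-based format such as NumPy arrays.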
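As a rough sketch of that check, the example below uses two small in-memory DataFrames in place of your Excel files. The column names, the data, and the helper normalize_columns are all made up for illustration; the assumption is that your headers differ only by whitespace or capitalization, which may not be the actual cause in your files.

```python
import pandas as pd


def normalize_columns(df):
    """Return a copy with column labels stripped and lower-cased (hypothetical helper)."""
    out = df.copy()
    out.columns = [str(c).strip().lower() for c in out.columns]
    return out


# Stand-ins for the two workbooks; note the trailing space and different case
a = pd.DataFrame({"Name": ["x"], "Value": [1]})
b = pd.DataFrame({"name ": ["y"], "value": [2]})

# Without normalization the labels do not match, so extra columns full of NaN appear
raw = pd.concat([a, b], ignore_index=True)
print(raw.columns.tolist())
print(raw.shape)

# After normalization both frames share the same labels and stack cleanly
fixed = pd.concat([normalize_columns(a), normalize_columns(b)], ignore_index=True)
print(fixed.columns.tolist())
print(fixed.shape)
```

With your own data you could print each DataFrame's columns.tolist() right after pd.read_excel to see where the headers disagree before deciding how to rename them.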

Franco Milanese