I have code that reads all of the CSV files in a folder and stacks them on top of one another. For example, I have 5 files, and each file has the same column names. The code should then take one column (F_NAN_KEY) from the appended data and keep only the unique values in that column. Note: the sheet in each file has the same name as its file.
Code:
import os
import pandas as pd

RTDJuice2 = #path#
df = []
for file in os.listdir(RTDJuice2):
    if file.endswith('.csv'):
        print('Loading file {0}...'.format(file))
        df.append(pd.read_excel(os.path.join(RTDJuice2, file), sheet_name=file))
for i in range(0, len(df)):
    df_master = pd.concat(df[i], axis=0)
    if i == 0:
        finaldata2 = df_master
    else:
        finaldata2 = finaldata2.append(df_master)
finaldata2.to_csv('#output file#.csv', index=False)
uniqueNAN = finaldata2['F_NAN_KEY'].drop_duplicates()
uniqueNAN.to_excel('#outputfile2#.xlsx', index=False)
I am getting the error: "ValueError: Excel file format cannot be determined, you must specify an engine manually." I've tried closing all of the Excel files, but I still get the error message. I'm not sure how to go about this. Any help or a different approach would be much appreciated, thank you.
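For reference, here is a minimal sketch of the workflow described in the question, assuming the files really are plain-text CSV (in which case they have no sheets and pd.read_excel cannot detect an Excel format in them). The helper names combine_csv_folder and unique_keys, the temporary folder, and the demo data are all made up for illustration; only the F_NAN_KEY column name comes from the question.

```python
import os
import tempfile

import pandas as pd


def combine_csv_folder(folder):
    """Stack every .csv file in `folder` into one DataFrame (hypothetical helper)."""
    frames = []
    for name in sorted(os.listdir(folder)):
        if name.endswith('.csv'):
            # Assumption: these are plain-text CSV files, so read_csv is used
            # in place of read_excel(..., sheet_name=...).
            frames.append(pd.read_csv(os.path.join(folder, name)))
    # A single concat over the whole list replaces the per-file append loop.
    return pd.concat(frames, axis=0, ignore_index=True)


def unique_keys(data, column='F_NAN_KEY'):
    """Return the distinct values of one column as a Series (hypothetical helper)."""
    return data[column].drop_duplicates().reset_index(drop=True)


# Self-contained demo: two small made-up CSV files in a temporary folder.
demo_dir = tempfile.mkdtemp()
pd.DataFrame({'F_NAN_KEY': [1, 2], 'VALUE': [10, 20]}).to_csv(
    os.path.join(demo_dir, 'a.csv'), index=False)
pd.DataFrame({'F_NAN_KEY': [2, 3], 'VALUE': [30, 40]}).to_csv(
    os.path.join(demo_dir, 'b.csv'), index=False)

combined = combine_csv_folder(demo_dir)
keys = unique_keys(combined)
print(keys.tolist())
```

With real data, the results could then be written with combined.to_csv(...) and keys.to_excel(..., index=False); writing .xlsx requires an Excel writer package such as openpyxl to be installed. Two other details may matter here: pd.concat expects a list of DataFrames, not a single DataFrame, and DataFrame.append was removed in pandas 2.0, so a single pd.concat over the list is the safer choice.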