I have Excel files of city addresses that I need to combine and then check for duplicates. I got to the point where I can find all the duplicates in one Excel file, which was easy. But I have to change the city name in the code to process each file. How do I process every file without changing the city in the code each time, and then save the result? I can also merge the files, but I can't figure out why the merged data creates its own three columns instead of lining up with the 'A', 'B' & 'C' columns already there. Maybe pandas isn't the best library for this and a better one can be suggested.
import os
import pandas as pd
file_df = pd.read_excel("Kermit.xlsx")
# Flag every row whose Address appears more than once (keep=False marks all copies)
dupes = file_df.duplicated(subset="Address", keep=False)
file_df.drop_duplicates(subset="Address", inplace=True)
# to_excel returns None, so assigning its result (City = ...) does nothing useful
file_df.to_excel("Kermit2.xlsx", index=False)
# path = os.getcwd()
# files = os.listdir(path)
# print(files)
# files_xlsx = [f for f in files if f.endswith(".xlsx")]
# print(files_xlsx)
# frames = []
# for f in files_xlsx:
#     data = pd.read_excel(f, sheet_name="Sheet1")
#     frames.append(data)
# # DataFrame.append was removed in pandas 2.0; use pd.concat instead
# df = pd.concat(frames, ignore_index=True)