I am working on a GUI that edits and compares CSV files for the monthly user uploads in my wellness data job. I already built this operation for one of our client businesses, and I am now reconfiguring it to work for another client. The code runs without errors, but my new_users dataframe comes back empty after the merge. The same merge logic works correctly in the package I built for the first client. My first thought was that the UniqueIDs need to be 10 characters, but even after padding them to that length, new_users is still empty. This has me stumped.
import pandas as pd

def client_merge(ef_in, ul_in):
    pd.set_option('mode.chained_assignment', None)
    # Cast key and date-like columns to object dtype in both inputs
    ef_in['UniqueID'] = ef_in['UniqueID'].astype(object)
    ef_in['HireDate'] = ef_in['HireDate'].astype(object)
    ef_in['DateOfBirth'] = ef_in['DateOfBirth'].astype(object)
    ul_in['UniqueID'] = ul_in['UniqueID'].astype(object)
    ul_in['Action'] = ul_in['Action'].astype(object)
    ul_in['ZipCode'] = ul_in['ZipCode'].astype(object)
    # Stack both files and keep only the UniqueIDs that appear exactly once
    df = pd.concat(([ef_in, ul_in]), axis=0, ignore_index=True, sort=False)
    df.drop_duplicates(subset=["UniqueID"], keep=False, inplace=True)
    # Left-pad UniqueID with zeros to 10 characters
    df['UniqueID'] = df['UniqueID'].str.rjust(10, "0")
    ef_in['UniqueID'] = ef_in['UniqueID'].str.rjust(10, "0")
    ul_in['UniqueID'] = ul_in['UniqueID'].str.rjust(10, "0")
    print(ef_in)
    # With no `on` argument, merge joins on every column the two frames share
    new_users = df.merge(ef_in)
    #print(new_users)
    disable_users = df.merge(ul_in)
    #print(ul_in)
    disable_users['Action'].fillna('Disable', inplace=True)
    ready_to_print_file = pd.concat([new_users, disable_users], ignore_index=False)
    # Drop test accounts before writing the upload file
    rtpf1 = ready_to_print_file[ready_to_print_file["FirstName"].str.contains("companytest") == False]
    # Regex alternation: drop rows whose FirstName contains "Clarks" or "test"
    rtpf2 = rtpf1[rtpf1["FirstName"].str.contains("Clarks|test") == False]
    # `path` is expected to be defined outside this function
    rtpf2.to_csv(path, header=True, index=False)
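One thing that may be worth checking in the padding step above: `.astype(object)` only changes the column's dtype and does not convert values to strings, and the `.str` accessor returns NaN for any element that is not a string. The sketch below uses made-up IDs and a hypothetical helper named `normalize_ids`; whether the real UniqueID column actually mixes strings and numbers is an assumption, not something the post confirms.

```python
import pandas as pd

def normalize_ids(ids: pd.Series, width: int = 10) -> pd.Series:
    """Hypothetical helper: turn every ID into a zero-padded string."""
    # astype(str) converts each value to text; astype(object) would not.
    return ids.astype(str).str.strip().str.rjust(width, "0")

# Made-up sample that mixes strings and an integer (an assumption).
raw = pd.Series(["00123", 456, " 789 "], dtype=object)

# Padding directly: the integer element becomes NaN instead of a padded ID.
padded_directly = raw.str.rjust(10, "0")

# Padding after an explicit string conversion keeps every row.
padded_after_cast = normalize_ids(raw)

print(padded_directly.isna().tolist())
print(padded_after_cast.tolist())
```

If the IDs were read as floats, `astype(str)` would produce text such as "456.0", so reading the column with `dtype=str` in `pd.read_csv` may be a cleaner starting point.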
I have spent two hours cross-referencing a hand-done comparison in Excel against my file, and new_users should definitely not be coming back empty. The working code for my other client is below:
def client_merge(ef_in, ul_in):
    pd.set_option('mode.chained_assignment', None)
    ef_in['EmployeeId'] = ef_in['EmployeeId'].astype(object)
    ul_in['EmployeeId'] = ul_in['EmployeeId'].astype(object)
    ul_in['Action'] = ul_in['Action'].astype(object)
    ul_in['PrimaryMemberEmployeeId'] = ul_in['PrimaryMemberEmployeeId'].astype(object)
    ul_in['ZipCode'] = ul_in['ZipCode'].astype(object)
    df = pd.concat(([ef_in, ul_in]), axis=0, ignore_index=True, sort=False)
    df.drop_duplicates(subset=["EmployeeId"], keep=False, inplace=True)
    new_users = df.merge(ef_in)
    disable_users = df.merge(ul_in)
    disable_users['Action'].fillna('Disable', inplace=True)
    ready_to_print_file = pd.concat([new_users, disable_users], ignore_index=False)
    rtpf1 = ready_to_print_file[ready_to_print_file["FirstName"].str.contains("admin") == False]
    rtpf2 = rtpf1[rtpf1["FirstName"].str.contains("client") == False]
    rtpf2.to_csv(path, header=True, index=False)
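Both versions call `merge` without an `on` argument, so pandas joins on every column the two frames have in common, and a mismatch in any one of those columns can drop a row. A possible way to narrow this down is to join on the key alone and ask pandas to label where each row came from. This is a minimal sketch with made-up data, not the original pipeline; the sample frames and the `new_ids` / `disable_ids` names are assumptions for illustration.

```python
import pandas as pd

# Made-up stand-ins for the eligibility file and the current user list.
ef_in = pd.DataFrame({"UniqueID": ["0000000001", "0000000002"],
                      "FirstName": ["Ann", "Bob"]})
ul_in = pd.DataFrame({"UniqueID": ["0000000002", "0000000003"],
                      "FirstName": ["Bob", "Cy"]})

# Join on the key only; indicator=True adds a _merge column whose values
# are "left_only", "right_only", or "both".
matched = ef_in[["UniqueID"]].merge(
    ul_in[["UniqueID"]], on="UniqueID", how="outer", indicator=True
)

# IDs present only in the eligibility file are candidates for new users;
# IDs present only in the user list are candidates to disable.
new_ids = matched.loc[matched["_merge"] == "left_only", "UniqueID"]
disable_ids = matched.loc[matched["_merge"] == "right_only", "UniqueID"]

new_users = ef_in[ef_in["UniqueID"].isin(new_ids)]
disable_users = ul_in[ul_in["UniqueID"].isin(disable_ids)]

print(matched)
```

Printing `matched["_merge"].value_counts()` on the real files would show at a glance whether any keys land in `left_only` at all, which separates a key-formatting problem from a problem in the other shared columns.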