I have two pandas dataframes that I wish to merge, which I can do using the following line: data_cords = data_0_0.merge(data, on="unique_id", how="left")
This gives the desired result, in that all the variables I want are present together in the data_cords df.
The problem is that my method creates many exact duplicate rows. To get my desired end product I use df = data_cords.drop_duplicates()
but all of this is very expensive memory-wise, which is an issue as I run the code on Google Colab. Is there a way I can do the merge without creating all the duplicate rows?
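For illustration, here is a minimal sketch of one thing that might avoid the blow-up, assuming the duplicates come from repeated rows in data (I have not confirmed that this is the cause). The toy frames and the columns seq, x and y are made up for the example; only data_0_0, data, data_cords and unique_id are from my real code. The idea is to de-duplicate the right-hand frame before merging, so the duplicate rows are never created:

```python
import pandas as pd

# Hypothetical toy data standing in for the real frames
# (the columns "seq", "x" and "y" are invented for this example).
data_0_0 = pd.DataFrame({"unique_id": [1, 2, 3], "seq": ["a", "b", "c"]})
data = pd.DataFrame({
    "unique_id": [1, 1, 2, 2, 3],
    "x": [10.0, 10.0, 20.0, 20.0, 30.0],
    "y": [0.5, 0.5, 1.5, 1.5, 2.5],
})

# Assumption: the duplicate rows in the merged result come from exact
# duplicate rows in `data`, so drop them before the merge instead of after.
data_unique = data.drop_duplicates()

# validate="many_to_one" makes pandas raise a MergeError if unique_id
# is still repeated in the right-hand frame after de-duplication.
data_cords = data_0_0.merge(
    data_unique, on="unique_id", how="left", validate="many_to_one"
)
print(data_cords)
```

If unique_id is still repeated in data after drop_duplicates() (same id, different values), the validate check raises an error rather than silently multiplying rows, which would at least show where the duplicates come from.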
I have inserted screenshots of each dataframe at the end of the question to add clarity. Apologies if this is the incorrect format; I am relatively new here.
The data_cords df ends up like this, with the desired columns added to the end of each sequence: