I have a lot of data spread across several Excel files. I would like to concatenate them into a single Excel file, removing duplicate records based on the id column.
df1
   id name  date
0   1  cab  2017
1  11  den  2012
2  13  ers  1998
df2
   id name  date
0  11  den  2012
1  14  ces  2011
2   4  guk  2007
I want to end up with the concatenated file below.
Concat df
   id name  date
0   1  cab  2017
1  11  den  2012
2  13  ers  1998
1  14  ces  2011
2   4  guk  2007
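For anyone who wants to reproduce this, the two sample frames can be rebuilt in memory as in the sketch below. It assumes the columns are plain integers and strings (the real files are read from Excel, so their dtypes may differ), and `shared_ids` is just an illustrative name for the ids that appear in both frames.

```python
import pandas as pd

# Sample data copied from the tables above; dtypes are an assumption.
df1 = pd.DataFrame({
    "id": [1, 11, 13],
    "name": ["cab", "den", "ers"],
    "date": [2017, 2012, 1998],
})
df2 = pd.DataFrame({
    "id": [11, 14, 4],
    "name": ["den", "ces", "guk"],
    "date": [2012, 2011, 2007],
})

# ids present in both frames, i.e. the records that should appear only once
shared_ids = df1.loc[df1["id"].isin(df2["id"]), "id"].tolist()
print(shared_ids)
```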
I tried the code below, but it does not remove the duplicates. Can anyone advise how to fix this?
pd.concat([df1,df2]).drop_duplicates().reset_index(drop=True)
My concatenated data is shown below. The duplicated ids are still in the file.
                    id           created_at  retweet_count
0   721557296757797000  2016-04-17 04:34:00             21
1   721497712726844000  2016-04-17 00:37:14             94
2   721462059515453000  2016-04-16 22:15:33              0
3   721460623285072000  2016-04-16 22:09:51              0
4   721460397241446000  2016-04-16 22:08:57              0
5   721459817651577000  2016-04-16 22:06:39              0
6   721456334894469000  2016-04-16 21:52:48              0
7   721557296757797000  2016-04-17 04:34:00             21
8   721497712726844000  2016-04-17 00:37:14             94
9   721462059515453000  2016-04-16 22:15:33              0
10  721460623285072000  2016-04-16 22:09:51              0
11  721460397241446000  2016-04-16 22:08:57              0
12  721459817651577000  2016-04-16 22:06:39              0
13  721456334894469000  2016-04-16 21:52:48              0
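A possible explanation, sketched below under assumptions: drop_duplicates() with no arguments compares whole rows, so two rows that print identically are both kept if any column differs in type between files (for example, created_at read as datetimes in one file and as text in another). The frames `a` and `b` are hypothetical stand-ins for two of the Excel files, and the dtype mismatch is a guess, not something confirmed by the data above. Passing subset="id" restricts the comparison to the id column. Note also that drop_duplicates returns a new DataFrame, so the result has to be assigned to a variable.

```python
import pandas as pd

# Hypothetical stand-in for one Excel file (three rows from the table above).
a = pd.DataFrame({
    "id": [721557296757797000, 721497712726844000, 721462059515453000],
    "created_at": pd.to_datetime([
        "2016-04-17 04:34:00",
        "2016-04-17 00:37:14",
        "2016-04-16 22:15:33",
    ]),
    "retweet_count": [21, 94, 0],
})

# Assumption: the second file holds the same rows, but created_at came in as text.
b = a.assign(created_at=a["created_at"].astype(str))

combined = pd.concat([a, b])

# Whole-row comparison: a datetime and a string never compare equal,
# so no row is treated as a duplicate.
by_row = combined.drop_duplicates().reset_index(drop=True)

# Comparison on the id column only; the first occurrence of each id is kept.
by_id = combined.drop_duplicates(subset="id", keep="first").reset_index(drop=True)

print(len(by_row), len(by_id))
```

If the ids themselves have different types across files (say integers in one and strings in another), subset="id" alone would not help either; checking `combined.dtypes` and converting the id column to one type before deduplicating would be the first thing to try.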