Merging pandas dataframe rows with similar values in more than one column

Question

I have a dataframe like this

ID	Performed Time	Reported Time
101	13:05.	15.02.
121	14.05.	16.10.
101	14.20.	15.02.

I want to filter out rows whose ID and Reported Time are both the same as another row's, i.e. the resulting dataframe should be

ID	Reported Time
101	15.02.
121	16.10.

I tried using groupby to no avail.

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.drop_duplicates.html — Emma, Mar 28 '22 at 17:17

score 0 · Answer 1 · answered Mar 28 '22 at 17:21

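import pandas as pd

d = {'ID': ['13', '13', '23', '24'], 'Reported_time': ['13.22.57', '13.22.57', '13.23.44', '13.24.01']}
df = pd.DataFrame(data=d)
# Group on both key columns: the result has one row per distinct (ID, Reported_time) pair
df2 = df.groupby(['ID', 'Reported_time']).nunique()
df2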

result:

answered Mar 28 '22 at 17:21 by Ran A

Thank you so much. This works. However I have other columns in the dataframe. And the values in those columns are shown as Boolean in the resultant df. I would want to retain the original values and also retain the original index. – dratoms Mar 28 '22 at 17:53
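One groupby-based way to address this follow-up is `head(1)`, which returns the first row of each group without aggregating, so the other columns keep their original values and the rows keep their original index labels. The following is a minimal sketch using the question's sample data; the variable names are illustrative and not from the thread.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [101, 121, 101],
    "Performed Time": ["13:05.", "14.05.", "14.20."],
    "Reported Time": ["15.02.", "16.10.", "15.02."],
})

# head(1) keeps the first row of every (ID, Reported Time) group.
# Unlike nunique(), it does not aggregate, so every column keeps its
# original values and the original index labels are preserved.
first_rows = df.groupby(["ID", "Reported Time"]).head(1)
print(first_rows)
```

Which row survives within a group depends on row order, so sort the dataframe first if a particular row (for example the earliest Performed Time) should be the one kept.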

score 0 · Answer 2 · answered Mar 28 '22 at 17:26

You just need distinct():

>>> from datar.all import f, tibble, distinct
>>> df = tibble(
...     ID=[101, 121, 101],
...     **{
...         "Performed Time": ["13:05.", "14.05.", "14.20."],
...         "Reported Time": ["15.02.", "16.10.", "15.02."]
...     }
... )
>>> 
>>> df >> distinct(f.ID, f["Reported Time"])
       ID Reported Time
  <int64>      <object>
0     101        15.02.
1     121        16.10.

I am the author of datar, a grammar of data manipulation in Python that wraps pandas APIs and now also supports modin.

score 0 · Answer 3 · edited Mar 28 '22 at 19:18

df[["ID", "Reported Time"]].drop_duplicates()

edited Mar 28 '22 at 19:18 by joanis
answered Mar 28 '22 at 17:59 by scapula13

Or `df.drop_duplicates(subset=['ID', 'Reported Time'])` – Emma Mar 28 '22 at 18:48
Thanks Emma and scapula. Emma's solution retains the rest of the columns. Thank you. – dratoms Mar 29 '22 at 07:05
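The difference between the two `drop_duplicates` variants above can be seen side by side in a short sketch. It assumes the question's sample data; the variable names and the `keep="last"` variation are illustrative additions, not something proposed in the thread.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [101, 121, 101],
    "Performed Time": ["13:05.", "14.05.", "14.20."],
    "Reported Time": ["15.02.", "16.10.", "15.02."],
})

# Selecting columns first: only the two key columns survive.
keys_only = df[["ID", "Reported Time"]].drop_duplicates()

# subset=: duplicates are judged on the key columns only,
# but every column of the surviving rows is kept.
all_cols = df.drop_duplicates(subset=["ID", "Reported Time"])

# keep="last" retains the last occurrence of each duplicate instead of the first.
last_rows = df.drop_duplicates(subset=["ID", "Reported Time"], keep="last")

print(keys_only)
print(all_cols)
print(last_rows)
```

In all three cases the surviving rows keep their original index labels; call `reset_index(drop=True)` afterwards if a fresh 0..n-1 index is preferred.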
