1

I have a dataframe like this:

ID   Performed Time  Reported Time
101  13:05.          15.02.
121  14.05.          16.10.
101  14.20.          15.02.

I want to filter out duplicate rows where both the ID and the Reported Time are the same, i.e. the resulting dataframe should be:

ID   Reported Time
101  15.02.
121  16.10.

I tried using groupby to no avail.

dratoms

3 Answers

0
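import pandas as pd

d = {'ID': ['13', '13', '23', '24'], 'Reported_time': ['13.22.57', '13.22.57', '13.23.44', '13.24.01']}
df = pd.DataFrame(data=d)
# Group by both columns so each distinct (ID, Reported_time) pair appears once in the result's index
df2 = df.groupby(['ID', 'Reported_time']).nunique()
df2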


Ran A
  • Thank you so much, this works. However, I have other columns in the dataframe, and the values in those columns are shown as Boolean in the resulting df. I would like to retain the original values and also the original index. – dratoms Mar 28 '22 at 17:53
0

You just need distinct():

>>> from datar.all import f, tibble, distinct
>>> df = tibble(
...     ID=[101, 121, 101],
...     **{
...         "Performed Time": ["13:05.", "14.05.", "14.20."],
...         "Reported Time": ["15.02.", "16.10.", "15.02."]
...     }
... )
>>> 
>>> df >> distinct(f.ID, f["Reported Time"])
       ID Reported Time
  <int64>      <object>
0     101        15.02.
1     121        16.10.

I am the author of datar, a grammar of data manipulation in Python that wraps the pandas API and now also supports modin.

Panwen Wang
0
df[["ID", "Reported Time"]].drop_duplicates()
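If the other columns and the original index need to be kept, as the comment on the first answer asks, one option is to pass the two key columns via subset= instead of selecting them first. The following is only a sketch: the "Operator" column and its values are made up for illustration, and keep="first" is an assumption about which of the duplicate rows should survive.

```python
import pandas as pd

# Sample data from the question; "Operator" is a hypothetical extra column
# added only to show that non-key columns keep their original values.
df = pd.DataFrame(
    {
        "ID": [101, 121, 101],
        "Performed Time": ["13:05.", "14.05.", "14.20."],
        "Reported Time": ["15.02.", "16.10.", "15.02."],
        "Operator": ["a", "b", "c"],
    }
)

# Only ID and Reported Time are compared; the first row of each
# duplicate group is kept along with all of its columns and its index label.
deduped = df.drop_duplicates(subset=["ID", "Reported Time"], keep="first")

# Equivalent form using a boolean mask, in case the mask itself is useful.
mask = ~df.duplicated(subset=["ID", "Reported Time"], keep="first")
same = df[mask]

print(deduped)
```

Because no groupby or aggregation is involved, the surviving rows are the original rows, so their values and index labels are unchanged. Use keep="last" to retain the last occurrence instead.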
joanis
scapula13