
I'm trying to clean up some data.

The dataframe currently looks something like this:

    id  data data2
0   12  NaN  50.0
1   12  a    50.0
2   12  a    NaN
3   52  b    NaN
4   52  NaN  20.0
5   52  NaN  20.0

I'd like to collapse the rows so that there is one row per id, dropping the duplicate entries and keeping only the valid (non-NaN) value in each column, to end up with:

    id  data data2
0   12  a    50
1   52  b    20
velxundussa

1 Answer


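Group on id and take the first value in each column. GroupBy.first() returns the first non-null entry per column within each group, so the NaNs are skipped: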

df.groupby('id', as_index=False).first()

Output:

    id  data    data2
0   12  a      50.0
1   52  b      20.0
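
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The frame is rebuilt by hand from the sample in the question, and the variable names df and result are only illustrative:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (an assumption about the real frame's dtypes).
df = pd.DataFrame({
    "id": [12, 12, 12, 52, 52, 52],
    "data": [np.nan, "a", "a", "b", np.nan, np.nan],
    "data2": [50.0, 50.0, np.nan, np.nan, 20.0, 20.0],
})

# first() picks the first non-null value of each column within each id group;
# as_index=False keeps id as a regular column instead of the index.
result = df.groupby("id", as_index=False).first()
print(result)
```

Note that first() simply keeps the first non-null value it meets, so if a group held two different non-null values in the same column, the later one would be dropped silently. data2 also stays float because the original column contained NaN; if integers are wanted, result["data2"].astype(int) should work once no NaN remains.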
harvpan