I have a dataframe with 3 columns: id, single, age.

data = {
    'id': [1, 1, 1, 2, 2, 3, 3, 4],
    'single': ['y', '', '', '', 'n', 'n', '', 'y'],
    'age': ['', 22, '', 34, '', 22, '', 43]
}

Some rows for the same id have missing values (empty strings in this example), while others hold the actual data.

I want something like:

data = {
    'id': [1, 2, 3, 4],
    'single': ['y', 'n', 'n', 'y'],
    'age': [22, 34, 22, 43]
}

Is it possible?

Lucas

1 Answer

Yes. Replace the empty strings with np.nan first, then use groupby with first(), which returns the first non-null value of each column within each group.

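import numpy as np
# Empty strings become NaN so that first() skips them and keeps the first real value per id.
df.replace('', np.nan, inplace=True)
df_new = df.groupby('id', as_index=False).first()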
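
For reference, here is a self-contained sketch of the same approach, run against the sample data from the question. It assumes the gaps really are empty strings (not NaN already) and that each id has at most one real value per column. The final cast of age to int is my own addition, not something the question asked for.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; '' marks a missing value.
data = {
    'id': [1, 1, 1, 2, 2, 3, 3, 4],
    'single': ['y', '', '', '', 'n', 'n', '', 'y'],
    'age': ['', 22, '', 34, '', 22, '', 43],
}
df = pd.DataFrame(data)

# pandas does not treat '' as null, so convert it to NaN first.
df = df.replace('', np.nan)

# first() picks the first non-null value of each column within each id group.
df_new = df.groupby('id', as_index=False).first()

# Assumption: age should be a whole number once the gaps are gone.
df_new['age'] = df_new['age'].astype(int)

print(df_new)
```

If an id could carry several different non-null values in the same column, first() would silently keep only the earliest one, so it is worth checking for that before collapsing the rows.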
NYC Coder