0

I have a data frame where there's multiple options for a certain index (1-M relationship) - e.g. States as Index and Counties as respective columns. I want to group it in a way that creates just one column but with all the values. This is a basic transformation but somehow I can't get it right.

Sorry I don't know how to insert code that actually is already run so here I present the code to create example DFs as to what I'd like to create.

pd.DataFrame({'INDEX': ['INDEX1','INDEX2','INDEX3'],
                    'col1': ['a','b','d'],
                   'col2': ['c','f',np.nan],
                   'col3': ['e',np.nan,np.nan]})

and I want it to transform it so that I end up with this data frame:

pd.DataFrame({'INDEX': ['INDEX1','INDEX1','INDEX1','INDEX2','INDEX2','INDEX3'],
                    'col1': ['a','c','e','b','f','d']})
JachymDvorak
  • 129
  • 8

1 Answers1

0

You can use melt here:

df = pd.melt(df, id_vars=['INDEX']).drop(columns=['variable']).dropna()
print(df)

    INDEX value
0  INDEX1     a
1  INDEX2     b
2  INDEX3     d
3  INDEX1     c
4  INDEX2     f
6  INDEX1     e
NYC Coder
  • 7,424
  • 2
  • 11
  • 24