3

I have been trying to melt this columns

d = {'key': [1,2,3,4,5], 'a': ['None','a', 'None','None','None'], 'b': ['None','None','b','None','None'],'c':['None','None','None','c','c']}

df = pd.DataFrame(d)

I need to look like this

key letter
1 None
2 a
3 b
4 c
5 c

I tried:

df = pd.melt(df,id_vars=['key'], var_name = 'letters')

but i got:

key letters value
1 a None
2 a a
3 a None
4 a None
5 a None
1 b None
2 b None
3 b b
4 b None
5 b None
1 c None
2 c None
3 c None
4 c c
5 c c
Stephenie
  • 31
  • 3

1 Answers1

2

If need get first non None value per rows after key column use DataFrame.set_index with replace possible None strings with back filling missing values and selected first column by position, last use Series.reset_index:

df = (df.set_index('key')
        .replace('None', np.nan)
        .bfill(axis=1)
        .iloc[:, 0]
        .reset_index(name='letter')))
print (df)
   key letter
0    1    NaN
1    2      a
2    3      b
3    4      c
4    5      c

If possible multiple non None value per rows use:

d = {'key': [1,2,3,4,5], 
     'a': ['None','a', 'None','None','None'], 
     'b': ['None','b','b','None','None'],
     'c':['None','None','None','c','c']}

df = pd.DataFrame(d)

df = (df[['key']].join(df.set_index('key')
        .replace('None', np.nan)
        .stack()
        .groupby(level=0)
        .agg(','.join)
        .rename('letter'), on='key'))
print (df)
   key letter
0    1    NaN
1    2    a,b
2    3      b
3    4      c
4    5      c

Or:

df = (df.set_index('key')
        .replace('None', np.nan)
        .apply(lambda x: ','.join(x.dropna()), axis=1)
        .replace('', np.nan)
        .reset_index(name='letter'))

print (df)
   key letter
0    1    NaN
1    2    a,b
2    3      b
3    4      c
4    5      c
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252