1

I have a dataframe like this:

A =   ID Material1 Materia2 Material3 
      14   0        0        0
      24   1        0        0 
      12   1        1        0
      25   0        0        2

I want to have all information in one column like this:

A =   ID  Materials
      14   Nan 
      24   Material1
      12   Material1 
      12   Material2
      25   Material3 
      25   Material3 

can anyone help write a function please !

chero
  • 61
  • 6

1 Answers1

0

Use DataFrame.melt with repeat rows by counts with Index.repeat and DataFrame.loc:

df1 = df.melt('ID', var_name='Materials')
df1 = df1.loc[df1.index.repeat(df1['value'])].drop('value', axis=1).reset_index(drop=True)
print (df1)
   ID  Materials
0  24  Material1
1  12  Material1
2  12   Materia2
3  25  Material3
4  25  Material3

EDIT: For add only 0 Materials with missing values use DataFrame.merge with left join by original df['ID'] in one column DataFrame withoiut duplications by DataFrame.drop_duplicates:

df1 = df.melt('ID', var_name='Materials')

df0 = df[['ID']].drop_duplicates()
print (df0)
   ID
0  14
1  24
2  12
3  25

df2 = df1.loc[df1.index.repeat(df1['value'])].drop('value', axis=1).reset_index(drop=True)

df2 = df0.merge(df2, on='ID', how='left')
print (df2)
   ID  Materials
0  14        NaN
1  24  Material1
2  12  Material1
3  12   Materia2
4  25  Material3
5  25  Material3
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • 1
    It worked fine , but how can i save the lines that have no materials , cauz now i only have ID with at least one material ? i Just edited my question – chero Apr 15 '20 at 09:47