I have the following pandas DataFrame in Python:

import numpy as np
import pandas as pd
df1 = pd.DataFrame(np.array([['Jim', 1, 1, 1, 0], ['Jack', 1, 0, 0, 1], ['Joe', 0, 1, 1, 1]]),
                   columns=['name', 'slot1', 'slot2', 'slot3', 'slot4'])

I want to add a new column that lists, for each row, the names of the columns whose value is 0.

The desired outcome is the following:

df2 = pd.DataFrame(np.array([['Jim', 1, 1, 1, 0, 'slot4'], ['Jack', 1, 0, 0, 1, 'slot2,slot3'], ['Joe', 0, 1, 1, 1, 'slot1']]),
                   columns=['name', 'slot1', 'slot2', 'slot3', 'slot4', 'error'])

I tried the following, but neither attempt gives the desired outcome:

df1['error'] = df1.loc[:,"slot1":"slot4"].apply(lambda row: row[row == 0].index, axis=1)

df1['error'] = df1.loc[:,"slot1":"slot4"].isin(["0"]).idxmax(1)

Could you please suggest a solution for this task? Thank you in advance.

  • IIUC, `df1.mask(df1.ne('0')).stack().reset_index(1).groupby(level=0)['level_1'].agg(','.join)`. There are quite a few ways to do this and it has been answered before; let me find a dupe. Also: `df1.eq('0').dot(df1.columns + ';').str.rstrip(';')` – Umar.H Oct 06 '20 at 15:20
  • @Manakin your answer was exactly what I was looking for. Thank you! – Konstantinos Zeimpekis Oct 06 '20 at 15:31
  • no problemo, thanks for the well formatted question! – Umar.H Oct 06 '20 at 15:36
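
For anyone landing here later, a more explicit variant of the idea in the comments might look like the sketch below. It is not the commenter's exact one-liner: it swaps in a row-wise `apply` with `','.join`. It relies on one detail of the question's setup: `np.array` coerces the mixed list to strings, so the slot values are `'0'` and `'1'` rather than integers, which is likely why comparing against the integer `0` matched nothing. The names `slots` and `is_zero` are only illustrative.

```python
import numpy as np
import pandas as pd

# Same construction as in the question. np.array turns the mixed
# list into an array of strings, so each slot holds '0' or '1'.
df1 = pd.DataFrame(
    np.array([['Jim', 1, 1, 1, 0], ['Jack', 1, 0, 0, 1], ['Joe', 0, 1, 1, 1]]),
    columns=['name', 'slot1', 'slot2', 'slot3', 'slot4'],
)

# Restrict to the slot columns and compare against the string '0'.
slots = df1.loc[:, 'slot1':'slot4']
is_zero = slots.eq('0')

# For each row, keep the column labels where the mask is True
# and join them with commas.
df1['error'] = is_zero.apply(lambda row: ','.join(row[row].index), axis=1)

print(df1)
```

If the slot columns are converted to integers first (for example with `astype(int)`), the comparison would be `slots.eq(0)` instead.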
