
I have a dataframe with three columns:

 a b c
[1,0,2] 
[0,3,2] 
[0,0,2] 

and need to create a fourth column based on a hierarchy as follows:

If column a has a value, then column d = column a.

If column a has no value but b does, then column d = column b.

If columns a and b have no value but c does, then column d = column c.

 a b c d
[1,0,2,1] 
[0,3,2,3] 
[0,0,2,2] 

I'm quite a beginner at Python and have no clue where to start.

Edit: I have tried the following, but none of these attempts returns a value in column d if column a is empty or None:

# Attempt 1: compare against 0
df['d'] = df['a']
df.loc[df['a'] == 0, 'd'] = df['b']
df.loc[~df['a'].astype('bool') & ~df['b'].astype('bool'), 'd'] = df['c']

# Attempt 2: compare against None
df['d'] = df['a']
df.loc[df['a'] == None, 'd'] = df['b']
df.loc[~df['a'].astype('bool') & ~df['b'].astype('bool'), 'd'] = df['c']

# Attempt 3: nested np.where
df['d'] = np.where(df.a != 0, df.a,
                   np.where(df.b != 0, df.b, df.c))

3 Answers


Try this (df is your dataframe):

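# Combine the element-wise checks with & (Python's `and` raises a ValueError on a Series); notna() covers None/NaN
df['d'] = np.where(df.a.notna() & (df.a != 0), df.a, np.where(df.b.notna() & (df.b != 0), df.b, df.c))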

>>> print(df)
   a  b  c  d
0  1  0  2  1
1  0  3  2  3
2  0  0  2  2
IoaTzimas
  • I have tried this but it does not return a value in column 'd' if column a is 0 or None – DCM_paddington Nov 03 '20 at 18:26
  • I have added a check for None values. If your None values are in a different format (e.g. as the text 'None' or 'NaN'), adjust the code or let me know – IoaTzimas Nov 03 '20 at 18:30
import numpy as np
import pandas as pd

df = pd.DataFrame([[1,0,2], [0,3,2], [0,0,2]], columns = ('a','b','c'))
print(df)

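# Start from a, then overwrite the rows where a is 0 with b
df['d'] = df['a']
df.loc[df['a'] == 0, 'd'] = df['b']
# Where both a and b are 0, fall back to c (NaN is not treated as empty here)
df.loc[~df['a'].astype('bool') & ~df['b'].astype('bool'), 'd'] = df['c']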
print(df)
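
If missing values (None/NaN) should count as empty in addition to 0, one possible variant chains `Series.where` with an explicit mask. This is only a minimal sketch: the sample data and the `has_value` helper are hypothetical and not part of the answer above.

```python
import pandas as pd

# Hypothetical sample data: None marks a missing value, 0 an "empty" one.
df = pd.DataFrame({"a": [1, 0, None], "b": [0, 3, None], "c": [2, 2, 2]})

def has_value(s):
    """Return True where the entry is neither missing (None/NaN) nor 0."""
    return s.notna() & (s != 0)

# Keep a where it has a value; otherwise take b where b has a value; otherwise c.
fallback = df["b"].where(has_value(df["b"]), df["c"])
df["d"] = df["a"].where(has_value(df["a"]), fallback)
print(df)
```

Because a and b contain missing values, pandas stores them (and therefore d) as floats; if c is guaranteed to be filled, `df["d"].astype(int)` can be applied afterwards.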
Aaj Kaal

A simple one-liner would be:

df['d'] = df.replace(0, np.nan).bfill(axis=1)['a'].astype(int)

Step-by-step visualization

Convert the zeros (no value) to NaN:

     a    b  c
0  1.0  NaN  2
1  NaN  3.0  2
2  NaN  NaN  2

Now backward-fill the values along each row, so that every NaN takes the next non-empty value to its right:

     a    b    c
0  1.0  2.0  2.0
1  3.0  3.0  2.0
2  2.0  2.0  2.0

Now select the required column, i.e. 'a', which holds the first non-empty value of each row, and assign it to a new column 'd'.

Output

   a  b  c  d
0  1  0  2  1
1  0  3  2  3
2  0  0  2  2
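
If the frame holds other columns, or a, b and c are not stored in priority order, the same idea can be applied to a selected subset of columns. The following is a minimal sketch under assumptions: the extra column e, the column order b/a/e/c and the `priority` list are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: priority columns out of order, plus an unrelated column e.
df = pd.DataFrame({"b": [0, 3, 0], "a": [1, 0, 0], "e": [9, 9, 9], "c": [2, 2, 2]})

priority = ["a", "b", "c"]  # highest priority first

# Work on a reordered copy of just the priority columns; df itself is not modified here.
filled = df[priority].replace(0, np.nan).bfill(axis=1)

# After the row-wise backfill, the first column holds the first non-empty value per row.
df["d"] = filled.iloc[:, 0].astype(int)
print(df)
```

Selecting `df[priority]` returns a new frame, so the original columns stay as they are and only d is added. Note that `astype(int)` would fail for a row in which all three columns are empty, since NaN cannot be converted to an integer.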
Vishnudev Krishnadas
  • Thank you, however, there are other columns in the whole df that don't need changing. I will try to create a new df, do this and merge the dataframes again – DCM_paddington Nov 04 '20 at 10:42
  • The other columns don't really change; I have shown them for illustration only. The code works as is, creating the new column without changing other data. @DCM_paddington – Vishnudev Krishnadas Nov 05 '20 at 05:34
  • Thank you, this solved my problem. Since the columns a/b/c in the original df were in an unfortunate order (b/a/e/c/), I created a new df, used this method and added column d to the original df. – DCM_paddington Nov 09 '20 at 10:09