I am trying to add a column D to the df below based on a condition: if column C is Shanghai, then D is Asia; if column C is SFA, then D is America, and so on. Taking these two as an example, my code is as follows:

  A     B   C      
0 Joe   23   SFA
1 Amy   40   SFA
2 Jenny 34   SFA
3 Kitty  20  Shanghai
4 David  19  Shanghai
...

code:

import numpy as np

df['D'] = np.where(
    df['C'] == 'SFA', 'America',
    np.where(df['C'] == 'Shanghai', 'Asia', 'Other')
)

But it keeps raising an error: KeyError: 'C'. I have no idea why I always get this error, as I am pretty sure the data frame is a pandas DataFrame and column C has been converted to string. Can anyone provide any insight?

  • It looks like your code for conditionally creating the column is correct, but the column name has some spaces in it. You can verify that this is the issue by printing out `df.columns.tolist()`. You can fix it using `df.columns = df.columns.str.strip()`. – cs95 Nov 02 '20 at 03:08
  • It gives me the same error after doing this. When I apply the first command, it doesn't return column C, but another one that I don't need. – Jennie Nov 02 '20 at 03:22
  • What does `print(df.columns.tolist())` return? Please copy-paste it here. – cs95 Nov 02 '20 at 03:24
  • It returns another column, ['Amount'], which I didn't post in this dataframe. I don't need this column for now. – Jennie Nov 02 '20 at 03:25
  • So it returns [A, B, C, Amount]? Can you paste the output here? – cs95 Nov 02 '20 at 03:27
  • It doesn't return any dataframe, just the column name: ['Amount']. Nothing else. – Jennie Nov 02 '20 at 03:28
  • So you are trying to access column names that don't exist, don't you see the issue there? – cs95 Nov 02 '20 at 03:29
  • If these are in fact levels in the index and not actual column names, you should reset the index first: `df = df.reset_index()` – cs95 Nov 02 '20 at 03:30
  • I am pretty sure all the columns are in the dataframe. They appear with the exact names I put in the code, and they are not index levels. – Jennie Nov 02 '20 at 03:44
  • In that case it should be reflected in the output of `print(df.columns.tolist())`. At this point we are just talking in circles, so without more context I'm afraid your issue is not reproducible for anyone here, sorry. – cs95 Nov 02 '20 at 03:48
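
The checks suggested in the comments above can be sketched end to end as follows. This is only an illustration under stated assumptions: the frame is rebuilt by hand from the sample rows in the question, and the padded column names (`' B'`, `' C '`) are hypothetical, chosen to reproduce a `KeyError: 'C'`. The asker's real frame reportedly showed only `['Amount']`, so the actual cause may be different.

```python
import numpy as np
import pandas as pd

# Hypothetical data modelled on the question's sample rows. The padded
# labels ' B' and ' C ' are an assumption, used only to trigger the KeyError.
df = pd.DataFrame({
    'A': ['Joe', 'Amy', 'Jenny', 'Kitty', 'David'],
    ' B': [23, 40, 34, 20, 19],
    ' C ': ['SFA', 'SFA', 'SFA', 'Shanghai', 'Shanghai'],
})

# Step 1: look at the labels pandas actually holds, not the printed table.
print(df.columns.tolist())

# Step 2: strip stray whitespace from the column labels.
df.columns = df.columns.str.strip()

# Step 3: if 'C' is an index level rather than a column, move it back.
if 'C' not in df.columns and 'C' in df.index.names:
    df = df.reset_index()

# Step 4: with 'C' available, the nested np.where from the question works.
df['D'] = np.where(
    df['C'] == 'SFA', 'America',
    np.where(df['C'] == 'Shanghai', 'Asia', 'Other'),
)
print(df)
```

If `'C'` is still missing from `df.columns.tolist()` after steps 2 and 3, the column is simply not in that frame, and the place to look is the code that built or loaded `df`.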

0 Answers