I have two columns in my data frame: “adult” is the number of adults in a hotel room and “children” is the number of children in the room.

I want to create a new column based on these two. For example, if df['adult'] == 2 and df['children'] == 0, the value of the new column would be "couple with no children". If df['adult'] == 2 and df['children'] == 1, the value would be "couple with 1 child".

I have a large amount of data, so the code needs to run fast.

Any advice? This is a sample of the inputs and the output that I need.

adult  children  family_status
2      0         "Couple without children"
2      0         "Couple without children"
2      1         "Couple with one child"
Mishal
y_e

2 Answers

Use np.select

df
   adult  children
0      2         0
1      2         0
2      2         1

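import numpy as np

# One boolean mask per category, evaluated on whole columns at once
condlist = [(df['adult']==2) & (df['children']==0),(df['adult']==2) & (df['children']==1)]
choicelist = ['couple with no children','couple with 1 child']
# The third argument is the default for rows that match no condition
df['family_status'] = np.select(condlist,choicelist,np.nan)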
df
   adult  children            family_status
0      2         0  couple with no children
1      2         0  couple with no children
2      2         1      couple with 1 child
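
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same `np.select` approach. The sample frame, the extra fourth row, and the "other" fallback label are my own assumptions, not part of the question. I used a string default on purpose: as far as I know, combining string choices with a float default such as `np.nan` gives either the string 'nan' or a TypeError depending on the NumPy version, so a string (or `None`) is the safer choice.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the last row matches neither condition
df = pd.DataFrame({"adult": [2, 2, 2, 3], "children": [0, 0, 1, 2]})

# Each condition is a boolean Series; wrap every comparison in parentheses
# because & binds more tightly than ==
conditions = [
    (df["adult"] == 2) & (df["children"] == 0),
    (df["adult"] == 2) & (df["children"] == 1),
]
labels = ["couple with no children", "couple with 1 child"]

# The first matching condition wins; unmatched rows get the default
# ("other" is an assumed placeholder label)
df["family_status"] = np.select(conditions, labels, default="other")
print(df)
```

Adding more categories only means appending a mask to `conditions` and a label to `labels`, in the same order.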
Community
Ch3steR
  • Thanks, I tried it but I got this error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). – y_e May 23 '20 at 15:47
  • @yasi_ensaf My bad, I messed up the parentheses while editing. Check the edited answer. ;) – Ch3steR May 23 '20 at 15:52
  • I edited the answer to handle cases that kriti mentioned too. ;) It will add `np.nan` – Ch3steR May 23 '20 at 15:58

You can try:

df['family_status'] = df.apply(lambda x: 'adult with no child' if (x['adult']==2 and x['children']==0)  
                        else ( 'adult with 1 child' 
                              if (x['adult']==2 and x['children']==1) else ''), axis=1)

Hope this will help you!!

Kriti Pawar
  • Check this [Avoiding apply](https://stackoverflow.com/questions/54432583/when-should-i-ever-want-to-use-pandas-apply-in-my-code). Try to avoid `df.apply` as much as possible. ;) Not saying your answer is bad. – Ch3steR May 23 '20 at 16:00
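
Following up on the comment about avoiding `df.apply`: one possible way to get the same labels as this answer without a row-wise lambda is boolean-mask assignment with `.loc`. This is only a sketch under my own assumptions (the sample frame and the extra fourth row are made up), and it is one of several vectorized options, not the only one.

```python
import pandas as pd

# Hypothetical sample data; the last row matches neither rule
df = pd.DataFrame({"adult": [2, 2, 2, 1], "children": [0, 0, 1, 0]})

# Start with the same fallback the apply version uses (empty string)
df["family_status"] = ""

# Each mask is computed on whole columns, so no Python-level loop over rows
two_adults = df["adult"] == 2
df.loc[two_adults & (df["children"] == 0), "family_status"] = "adult with no child"
df.loc[two_adults & (df["children"] == 1), "family_status"] = "adult with 1 child"
print(df)
```

On large frames this kind of column-wise assignment generally avoids the per-row overhead of `apply(..., axis=1)`, though it is worth timing both on your own data.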