**Using Pandas 1.4.2, Python 3.9.12**

I have a data set where the column values are represented as 0 or 1 which stand for 'No' and 'Yes', respectively.

  Scholarship     Hipertension   Diabetes   Alcoholism  SMS_received    
0     0               1              0          0            0  
1     0               0              0          0            0  
2     0               0              0          0            0  
3     0               0              0          0            0  
4     0               1              1          0            0  

I am attempting to create a custom function to replace the 0's and 1's all at once with 'No' and 'Yes', respectively.

What I have written at this point is as follows:

def replace_values(data_frame, column, being_replaced, replacement_value):
    # Replace the given values in the selected column(s) of the frame passed in
    data_frame[column] = data_frame[column].replace(to_replace=being_replaced, value=replacement_value)
    return data_frame

Ideally, I would pass in all the column names, the values to replace, and the replacement values, so that the function does everything in one fell swoop. For example:

replace_values(df, [*list_of_columns*], [0, 1], ['No', 'Yes'])

Is this even possible? Do I need to put a loop in there as well? I have tried it a couple times with only one column name as opposed to a list and it works, but it replaces every 0 and 1 with 'No' and 'Yes' regardless of column, which is great, but not what I am trying to do. Any help is appreciated.

  • I see the data_frame[column] = df[column] issue. It is a typo, in my code it reads as: data_frame[column] = data_frame[column] – Wildo_Baggins311 Jul 07 '22 at 23:49
  • Does this answer your question? [Remap values in pandas column with a dict, preserve NaNs](https://stackoverflow.com/questions/20250771/remap-values-in-pandas-column-with-a-dict-preserve-nans) – Ignatius Reilly Jul 08 '22 at 00:00

2 Answers

Here are a couple of solutions.

Use replace, which returns a new DataFrame rather than modifying df in place:

df.replace({1: 'Yes', 0: 'No'})
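
Calling replace on the whole frame changes every 0 and 1 it finds. To restrict it to chosen columns, as the question asks, one option is to select those columns first. This is only a minimal sketch: the sample data and the `Age` column are made up for illustration, and the function signature simply mirrors the one in the question.

```python
import pandas as pd

def replace_values(data_frame, columns, being_replaced, replacement_values):
    """Return a copy with values replaced only in the given columns."""
    # Pair each old value with its replacement, e.g. {0: "No", 1: "Yes"}.
    mapping = dict(zip(being_replaced, replacement_values))
    result = data_frame.copy()
    result[columns] = result[columns].replace(mapping)
    return result

df = pd.DataFrame({
    "Scholarship": [0, 0, 1],
    "Diabetes": [0, 1, 0],
    "Age": [0, 1, 62],  # hypothetical column that must stay numeric
})
converted = replace_values(df, ["Scholarship", "Diabetes"], [0, 1], ["No", "Yes"])
```

Here `Age` keeps its numeric 0 and 1 while the two selected columns become 'No'/'Yes'. The original df is left untouched because the function works on a copy.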

Use where, which keeps each value where the condition in the first argument is True and replaces every other value with the second argument:

df = df.where(df == 1, 'No')
df = df.where(df == 'No', 'Yes')
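
The two where calls can be easier to follow with the intermediate result spelled out. A small sketch on made-up data (the names `step1` and `labels` are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({"Hipertension": [1, 0, 1], "Diabetes": [0, 0, 1]})

# Step 1: keep cells equal to 1, overwrite every other cell with "No".
step1 = df.where(df == 1, "No")

# Step 2: keep the "No" cells, overwrite the remaining cells (the 1s) with "Yes".
labels = step1.where(step1 == "No", "Yes")
```

After the first step the frame holds a mix of 1 and 'No'; the second step turns the surviving 1s into 'Yes'. Note that this treats anything other than 1 as 'No', so it only suits columns that really contain just 0 and 1.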

Use boolean masking:

df[df == 0] = 'No'
df[df == 1] = 'Yes'
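
If only some columns should be relabelled, the same masking idea can be applied to a subset and written back. This is a sketch under assumptions: the data and the `Age` column are hypothetical, and the explicit cast to object is a precaution so that strings are not written into integer columns implicitly.

```python
import pandas as pd

df = pd.DataFrame({
    "Alcoholism": [0, 1, 0],
    "SMS_received": [1, 1, 0],
    "Age": [0, 1, 62],  # hypothetical column that should not be relabelled
})
cols = ["Alcoholism", "SMS_received"]

# Work on an object-dtype copy of the chosen columns so strings can be stored.
subset = df[cols].astype(object)
subset[subset == 0] = "No"
subset[subset == 1] = "Yes"

# Write the relabelled columns back; "Age" is left as it was.
df[cols] = subset
```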
Qdr
  • Thank you! I am still so new I didn't realize I could use 'replace' to do everything at once and didn't need to make my own function. Much easier to use your first method, but they all are super helpful. – Wildo_Baggins311 Jul 08 '22 at 00:20
  • Welcome! pandas is full of methods that do everything. Just go to the pandas API reference and you will find everything with examples. – Qdr Jul 08 '22 at 00:22

This should work for you:

def replace_values(data_frame):
  return data_frame.astype(bool)

Or, since you want to be able to specify the column names, you can try something like this:

def replace_values(data_frame, list_of_columns):
  for col in list_of_columns:
    data_frame[col] = data_frame[col].astype(bool)
  return data_frame
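
Note that astype(bool) yields True/False rather than the 'Yes'/'No' strings the question asks for. If the text labels are needed, one possible follow-up is to map the booleans with Series.map. A sketch with made-up sample data:

```python
import pandas as pd

def replace_values(data_frame, list_of_columns):
    # Convert 0/1 to False/True, then map the booleans to text labels.
    for col in list_of_columns:
        data_frame[col] = data_frame[col].astype(bool).map({True: "Yes", False: "No"})
    return data_frame

df = pd.DataFrame({"Scholarship": [0, 1, 0], "Diabetes": [1, 0, 0]})
df = replace_values(df, ["Scholarship"])
```

Only `Scholarship` is converted here; `Diabetes` keeps its 0/1 values. Because astype(bool) treats any non-zero number as True, this is only appropriate for columns that contain nothing but 0 and 1.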
Grg Alx