Find specific field values in CSV (numerical) and convert them to text values

Question

My CSV file is of this format:

sidebars,notes,riskOthers,seriousEvents,goodCatches,harms
,SAFE; 2 moveouts; 0 discharges; ED patient awaiting bed in MAT,0,0,0,0
,Staffing,0,0,0,0
,,1,0,0,0
,,0,0,0,0
,,0,0,0,0
,Staffing needs,0,0,0,0
,Safe,1,0,0,0
,1- 1-1/ Staffing @ 3p- 7a,0,0,0,0
SB- Central Stores,,2,0,0,0
SB - ED Dr. G,,0,0,0,0
,,0,0,0,0
,1 pt in restraints,0,0,0,0
,1 Pt in Restraints,0,0,0,0
SB- Pharmacy,@ Risk - Staffing/ Security with Pt who had drug paraphernalia/ 1-1-1,1,0,0,0

I want to select the values in the last four columns that are greater than 1 and replace them with 1. This is code I tried but I failed.

data = pd.read_csv('reordered.csv')
df = pd.DataFrame(data, columns = ['sidebars','notes','riskOthers','seriousEvents', 'goodCatches', 'harms'])

# Values to find and their replacements
findL = ['3', '2', '4', '5', '6']
replaceL = ['1', '1', '1', '1', '1']

# Select column (can be A,B,C,D)
col = 'riskOthers';

# Find and replace values in the selected column
df[col] = df[col].replace(findL, replaceL)

Here, in this code I am trying to replace all the values greater than 1 to 1. But I get the type mismatch error.

The lists findL and replaceL are both string data types. Where in the file you have the last four columns as integer datatypes. There is a more efficient way to get your goal, but as a start use `findL=[3,2,4,5,6]` & `replaceL=[1,1,1,1,1]` This will at least show you how your current code is failing. — jtweeder, Oct 26 '18 at 14:05
Also what about the values that are 1, do these get changed to the column names as well?? If you can add a little goal output example that may help to communicate what you end goal of this effort would be. — jtweeder, Oct 26 '18 at 14:12
You are right. I was using string in place of int. Thank you for pointing it out. — ms1941, Oct 26 '18 at 14:23

score 1 · Answer 1 · answered Oct 26 '18 at 14:30

1

Here's a vectorised approach via pd.DataFrame.mask:

values = df.iloc[:, -4:]
df.iloc[:, -4:] = values.mask(values > 1, 1)

print(df.iloc[:, -4:])

    riskOthers  seriousEvents  goodCatches  harms
0            0              0            0    0.0
1            0              0            0    0.0
2            1              0            0    0.0
3            0              0            0    0.0
4            0              0            0    0.0
5            0              0            0    0.0
6            1              0            0    0.0
7            0              0            0    0.0
8            1              0            0    0.0
9            0              0            0    0.0
10           0              0            0    0.0
11           0              0            0    0.0
12           0              0            0    0.0
13           1              0            0    NaN

answered Oct 26 '18 at 14:30

jpp

159,742
34
281
339

Is there a way to replace the string value '1-1' to '1:1' in the first two columns? – ms1941 Oct 26 '18 at 15:13
Have you tried [any of the answers here](https://stackoverflow.com/questions/23307301/pandas-replacing-column-values-in-dataframe)? – jpp Oct 26 '18 at 15:18
It does not work in my case as the field values are sentences, it should search for the string and then replace. – ms1941 Oct 26 '18 at 16:10
@MeghanaSathish, Since this is a different question, I'd advise you ask a new question, **showing all your failed attempts**. – jpp Oct 26 '18 at 16:11

score 0 · Accepted Answer · answered Oct 26 '18 at 14:05

0

Try mapping your df[col] and applying a lambda function. For example:

df[col].map(lambda x: 1 if x > 1 else 0)

answered Oct 26 '18 at 14:05

abadraja

231
2
3
8

Depending on your requirements, you can change the condition to your needs. – abadraja Oct 26 '18 at 14:06
Is there a way to replace the string value '1-1' to '1:1' in the first two columns? – ms1941 Oct 26 '18 at 14:26
Note this answer is not recommended with Pandas as it involves a Python-level loop via `pd.Series.map`. – jpp Oct 26 '18 at 14:40

Find specific field values in CSV (numerical) and convert them to text values

2 Answers2