6

Suppose I have four successively arranged columns as a part of a data frame and I want to replace all the negative values in these 4 columns by another value (-5 let's say), how do I do it?

T1   T2  T3  T4
20   -5  4   3
85  -78  34  21
-45  22  31  75
-6   5   7  -28

Logically, I was hoping this would work. But, it doesn't.

for i in df.iloc[:,df.columns.get_loc("T1"):df.columns.get_loc("T1")+4]<0:
    for j in df[i]:
        if j<0:
            j=-5
Mihai Alexandru-Ionut
  • 47,092
  • 13
  • 101
  • 128
hegdep
  • 596
  • 1
  • 6
  • 16

3 Answers3

5

You can just use indexing by applying a condition statement.

cols = ['T1','T2','T3','T4']
df[df[cols] < 0] = -5

Output

In [35]: df
Out[35]:
     T1  T2  T3  T4
  0  20  -5   4   3
  1  85  -5  34  21
  2  -5  22  31  75
  3  -5   5   7  -5

In your example you're just replacing the value of variable. You need to replace one cell's value using at method.

for i in df.iloc[:,df.columns.get_loc("T1"):df.columns.get_loc("T1")+4]<0:
for index, j in enumerate(df[i]):
    if j<0:
        df.at[index, i] = -5
Mihai Alexandru-Ionut
  • 47,092
  • 13
  • 101
  • 128
  • I edited my question. My data frame has 200+ columns and I want to edit only 4/5 columns arranged successively. – hegdep May 03 '18 at 06:37
  • Thanks! This is really simple! Just wondering, though. Why doesn't it work when I write it out the way I did? – hegdep May 03 '18 at 06:44
  • @hegdep, because in your example you just replace the value of variable. You can use `at` method in order to update the cell value. Have a look to updated answer. – Mihai Alexandru-Ionut May 03 '18 at 07:02
3

Use mask - for all columns:

df = df.mask(df < 0, -5)

Or if need specify columns:

cols = ['T1','T2','T3','T4']
df[cols] = df[cols].mask(df[cols] < 0, -5)
print (df)
   T1  T2  T3  T4
0  20  -5   4   3
1  85  -5  34  21
2  -5  22  31  75
3  -5   5   7  -5

With similar approach with select by positions like in question:

pos = df.columns.get_loc("T1")
df.iloc[:,pos:pos+4] = df.iloc[:,pos:pos+4].mask(df < 0, -5)

EDIT:

Looping solutions in pandas are not recommended, if exist vectorized solution, because slow.

But it is possible, your code should be change with iterrows and iteritems:

for i, row in df.iloc[:,df.columns.get_loc("T1"):df.columns.get_loc("T1")+4].iterrows():
    for j,k in row.iteritems():
        if k < 0:
            df.loc[i, j] = -5
print (df)
   T1  T2  T3  T4
0  20  -5   4   3
1  85  -5  34  21
2  -5  22  31  75
3  -5   5   7  -5
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
2

You can use indexing:

c = ['T1','T2','T3','T4']
df[df[c] < 0] = -5

Or clip:

In [47]: df[c].clip(lower=-5)
Out[47]:
   T1  T2  T3  T4
0  20  -5   4   3
1  85  -5  34  21
2  -5  22  31  75
3  -5   5   7  -5
user3483203
  • 50,081
  • 9
  • 65
  • 94
  • I edited my question. My data frame has 200+ columns and I want to edit only 4/5 columns arranged successively. – hegdep May 03 '18 at 06:37