How to change a dataframe element based on condition on another column in pandas

Question

I have looked around (e.g. here), but I can't understand why my code is not working as expected. I have a pandas dataframe and I'd like to add a column that marks the last zero element in column B above a non-zero element.

df = pd.DataFrame({'B':[0,0,1,0,1,0,0,1]})
N = len(df.index)
df['C'] = N*[False]
for i in range(N-1):
    if (df.iloc[i]['B']==0 and df.iloc[i+1]['B']>0):
        df.iloc[i]['C']=True

In spite of having the condition satisfied 3 times, column C is still all false, and I also get a warning that I don't understand:

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame

Any ideas?

you can read about the SettingWithCopyWarning [here](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas) and I think to solve it in your case, it would be `df.loc[i,'C']=True` at the last line. But your problem has a way more efficient answer to it, sure someone will answer for that :) — Ben.T, Jun 23 '20 at 17:48
df['C']=np.where(df.B.eq(0) & df.B.shift().gt(0), True,False) — BENY, Jun 23 '20 at 17:52

user · Accepted Answer · 2020-06-23T18:11:30.687

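Because the dataframe has mixed types (integers in B, booleans in C), df.iloc[i] returns row i as a copy rather than a view, so df.iloc[i]['C'] = True writes to that temporary copy and never reaches df. Instead of chained indexing, select the row and the column in a single step: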

df.iloc[i, df.columns.get_loc('C')]=True

or

df.at[i, 'C'] = True
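
As a minimal sketch, here is the loop from the question rewritten with single-step indexing. The names `n` and `result` are illustrative and not from the original post, and `df.at` relies on the default integer index, where label `i` and position `i` coincide:

```python
import pandas as pd

df = pd.DataFrame({'B': [0, 0, 1, 0, 1, 0, 0, 1]})
df['C'] = False
n = len(df)

# Stop at n - 1 because each step also reads row i + 1
for i in range(n - 1):
    # df.at addresses one cell of df directly, so the write is not lost
    if df.at[i, 'B'] == 0 and df.at[i + 1, 'B'] > 0:
        df.at[i, 'C'] = True

result = df['C'].tolist()
print(result)  # [False, True, False, True, False, False, True, False]
```

Rows 1, 3 and 6 are marked, matching the three times the condition is satisfied.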

However, I'd suggest replacing your for loop with this, which looks much simpler to me:

df['C'] = [df.iloc[i]['B'] == 0 and df.iloc[i+1]['B'] > 0 for i in range(N - 1)] + [False]

Edit: If you actually want to find only the last occurrence of a zero element directly above a non-zero element, try this:

df['C'].where(df['C']).last_valid_index()

This outputs 6
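
To see why this returns 6, a short sketch of the intermediate step may help. The Series `c` below is hard-coded to the values column C takes for the question's data, and the names `masked`, `last_match` and `all_matches` are illustrative only:

```python
import pandas as pd

# Column C for the question's data (True at rows 1, 3 and 6)
c = pd.Series([False, True, False, True, False, False, True, False], name='C')

# where() keeps entries where the condition holds and puts NaN elsewhere,
# so only the True rows remain "valid"
masked = c.where(c)

# Index label of the last non-NaN entry, i.e. the last True row
last_match = masked.last_valid_index()

# Boolean indexing lists every matching row instead of only the last one
all_matches = c[c].index.tolist()

print(last_match)   # 6
print(all_matches)  # [1, 3, 6]
```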

score 0 · Answer 2 · answered Jun 23 '20 at 17:55


Sort by index in descending order, then loop to find the first row that matches.

df=df.sort_index(ascending=False)
df['C'] = False
# Start at 1 so that i-1 never wraps around to the last row
for i in range(1, len(df)):
    if df.iloc[i-1,0] - 1 == df.iloc[i,0]:
        df.iloc[i,1] = True
        break
df=df.sort_index(ascending=True)
df

    B   C
0   0   False
1   0   False
2   1   False
3   0   False
4   1   False
5   0   False
6   0   True
7   1   False

answered Jun 23 '20 at 17:55

David Erickson


The OP said that the condition is met 3 times, so one should get 3 `True` values in C. Your method returns only one `True`, so I think there is a problem somewhere – Ben.T Jun 23 '20 at 18:02
@Ben.T you might be right. The OP also said: "the `last` zero element in column B above a non-zero element." I think better wording might be: "Any zero elements before a one element." – David Erickson Jun 23 '20 at 18:05

frank · Answer 3 · 2020-06-23T17:58:56.247


You can change df.iloc[i]['C']=True from inside your for loop to df.loc[i, 'C'] = True to make it work.

But I would rather use the following to make it a bit more efficient:

df = pd.DataFrame({'B':[0,0,1,0,1,0,0,1]})

df['Check'] = df['B'].shift(-1)
df['C'] = df['B'] < df['Check']

Out:
   B  Check      C
0  0    0.0  False
1  0    1.0   True
2  1    0.0  False
3  0    1.0   True
4  1    0.0  False
5  0    0.0  False
6  0    1.0   True
7  1    NaN  False
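
A possible variant of the same idea, sketched without the helper column and with the question's condition spelled out (zero in this row, non-zero value in the next). The names `next_b` and `result` are illustrative. Note that `shift(-1)` looks at the row below, whereas `shift()` with its default of 1 would compare against the row above:

```python
import pandas as pd

df = pd.DataFrame({'B': [0, 0, 1, 0, 1, 0, 0, 1]})

# Value of B in the following row; the last row has no successor and gets NaN
next_b = df['B'].shift(-1)

# True where this row is zero and the next row is positive.
# NaN > 0 evaluates to False, so the last row is never marked.
df['C'] = df['B'].eq(0) & next_b.gt(0)

result = df['C'].tolist()
print(result)  # [False, True, False, True, False, False, True, False]
```

Unlike the `<` comparison, this does not mark a non-zero value that is followed by a larger one, such as a 1 followed by a 2.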

edited Jun 23 '20 at 17:58

answered Jun 23 '20 at 17:57

frank


"the last zero element in column B above a non-zero element." – David Erickson Jun 23 '20 at 17:58
@DavidErickson I think the answer you posted has some issue. It does not reflect OP's logic. – frank Jun 23 '20 at 18:04

Potentially. Bear in mind, though: "the last zero element in column B above a non-zero element." I think better wording might be: "Any zero elements before a one element." – David Erickson Jun 23 '20 at 18:08
