I would like to check the value of the row above and see if it is the same as the current row. I found a great answer here: `df['match'] = df.col1.eq(df.col1.shift())`, where col1 is the column you are comparing.

However, when I tried it, I received a SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. My col1 is a string column. I know you can suppress warnings, but how would I compare each row with the row above while making sure that I am not creating a copy of the dataframe? Even with the warning I do get my desired output, but I was curious whether there is a better way.

import pandas as pd
data = {'col1':['a','a','a','b','b','c','c','c','d','d'],
       'week':[1,1,1,1,1,2,2,2,2,2]}
df = pd.DataFrame(data, columns=['col1','week'])
df['check_condition'] = 1
while sum(df.check_condition) != 0:
    for week in df.week:
        wk = df.loc[df.week == week]
        wk['match'] = wk.col1.eq(wk.col1.shift()) # <-- where the warning occurs
        # fix the repetitive value...which I have not done yet
        # for now just exit out of the while loop
        df.loc[df.week == week,'check_condition'] = 0
smci
Jack Armstrong
  • If `df['match'] = df.col1.eq(df.col1.shift())` gives you a warning, then `df` is a copy of some bigger dataframe. And no, you may not ignore that warning: the assignment may or may not be successful. – Quang Hoang May 08 '20 at 21:35
  • Updated the question once I did a little more research. Also, I meant suppressing warnings rather than ignoring them. – Jack Armstrong May 08 '20 at 23:37
  • Related: [How to deal with SettingWithCopyWarning in Pandas?](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas). Always post the actual name of the error/exception. Then you can search other questions and answers for your solution. – smci May 09 '20 at 22:35

1 Answer

You shouldn't ignore a pandas SettingWithCopyWarning! It's telling you that your code may well not work as intended, if at all. Stop, investigate and fix it. (It's not something to filter out casually, like a pandas FutureWarning nagging about deprecation.)

Multiple issues with your code:

  • You're trying to iterate over a dataframe (but not with groupby()) and take slices of it (the sub-dataframe wk, which, yes, is a copy of a slice)...
  • then assign to the (nonexistent) new column wk['match']. This is bad; you shouldn't do this. (You could initialize df['match'] = np.nan, but it'd still be wrong to try to assign to the copy in wk)...
  • SettingWithCopyWarning is triggered when you try to assign to wk['match']. It's telling you wk is a copy of a slice from dataframe df, not df itself. Hence, as it tells you: A value is trying to be set on a copy of a slice from a DataFrame. That assignment would only get thrown away every time wk gets overwritten by your loop, so even if you could force it to work on wk, it would be wrong. That's why SettingWithCopyWarning is a code smell: you shouldn't be making a copy of a slice of df in the first place.
  • Later on, you also try to assign to column df['check_condition'] while iterating over the df; that's also bad.
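If a separate per-week frame is genuinely needed for some intermediate work, one common way to keep pandas from warning is to make the copy explicit with .copy(). The sketch below reuses the question's data; it is only an illustration of that idea, not the recommended fix for this problem, and it shows that whatever is written to wk never reaches df.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'd'],
    'week': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
})

# .copy() makes wk an independent DataFrame, so adding a column to it is
# unambiguous: the new column lives on wk only.
wk = df.loc[df.week == 1].copy()
wk['match'] = wk.col1.eq(wk.col1.shift())

print(wk)

# The original frame is untouched: it has no 'match' column.
print('match' in df.columns)
```

Because the result stays on wk, it would still have to be written back to df explicitly (for example with df.loc and the same mask), which is why a vectorized assignment on df itself is usually simpler.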

Solution:

df['check_condition'] = df['col1'].eq(df['col1'].shift()).astype(int)

df
  col1  week  check_condition
0    a     1                0
1    a     1                1
2    a     1                1
3    b     1                0
4    b     1                1
5    c     2                0
6    c     2                1
7    c     2                1
8    d     2                0
9    d     2                1

More generally, for more complicated code where you want to iterate over each group of a dataframe according to some grouping criterion, you'd use groupby() and split-apply-combine instead.

  • you're grouping by wk.col1.eq(wk.col1.shift()), i.e. rows where the col1 value doesn't change from the preceding row
  • and you want to set check_condition to 1 on those rows (as in the output above)
  • and 0 on rows where the col1 value did change from the preceding row

But in this simpler case you can skip groupby() and do a direct assignment.
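For the case where the comparison should restart within each week, as the question's loop seems to intend, a minimal groupby() sketch might look like the following. It assumes the question's data and uses a column name, match, taken from the question; the per-week reading of the requirement is my assumption.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'd'],
    'week': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
})

# Shift col1 within each week, so the first row of a week is never compared
# with the last row of the previous week.
prev_in_week = df.groupby('week')['col1'].shift()

# Assign on df itself: no intermediate slice, so nothing is set on a copy.
df['match'] = df['col1'].eq(prev_in_week)

print(df)
```

With this particular data the result coincides with the plain shift(), because col1 happens to change at the week boundary; the two would differ if the same value carried over from one week into the next.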

smci