Let's say I have the following pandas DataFrame:

import pandas as pd

df = pd.DataFrame({'one': ['Baseline', 5, 6], 'two': [10, 10, 10]})
print(df)
print(df.dtypes)
#  one    object
#  two     int64

I want to collect all of the rows where df.one != 'Baseline', then convert the one column of this new DataFrame to an int dtype. I assumed the following would work, but I get a SettingWithCopyWarning when I cast the one column to int:

df_sub = df[df['one'] != 'Baseline']
df_sub['one'] = df_sub['one'].astype(int)

script.py:15. SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
   df_sub['one'] = df_sub['one'].astype(int)

The code seems to work fine (see below), but I'd like to know how to avoid this warning (for example, should I be using a different method?). I'm following this question for changing the datatype of a specific column. I've also tried df_sub.loc[:, 'one'] = df_sub['one'].astype(int) and df_sub.loc[:, 'one'] = df_sub.loc[:, 'one'].astype(int), and I get the same warning.

print(df_sub.dtypes)
#  one     int64
#  two     int64
Johnny Metz

1 Answer

To avoid the warning, make an explicit copy of the filtered DataFrame:

# .copy() gives df_sub its own data; without it, pandas treats df_sub as a slice of df and warns when you assign to it.
df_sub = df[df['one'] != 'Baseline'].copy()
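
For completeness, here is a minimal end-to-end sketch using the DataFrame from the question. It assumes 'Baseline' is the only non-numeric value in the one column, so the remaining rows cast cleanly to int.

```python
import pandas as pd

df = pd.DataFrame({'one': ['Baseline', 5, 6], 'two': [10, 10, 10]})

# Filter out the 'Baseline' row and take an explicit copy,
# so df_sub is an independent DataFrame rather than a slice of df.
df_sub = df[df['one'] != 'Baseline'].copy()

# Assigning to a column of the copy no longer triggers SettingWithCopyWarning.
df_sub['one'] = df_sub['one'].astype(int)

print(df_sub.dtypes)  # 'one' is now an integer dtype
print(df.dtypes)      # the original 'one' column is still object
```

Because df_sub is a separate object, the cast does not touch df: its one column keeps the object dtype and the 'Baseline' row is still there.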
Quentin