Let's say I have the following pandas DataFrame:
import pandas as pd
df = pd.DataFrame({'one': ['Baseline', 5, 6], 'two': [10, 10, 10]})
print(df)
print(df.dtypes)
# one object
# two int64
I want to collect all of the rows in which df.one != 'Baseline', then convert the one column in this new DataFrame to an int datatype. I assumed the following would work fine, but I'm getting a SettingWithCopyWarning when I try to cast one to int:
df_sub = df[df['one'] != 'Baseline']
df_sub['one'] = df_sub['one'].astype(int)
script.py:15. SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
df_sub['one'] = df_sub['one'].astype(int)
The code seems to work fine (see below), but I'd like to know how to avoid this warning (for example, whether I should be using a different method). I'm following this question for changing the datatype of a specific column. I've also tried df_sub.loc[:, 'one'] = df_sub['one'].astype(int) and df_sub.loc[:, 'one'] = df_sub.loc[:, 'one'].astype(int), and I'm getting the same warning.
print(df_sub.dtypes)
# one int64
# two int64
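One approach that may avoid the warning is to take an explicit copy of the filtered rows before assigning to the column. This is a minimal sketch, and it assumes df_sub is meant to be independent of df. The .copy() call is the only change from the code that triggers the warning:

```python
import pandas as pd

df = pd.DataFrame({'one': ['Baseline', 5, 6], 'two': [10, 10, 10]})

# Assumption: df_sub should be its own DataFrame, not a slice tied to df.
# .copy() makes that explicit, so the assignment below targets df_sub only.
df_sub = df[df['one'] != 'Baseline'].copy()
df_sub['one'] = df_sub['one'].astype(int)

print(df_sub.dtypes)
```

With the copy in place, the original df is left unchanged and its one column keeps the object dtype.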