51

I know there are tons of posts about this warning, but I couldn't find a solution to my situation. Here's my code:

df.loc[:, 'my_col'] = df.loc[:, 'my_col'].astype(int)
#df.loc[:, 'my_col'] = df.loc[:, 'my_col'].astype(int).copy()
#df.loc[:, 'my_col'] = df['my_col'].astype(int)

It produces the warning:

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

Even though I changed the code as suggested, I still get this warning. All I need to do is convert the data type of one column.

**Remark:** Originally the column is of type float with one decimal place (example: 4711.0). I therefore convert it to integer (4711) and then to string ('4711'), just to remove the decimal.

Appreciate your help!

Update: The warning was a side effect of filtering the original data just before this step. I was missing the DataFrame.copy(). Adding the copy solved the problem!

df = df[df['my_col'].notnull()].copy()
df.loc[:, 'my_col'] = df['my_col'].astype(int).astype(str)
#df['my_col'] = df['my_col'].astype(int).astype(str) # works too!
Matthias
  • The warning is a bit confusing; the problem is obviously in the line of code before `df.loc[:, 'my_col'] = df.loc[:, 'my_col'].astype(int)` – jezrael Apr 09 '18 at 08:22
  • The line before is from [my question](https://stackoverflow.com/questions/49673345/python-pandas-get-rows-of-a-dataframe-where-a-column-is-not-null) from last week: `df = df[df['my_col'].notnull()]` – Matthias Apr 09 '18 at 08:24
  • Obviously the problem is with the filtering; you need `df = df[df['col'] > 10].copy()` – jezrael Apr 09 '18 at 08:24
  • So how does `df = df[df['my_col'].notnull()].copy()` work? – jezrael Apr 09 '18 at 08:25
  • @jezrael you're my hero of the day. That's it! – Matthias Apr 09 '18 at 08:26
  • By the way, are the df names the same? `df = df[df['my_col'].notnull()]`, not `df = df1[df1['my_col'].notnull()]`? – jezrael Apr 09 '18 at 08:31
  • It's inside a function. The original df is the function argument and the conversion is done inside the function. But you're right, I should avoid using the same name. – Matthias Apr 09 '18 at 08:33
  • ?? I don't get what you're trying to say. I'm loading data, starting to filter and convert and apply some logic. For this specific warning, adding the copy() solved it. – Matthias Apr 09 '18 at 08:38
  • No problem, good luck! Not sure I explained it well :) – jezrael Apr 09 '18 at 08:39

3 Answers

46

I think you need to copy and can omit loc when selecting the column:

df = df[df['my_col'].notnull()].copy()
df['my_col'] = df['my_col'].astype(int).astype(str)

Explanation:

Without the copy, if you modify values in the filtered df later, you will find that the modifications do not propagate back to the original data (the unfiltered df), and pandas warns about this.
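
A minimal sketch of the pattern above, assuming a small made-up frame. The names `raw` and `filtered` and the sample values are only illustrative and are not from the question; the point is that the filtered result is copied before the column is converted, so the original data stays untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a float column with a missing value
raw = pd.DataFrame({"my_col": [4711.0, np.nan, 42.0], "other": ["a", "b", "c"]})

# Filter and take an explicit copy. Without .copy(), pandas versions that
# emit SettingWithCopyWarning may warn on the assignment below.
filtered = raw[raw["my_col"].notnull()].copy()

# float -> int -> str drops the trailing ".0"
filtered["my_col"] = filtered["my_col"].astype(int).astype(str)

print(filtered["my_col"].tolist())
print(raw["my_col"].dtype)
```

Using a separate name for the filtered frame (rather than reusing `df`) also makes it easier to see which object is being modified.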

jezrael
36

Another way is to turn off the chained-assignment check, which silences the warning for your code without the need to create a copy:

# disable chained assignments
pd.options.mode.chained_assignment = None 
sudonym
8

If you need to change the data type of a single column, it's easier to address that column directly:

df['my_col'] = df['my_col'].astype(int)

Or using .assign:

df = df.assign(my_col=lambda d: d['my_col'].astype(int))

.assign is useful if you need the conversion only once and don't want to alter your df outside of that scope.
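
A small sketch of the `.assign` variant, assuming made-up sample values and a hypothetical constant `COL` that holds the column name. Since `.assign` takes keyword arguments, a name stored in a variable can be passed by unpacking a dictionary.

```python
import pandas as pd

COL = "my_col"  # hypothetical constant holding the column name

df = pd.DataFrame({COL: [4711.0, 42.0]})

# .assign returns a new DataFrame; df itself is left unchanged.
# The dictionary is unpacked so the column name can come from a variable.
converted = df.assign(**{COL: lambda d: d[COL].astype(int)})

print(converted[COL].tolist())
print(df[COL].dtype)
```

Here `converted` carries the integer column while `df` keeps its float column, which is the "don't alter your df" behaviour described above.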

Guybrush
  • Nope, I got the same warning when using your first line of code. See commented line in my original posting. The second one is not usable since the name of the column is a static variable that cannot be used in an assign statement. – Matthias Apr 09 '18 at 08:25
  • Sorry, you are right. First line works. The problem for my warning was a filtering of the original data. Updated my posting! – Matthias Apr 09 '18 at 08:29
  • You can use a dynamic variable name in an `assign` statement by passing a dictionary and using ** on it: `df.assign(**{x: 1})` for example, assuming the content of `x` is a valid Python identifier. – Guybrush Apr 09 '18 at 08:35
  • "Sorry, you are right. First line works. The problem for my warning was a filtering of the original data." That is the most important information here. – Ereghard May 11 '22 at 13:11