
I need to create new columns in a pandas DataFrame that hold a per-group sum and count. I am chaining the methods as shown below.

df['total_sum']=df.groupby('column1')['column 2'].transform('sum') 
df['total_cnt']=df.groupby('column1')['column 2'].transform('count')

But I am getting the SettingWithCopyWarning. The results are correct, but I want to avoid the warning. I have tried workarounds but could not find one that works.

Gihan Chathuranga
  • The part you posted here wouldn't cause the warning alone (if it does, please provide a [MCVE] first). Use `df.is_copy = None` before executing these lines. – ayhan Feb 03 '18 at 22:47

2 Answers


SettingWithCopyWarning was created to flag chained assignments, which are discouraged for reasons given in the pandas documentation.

The documentation also includes the warning:

The chained assignment warnings / exceptions are aiming to inform the user of a possibly invalid assignment. There may be false positives; situations where a chained assignment is inadvertently reported.
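
As a small illustration of what "chained assignment" means, here is a sketch on a made-up two-column frame (the names `a` and `b` are hypothetical, not from the question). The chained form is left as a comment because its behaviour can differ between pandas versions; the single `.loc` call is the form the pandas documentation recommends.

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Chained form: two separate indexing steps. pandas may warn here,
# and the write may land on a temporary object instead of df.
# df[df["a"] > 1]["b"] = 0

# Single-step form: one .loc call selects rows and column, then assigns.
df.loc[df["a"] > 1, "b"] = 0
print(df)
```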

In my view, your logic and what you expect from it are clear. To disable the warning you can use:

pd.options.mode.chained_assignment = None  # default='warn'

You may wish to disable it only for the part of your code which uses groupby + transform operations, so that useful warnings are not missed.
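
One way to limit the scope, sketched here on assumed sample data (the frame below is made up and the filter is only there to provoke the warning), is `pd.option_context`, which restores the previous setting when the `with` block ends. Whether the `mode.chained_assignment` option is still honoured may depend on your pandas version.

```python
import pandas as pd

# Assumed sample data, not from the question.
df0 = pd.DataFrame({
    "column1": ["A", "B"] * 6,
    "column 2": range(12),
    "column3": ["foo", "bar", "baz"] * 4,
})

# A filtered frame; assigning new columns to it is what can trigger the warning.
df = df0[df0["column3"].isin(["foo", "bar"])]

# The option is changed only inside this block and reset afterwards.
with pd.option_context("mode.chained_assignment", None):
    df["total_sum"] = df.groupby("column1")["column 2"].transform("sum")
    df["total_cnt"] = df.groupby("column1")["column 2"].transform("count")

print(df)
```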

jpp
  • I understand that the warning can be disabled and that it may not be affecting my result, but is there another way to solve this that avoids the warning? – Atharva Pargaonkar Feb 03 '18 at 18:15
  • For this purpose, I would not recommend overriding the logic / checks built into the `pandas` code base. This is messy and could be version dependent. If you use IPython, there are ways to [automate](https://pandas.pydata.org/pandas-docs/stable/options.html#setting-startup-options-in-python-ipython-environment) setting `pandas` options. – jpp Feb 03 '18 at 18:20

Are you filtering rows before adding the new columns? Or otherwise slicing your dataframe?

For example, if I run this, I do not get the warning:

import pandas as pd   
df0 = pd.DataFrame({'column1': ['A', 'B'] * 6,
                   'column 2': range(12),
                   'column3': ['foo', 'bar', 'baz'] * 4})
df = df0
# df = df0[df0.column3.isin(['foo', 'bar'])]
# df = df0[df0.column3.isin(['foo', 'bar'])].copy()
df['total_sum'] = df.groupby('column1')['column 2'].transform('sum')
df['total_cnt'] = df.groupby('column1')['column 2'].transform('count')
print(df)

But if I do this, then I get the warning:

import pandas as pd   
df0 = pd.DataFrame({'column1': ['A', 'B'] * 6,
                   'column 2': range(12),
                   'column3': ['foo', 'bar', 'baz'] * 4})
# df = df0
df = df0[df0.column3.isin(['foo', 'bar'])]
# df = df0[df0.column3.isin(['foo', 'bar'])].copy()
df['total_sum'] = df.groupby('column1')['column 2'].transform('sum')
df['total_cnt'] = df.groupby('column1')['column 2'].transform('count')
print(df)

I can suppress it by explicitly copying the dataframe when I filter its rows, as here:

import pandas as pd   
df0 = pd.DataFrame({'column1': ['A', 'B'] * 6,
                   'column 2': range(12),
                   'column3': ['foo', 'bar', 'baz'] * 4})
# df = df0
# df = df0[df0.column3.isin(['foo', 'bar'])]
df = df0[df0.column3.isin(['foo', 'bar'])].copy()
df['total_sum'] = df.groupby('column1')['column 2'].transform('sum')
df['total_cnt'] = df.groupby('column1')['column 2'].transform('count')
print(df)
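
A different technique that may also work, sketched on the same sample data, is `DataFrame.assign`. It returns a new dataframe with the extra columns, so nothing is written into the filtered frame at all. The name `filtered` is introduced here only for illustration.

```python
import pandas as pd

df0 = pd.DataFrame({'column1': ['A', 'B'] * 6,
                    'column 2': range(12),
                    'column3': ['foo', 'bar', 'baz'] * 4})

# Filter first; 'filtered' itself is never modified below.
filtered = df0[df0.column3.isin(['foo', 'bar'])]
grouped = filtered.groupby('column1')['column 2']

# assign() builds and returns a new dataframe with the added columns.
df = filtered.assign(total_sum=grouped.transform('sum'),
                     total_cnt=grouped.transform('count'))
print(df)
```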
jondo