Let's say I have a data frame with numerical values, like the one below:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 4), columns=list('ABCD'))

    A            B           C           D
0   1.148867    -2.332343   -0.168327   -0.001228
1   -0.575731   0.905931    -0.722896   -0.316320
2   1.487290    0.797067    0.485837    -0.111441
3   -1.176389   -0.734691   -0.928221   -0.163423
4   1.866434    1.390055    -0.686367   -1.608775
5   -0.148878   0.459058    2.147155    2.256669
6   -0.413589   0.261600    1.565556    -1.587567
7   -0.924501   -0.712473   -0.422530   -2.729229
8   -0.309152   -0.094097   -1.216532   -2.607139
9   0.069348    0.288499    0.801205    0.162862

Then I create a boolean data frame:

dfBool = df > 1

    A       B       C       D
0   True    False   False   False
1   False   False   False   False
2   True    False   False   False
3   False   False   False   False
4   True    True    False   False
5   False   False   True    True
6   False   False   True    False
7   False   False   False   False
8   False   False   False   False
9   False   False   False   False

Now I would like to use the boolean data frame to replace every value greater than one in the first data frame with zero. So I did the following:

df[dfBool] = 0

It works, but I get the following warning:

<ipython-input-246-322bdc88170a>:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  dataTest[outliers] = 0
c:\users\...\programs\python\python38\lib\site-packages\pandas\core\frame.py:2986: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._where(-key, value, inplace=True)

What am I missing here?

yannk
    Your code didn't generate the `SettingWithCopyWarning` for me. Take a look at this [post](https://stackoverflow.com/q/20625582/6361531) for an explanation of the warning message. Note that it is a warning, not an error. – Scott Boston Jul 10 '20 at 21:14
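
The warning text mentions `dataTest[outliers] = 0` rather than `df[dfBool] = 0`, so one possible explanation is that the real frame was taken as a slice of a larger one. The sketch below assumes exactly that: `raw` and its values are hypothetical, since the question does not show how `dataTest` was built. Taking an explicit `.copy()` makes the frame independent of its parent before assigning into it.

```python
import pandas as pd

# Hypothetical parent frame; the question does not show how dataTest was created.
raw = pd.DataFrame({
    "A": [1.5, -0.5, 2.0],
    "B": [0.25, 3.0, -1.0],
    "C": [9.0, 9.0, 9.0],
})

# An explicit copy makes dataTest independent of raw, so the boolean
# assignment below modifies only dataTest and leaves raw unchanged.
dataTest = raw[["A", "B"]].copy()

outliers = dataTest > 1
dataTest[outliers] = 0
```

If the frame in question was not derived from another one, this sketch does not apply and the cause lies elsewhere.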

1 Answer

IIUC, try:

df.mask(df > 1, 0)

Output:

          A         B         C         D
0  0.000000 -2.332343 -0.168327 -0.001228
1 -0.575731  0.905931 -0.722896 -0.316320
2  0.000000  0.797067  0.485837 -0.111441
3 -1.176389 -0.734691 -0.928221 -0.163423
4  0.000000  0.000000 -0.686367 -1.608775
5 -0.148878  0.459058  0.000000  0.000000
6 -0.413589  0.261600  0.000000 -1.587567
7 -0.924501 -0.712473 -0.422530 -2.729229
8 -0.309152 -0.094097 -1.216532 -2.607139
9  0.069348  0.288499  0.801205  0.162862
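
Note that `df.mask` returns a new data frame and leaves `df` itself untouched, so the result has to be assigned to a name to be kept. A minimal sketch, assuming a small hand-made frame (the values are illustrative, not the random data from the question):

```python
import pandas as pd

# Illustrative data (an assumption, not the frame from the question).
df = pd.DataFrame({"A": [1.5, -0.5, 2.0], "B": [0.25, 3.0, -1.0]})

# mask() replaces values where the condition is True and returns a new frame.
result = df.mask(df > 1, 0)

# where() is the complement: it keeps values where the condition is True
# and replaces the rest, so this produces the same frame.
same = df.where(df <= 1, 0)
```

Writing `df = df.mask(df > 1, 0)` rebinds the name to the new frame, which gives the same end result as the in-place assignment from the question.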
Scott Boston
    @ggorlen true. However, using the above gives me the correct result without the warning. Thank you – yannk Jul 10 '20 at 21:13