
I couldn't find an answer to this in the existing SettingWithCopyWarning questions, because the common .loc solution doesn't seem to apply. I'm loading a table into pandas and then trying to create some mask columns based on values in other columns. For some reason, this raises a SettingWithCopyWarning even when I wrap the test in a pd.Series constructor.

Here's the relevant code. The output at the end seems to be right, but does anyone know what would be causing this?

import pandas as pd

# clustered_names (the list of column names) is defined elsewhere
all_invs = pd.read_table('all_quads.inv.bed', index_col=False,
                         header=None, names=clustered_names)

invs = all_invs[all_invs['uniqueIDs'].str.contains('p1')]
with open('success_samples.list') as f:
    samples = [line.strip() for line in f]

for sample in samples:
    invs[sample] = invs['uniqueIDs'].str.contains(sample)  # warning raised here

It happens with another boolean test as well.

invs["%s_private_denovo" % proband] = pd.Series(
    invs[proband] & ~invs[father] & ~invs[mother] &
    invs["%s_private" % proband])

Thanks!

Matt Stone
    possible duplicate of [Setting values on a copy of a slice from a DataFrame](http://stackoverflow.com/questions/31468176/setting-values-on-a-copy-of-a-slice-from-a-dataframe) – firelynx Jul 30 '15 at 08:03

2 Answers


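The warning is most likely caused by invs: it is created by boolean-indexing all_invs, so pandas cannot tell whether assigning a new column to it is also meant to modify all_invs. To resolve that, copy it explicitly like this: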

invs = all_invs[all_invs['uniqueIDs'].str.contains('p1')].copy()
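
As a self-contained illustration of the explicit copy, here is a minimal sketch. The toy table and the sample names are made up for the example and only stand in for the BED file and sample list from the question.

```python
import pandas as pd

# Hypothetical stand-in for the table loaded from 'all_quads.inv.bed'.
all_invs = pd.DataFrame({
    "uniqueIDs": ["s1_p1,s2_p1", "s1_fa", "s2_p1", "s1_p1"],
})
samples = ["s1", "s2"]  # hypothetical sample names

# Filter with a boolean mask, then copy so invs is an independent DataFrame.
invs = all_invs[all_invs["uniqueIDs"].str.contains("p1")].copy()

# Adding mask columns now clearly targets invs and never all_invs.
for sample in samples:
    invs[sample] = invs["uniqueIDs"].str.contains(sample)

print(invs)
```

Because invs is its own object after the copy, the original all_invs keeps only its uniqueIDs column.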
Jihun

This is a copy of the selected answer from this post.

This warning appears because your dataframe x is a copy of a slice. It is not always easy to see why, but it depends on how x was derived from another dataframe.

You can either create a proper dataframe out of x by doing

x = x.copy()

This will remove the warning, but it is not the proper way!

You should be using the DataFrame.loc method, as the warning suggests, like this:

x.loc[:, 'Mass32s'] = x['Mass32'].rolling(5).mean().shift(-2)
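
For reference, here is a small runnable sketch of that rolling computation. It uses Series.rolling(...).mean(), which newer pandas releases provide in place of the top-level rolling_mean function. The data are invented for illustration, and only the column names Mass32 and Mass32s come from the line above.

```python
import pandas as pd

# Hypothetical data; x here is a standalone DataFrame, not a slice of another one.
x = pd.DataFrame({"Mass32": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]})

# 5-point rolling mean, shifted up two rows so each value is centred on its window.
x.loc[:, "Mass32s"] = x["Mass32"].rolling(5).mean().shift(-2)

print(x)
```

The first two and last two rows of Mass32s are NaN, since a full five-point window is not available there.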
Aleksandar