I've checked around on the forums, but can't seem to find an answer to this:

I created some test code to add a column to a DataFrame based on whether 'Score' is greater than or equal to 9:

import pandas as pd
import numpy as np
df1 = pd.DataFrame.from_items([
    ('Score', [1, 9, 10]),
    ('Response', ['this is text', 'have some more text',
                  'how about one more?'])])

df1['1 or 2'] = np.where(df1['Score'] >= 9, '1', '0')
print(df1)

output:

   Score             Response    1 or 2
0      1         this is text      0
1      9  have some more text      1
2     10  how about one more?      1

This works perfectly.

However, when I run the same type of code on my real data set, which comes from a CSV, it does not work. I get strange output, with either the index or the added column on a different "row", plus an error message. Yet I don't get the error message with the test code.

Code:

df1 = pd.DataFrame.from_csv('C:PATH_HERE\\SOME_FILE.csv', encoding = 'ISO-8859-1')  # opens the file

df2 = df1.reset_index()  # Resets the columns to correct position

df3 = df2.dropna() # dropping the nulls
df3['1 or 2'] = np.where(df3['LIKELY_TO_RECOMMEND'] >= 9, '1', '0')

print(df3.head(n=5)) #displaying top 5 so fits on screen for Stack

output and error:

   LIKELY_TO_RECOMMEND                              VERB_REASON_FOR_SCORE  \
0                  0.0  Atendimento robotizado, nenhuma flexibilidade ...   
1                  0.0  The migration specialist, Lynette Throckmorton...   
2                  0.0           Lousy lying and no recommendable service   
3                  0.0  The new software is - not only, apparently, fu...   
4                  0.0  I feel that the portal for the garnishments is...   

  1 or 2  
0      0  
1      0  
2      0  
3      0  
4      0  

SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  df3['1 or 2'] = np.where(df3['LIKELY_TO_RECOMMEND'] >= 9, '1', '0')

Why do I get this 'error' with one piece of code and not the other?

MattR
  • That's not a separate row, it's just wrapping in the display because the data is too wide to fit on one line. The SettingWithCopyWarning is something you can find a gazillion questions about by searching this site or googling. – BrenBarn Dec 28 '16 at 21:34
  • @BrenBarn Regarding the separate row, that's what I believed. Regarding the error, I'm unsure why it's happening on this DataFrame and not the other 'test' one. I have searched for the issue and can't figure out why the error doesn't appear for both sets of code. – MattR Dec 28 '16 at 21:37
  • I can't reproduce the SettingWithCopyWarning. What pandas version are you using? – BrenBarn Dec 28 '16 at 21:43
  • version is 0.18.1. Using Conda 4.2.9 if that matters – MattR Dec 28 '16 at 21:46
  • @BrenBarn The error goes away when removing the `df.dropna()`. Any idea why this would happen? – MattR Dec 29 '16 at 14:50
  • I was wondering about that. It is likely that the `dropna` is returning a DataFrame whose data is somehow linked to the original. You could try doing `df3 = df2.copy().dropna()` and see if that fixes it. It is odd, though, because `.dropna()` should be returning a new DataFrame as long as you don't use `inplace=True`. – BrenBarn Dec 29 '16 at 19:29
  • @BrenBarn. That is exactly what I did. See my answer below. – MattR Dec 29 '16 at 20:05
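
On the wrapped display mentioned in the first comment, here is a minimal sketch of how the `display.width` option decides whether a printed frame is split into column blocks. The frame and its contents are made up for illustration and are not the data from the question.

```python
import pandas as pd

# Hypothetical wide frame standing in for the real CSV data.
df = pd.DataFrame({
    'LIKELY_TO_RECOMMEND': [0.0, 9.0],
    'VERB_REASON_FOR_SCORE': ['Lousy lying and no recommendable service',
                              'Quick answers and a friendly specialist'],
    '1 or 2': ['0', '1'],
}, columns=['LIKELY_TO_RECOMMEND', 'VERB_REASON_FOR_SCORE', '1 or 2'])

# Narrow width: the columns no longer fit on one line, so pandas prints
# them in blocks and ends each unfinished block line with a backslash.
with pd.option_context('display.width', 40, 'display.max_columns', None):
    wrapped = repr(df)

# Wide width: every column fits, so each row stays on a single line.
with pd.option_context('display.width', 200, 'display.max_columns', None):
    unwrapped = repr(df)

print(wrapped)
print()
print(unwrapped)
```

Either way the frame holds the same rows and columns. Only the printed layout changes.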

1 Answer

After some experimenting with the code, I found that the SettingWithCopyWarning 'error' was caused by using dropna().

With that call removed, the code ran without the warning. Following this post, I used the code below to get rid of the 'error' while still dropping the nulls in my DataFrame:

df3 = df2.dropna().copy()  
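
For completeness, here is a minimal end-to-end sketch of that fix. It assumes a small made-up frame in place of the real CSV (the values in `df2` are hypothetical). The explicit `.copy()` makes `df3` its own independent DataFrame, so assigning a new column to it cannot be mistaken for writing into a slice of `df2`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame loaded from the CSV.
df2 = pd.DataFrame({
    'LIKELY_TO_RECOMMEND': [0.0, 9.0, np.nan, 10.0],
    'VERB_REASON_FOR_SCORE': ['bad service', 'good service', 'n/a', None],
}, columns=['LIKELY_TO_RECOMMEND', 'VERB_REASON_FOR_SCORE'])

# Drop rows containing nulls, then take an explicit copy so that df3
# owns its data and is not tied to df2.
df3 = df2.dropna().copy()

# Adding the column now modifies df3 only.
df3['1 or 2'] = np.where(df3['LIKELY_TO_RECOMMEND'] >= 9, '1', '0')

print(df3)
```

In this sketch the two rows with a missing value are dropped, and `df2` is left without the new column.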
MattR