2

I am encountering a strange error: this code was working in earlier runs (a few hours ago), but now it isn't.

import numpy as np
import pandas as pd

df = pd.read_csv('nlp_monta.csv')
df['Text 2'] = pd.Series(map(lambda x: str(x).replace("^"," "), df['Text']))

i = 0
for row in df['Text 2']:
    df.iloc[i]['Text 2'] = set(row.split())    # This isn't giving unique words
    i = i + 1                                  # earlier it was

This is the warning I get, though the code still runs:

C:\Users\ishanna\AppData\Local\Continuum\anaconda3\lib\site-packages\ipykernel_launcher.py:2: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

Ishan
  • Refer to https://stackoverflow.com/questions/34962104/pandas-how-can-i-use-the-apply-function-for-a-single-column – Shijith May 08 '19 at 08:49
  • Please specify your question. Also, what does "earlier" mean in your case? – michcio1234 May 08 '19 at 08:54
  • @Shijith : Problem is at split() not at lambda function – Ishan May 08 '19 at 08:55
  • The warning you're seeing tells you that `df.iloc[i]['Text 2'] = set(row.split())` may not actually be modifying your `df`. `df.iloc[i]` may return a *copy* of the row, and the rest of the line then modifies that copy (instead of the original dataframe). – michcio1234 May 08 '19 at 08:55
  • @michcio1234 - "Earlier" means earlier runs of the code; I am editing the question to say so. – Ishan May 08 '19 at 08:56
  • @michcio1234 - Yeah, that seems to be the case. What should I use instead of iloc, then? – Ishan May 08 '19 at 08:58

1 Answer

0

As discussed in the comments, the problem seems to lie in df.iloc[i]['Text 2'] = set(row.split()).

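SettingWithCopyWarning tells you that df.iloc[i] may return a copy of the row rather than a view of your dataframe, so the rest of the line may be modifying that temporary copy instead of the original dataframe. Whether chained indexing yields a view or a copy can depend on pandas internals, which may be why the same code appeared to work in earlier runs.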
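If you want to keep an explicit loop, one way to avoid the chained assignment is to build all the values first and assign the whole column in a single step. This is only a minimal sketch: the small DataFrame below is made-up sample data standing in for nlp_monta.csv, and `word_sets` is a name introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample data; the real frame would come from nlp_monta.csv
df = pd.DataFrame({"Text 2": ["red blue red", "one two two one"]})

# Build every value first, then assign the whole column at once,
# so nothing is ever written to an intermediate row object (df.iloc[i]).
word_sets = [set(row.split()) for row in df["Text 2"]]
df["Text 2"] = word_sets

print(df["Text 2"].tolist())
```

Because the assignment goes through `df["Text 2"] = ...` directly, it is a single indexing operation on the original dataframe and should not trigger the warning.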

Iterating through rows is rarely a good idea. Instead, you can try another map (I didn't test it though):

df['Text 2'] = df['Text 2'].str.split().map(set)

For details, see the pandas documentation on the `.str` string accessor.
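To try the one-liner above without the original CSV, here is a small self-contained sketch. The sample strings are invented for illustration, and `unique_words` is a hypothetical name; the `"^"` replacement mirrors the step from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the 'Text' column of nlp_monta.csv
df = pd.DataFrame({"Text": ["a^b^a", "hello^world^hello", "x"]})

# Replace the "^" separator with spaces; regex=False treats "^" literally
df["Text 2"] = df["Text"].astype(str).str.replace("^", " ", regex=False)

# Split each string on whitespace, then turn each list of words into a set
df["Text 2"] = df["Text 2"].str.split().map(set)

unique_words = df["Text 2"].tolist()
print(unique_words)
```

Each cell of `Text 2` then holds a Python set, so repeated words within a row collapse to one entry.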

michcio1234
  • Thanks, this helped me a lot. I have upvoted your answer, but it won't show now as my reputation is less than 15. – Ishan May 15 '19 at 12:12
  • @Ishan I'm glad I could help! You can mark the answer as accepted if it solved your problem. – michcio1234 May 15 '19 at 13:26