I am new to pandas, and while doing an assignment for my course I encountered a warning that says:

e:\Python.py\coursera_data\1_week3_assign.py:10: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

What exactly does it mean? Do I have to change my code?

Code

import pandas as pd
import numpy as np
import re
csv=pd.read_excel('Energy Indicators.xls',skiprows=17,usecols=[2,3,4,5],skipfooter=1)
csv=csv.rename(columns={'Unnamed: 2':'Country','Petajoules':'Energy Supply','Gigajoules':'Energy Supply per Capita','%':'% Renewable'})
csv['Energy Supply'].replace(['...'],[np.nan],inplace=True)
for i in range(227):
    csv['Country'][i]=re.sub(pattern='\(.*?\)',repl='',string=csv['Country'][i])
    csv['Country'][i]=re.sub(pattern='\d',repl='',string=csv['Country'][i])
csv['Energy Supply']=csv['Energy Supply']*1000000
print(csv.head()) 

Complete Warning Message

e:\Python.py\coursera_data\1_week3_assign.py:8: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  csv['Country'][i]=re.sub(pattern='\(.*?\)',repl='',string=csv['Country'][i])
e:\Python.py\coursera_data\1_week3_assign.py:9: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  csv['Country'][i]=re.sub(pattern='\d',repl='',string=csv['Country'][i])

1 Answer

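SettingWithCopyWarning is raised when pandas cannot tell whether an assignment will reach the original DataFrame or only a temporary copy of part of it. If the assignment lands on a copy, the change never shows up in the original, which is why pandas warns you.

In your code, the warning comes from the chained indexing csv['Country'][i] = .... That is two separate operations: csv['Country'] selects the column first, and [i] = ... then assigns into whatever that selection returned. Even if the output looks right, pandas does not guarantee that the assignment reaches csv, so it is worth changing the code.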
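To make the two-step nature of chained indexing concrete, here is a minimal sketch. The frame df and its values are hypothetical stand-ins for your data, and the explicit .copy() is an assumption added only to mimic the case the warning is about, where the intermediate selection is a copy.

```python
import pandas as pd

# Hypothetical two-row frame standing in for the Energy Indicators data.
df = pd.DataFrame({"Country": ["Bolivia (Plurinational State of)", "Italy9"]})

# Chained indexing is two operations: first df["Country"], then [0] = ...
# Here the intermediate is made an explicit copy to mimic the risky case.
intermediate = df["Country"].copy()
intermediate[0] = "Bolivia"
unchanged = df.loc[0, "Country"]  # df still holds the original value

# A single .loc call addresses row and column together on df itself.
df.loc[0, "Country"] = "Bolivia"
changed = df.loc[0, "Country"]

print(unchanged)
print(changed)
```

The first assignment only touches the intermediate Series, while the .loc assignment updates df directly.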

Use the .loc accessor instead; it selects the row and the column in a single operation on the original DataFrame. Your loop would become:

for i in range(227):
    csv.loc[i, 'Country'] = re.sub(pattern=r'\(.*?\)', repl='', string=csv.loc[i, 'Country'])
    csv.loc[i, 'Country'] = re.sub(pattern=r'\d', repl='', string=csv.loc[i, 'Country'])

The r prefix makes the patterns raw strings, so the backslashes reach re.sub unchanged.
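As an alternative to the loop, the same cleanup can probably be done on the whole column at once with the .str.replace string method. This is only a sketch: the sample rows are made up, and the final .str.strip() is an extra step not in your original code, added on the assumption that you do not want the trailing space left behind when the parenthesised text is removed.

```python
import pandas as pd

# Hypothetical sample rows; the real data comes from 'Energy Indicators.xls'.
df = pd.DataFrame(
    {"Country": ["Bolivia (Plurinational State of)", "Italy9", "Spain"]}
)

# Remove parenthesised text, then digits, for every row in one pass each.
# Assigning the result back to df["Country"] replaces the whole column,
# so no chained indexing is involved.
df["Country"] = (
    df["Country"]
    .str.replace(r"\(.*?\)", "", regex=True)
    .str.replace(r"\d", "", regex=True)
    .str.strip()  # assumption: drop leftover surrounding whitespace
)

print(df["Country"].tolist())
```

This also avoids hard-coding the row count (227) that the loop depends on.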
Navkar Jain