I am getting a warning from the code below. It seems to relate both to the line where I insert a new column and to the loop. I have seen other posts about this warning, but I'm a complete novice and can't see what the issue is with my code. The warning specifically says "Try using .loc... instead", but that is what I'm already doing, so I don't know what the problem is.

I'm using PyCharm, and a snip of the dataframe before the column addition and loop is below.

Any help would be appreciated. Thanks.

[Image: dataframe snip]

    import numpy as np
    import pandas as pd
    
    gdp_data = pd.read_csv("GDP Hist.csv")
    
    # data has 2 entries per year for: total in millions & GDP per person
    # removing duplicates based on years to leave only total GDP
    gdp_data.drop_duplicates(subset=["LOCATION", "TIME"], inplace=True)
    
    
    # Create list of unneeded columns & remove
    unneeded_cols = ["INDICATOR", "SUBJECT", "MEASURE", "FREQUENCY", "Flag Codes"]
    gdp_data.drop(columns=unneeded_cols, axis=1, inplace=True)
    # print(gdp_data.info())
    
    # Subset for Ireland GDP
    gdp_ire = gdp_data[gdp_data['LOCATION'] == "IRL"]
    gdp_ire.set_index('TIME', inplace=True)
    gdp_ire['Annual%'] = np.nan       # insert blank column
            
    
    # loop through dataframe & calc annual % growth
    for i in gdp_ire.index:
        if i == 1970:
            gdp_ire.loc[i, 'Annual%'] = ""
        else:
            gdp_ire.loc[i, 'Annual%'] = (gdp_ire.loc[i, 'Value']-gdp_ire.loc[i-1, 'Value'])/gdp_ire.loc[i-1, 'Value']*100
    
    print(gdp_ire)
xceej
Does this answer your question? [How to deal with SettingWithCopyWarning in Pandas](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas) – Corralien Apr 22 '21 at 17:42

1 Answer

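The warning comes from this line:

    gdp_ire = gdp_data[gdp_data['LOCATION'] == "IRL"]

Here you select a portion of the full dataframe, and in the lines that follow (the Annual% column insert and the .loc assignments in the loop) you modify that subset. pandas cannot tell whether you also meant to change gdp_data, so it warns even though you are using .loc.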
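To make the ambiguity concrete, here is a minimal sketch using a made-up three-row frame (the values are invented for illustration and are not taken from the asker's CSV). It suggests that the write lands on the filtered object only and never reaches the original frame. The warning is silenced here purely to keep the demo output clean.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's CSV; the values are invented.
gdp_data = pd.DataFrame({
    "LOCATION": ["IRL", "IRL", "FRA"],
    "TIME": [1970, 1971, 1970],
    "Value": [100.0, 110.0, 200.0],
})

# Boolean indexing returns a new object derived from gdp_data.
gdp_ire = gdp_data[gdp_data["LOCATION"] == "IRL"]

with warnings.catch_warnings():
    # Suppress the warning only for this demonstration.
    warnings.simplefilter("ignore")
    gdp_ire["Annual%"] = np.nan

# The new column exists on the subset but not on the original frame.
print("Annual%" in gdp_ire.columns)   # True
print("Annual%" in gdp_data.columns)  # False
```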

One simple fix is to take an explicit copy, so that gdp_ire is an independent dataframe:

    gdp_ire = gdp_data[gdp_data['LOCATION'] == "IRL"].copy()
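As a rough end-to-end sketch, the snippet below takes an explicit copy of the filtered rows and then computes the annual growth. The data frame is a hypothetical stand-in with invented values. Swapping the asker's loop for `Series.pct_change()` is a suggestion added here, not part of the original answer. It assumes the rows are sorted by year with no gaps, so that each row's predecessor is the previous year.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data; the values are invented.
gdp_data = pd.DataFrame({
    "LOCATION": ["IRL", "IRL", "IRL", "FRA"],
    "TIME": [1970, 1971, 1972, 1970],
    "Value": [100.0, 110.0, 121.0, 200.0],
})

# .copy() makes gdp_ire an independent DataFrame, so later writes are unambiguous.
gdp_ire = gdp_data[gdp_data["LOCATION"] == "IRL"].copy()
gdp_ire = gdp_ire.set_index("TIME")

# pct_change() computes (current - previous) / previous for each row;
# the first row has no previous value, so it stays NaN.
gdp_ire["Annual%"] = gdp_ire["Value"].pct_change() * 100

print(gdp_ire)
```

Leaving the first year as NaN, rather than writing an empty string into it, keeps the column numeric.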
Sebastien D