0

I'm new to Python and have been displayed the 'SettingWithCopyWarning' I've tried to read and understand from other posts and websites, without any success. Would anybody be able to help me identify where I have chained assignment.

I think it is 'df_grouped = df1.groupby(["Menu Item"]).sum()'. If so how do I resolve the issue?

def get_max_item(startwk, endwk):
    df1 = pd.read_csv('data.csv')
    df_grouped = df1.groupby(['Item']).sum()
    df_date_extract = df_grouped.loc[:,startwk:endwk]
    df_date_extract.loc[:,'Row_Total'] = df_date_extract.sum(numeric_only=True, axis=1)
    df_max_item = df_date_extract[df_date_extract['Row_Total']==df_date_extract['Row_Total'].max()]
    return df_max_item.index[0]

I'm trying to group by item and then add a total to each row. Once I've done that I want to find the row with the largest total and return the item name as a string.

The data is structured....

Item WK1 ..... Wk 50
item1 34 ..... 45
item2 23 ..... 67
item1 12 ..... 5
item2 45 ..... 12

The error....

......:125: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy df_date_extract.loc[:,'Row_Total'] = df_date_extract.sum(numeric_only=True, axis=1)

Neil White
  • 111
  • 4
  • 18

2 Answers2

1

You can try this simplified version :

def get_max_item(startwk, endwk):
    totals = (
        pd.read_csv("data.csv", index_col=0)
            .loc[:, startwk:endwk]
            .stack() #this make a Series
            .groupby(level=0).sum()
    )
    return list(*totals.loc[[totals.idxmax()]].items())
Timeless
  • 22,580
  • 4
  • 12
  • 30
0

Solved my problem. Created a new data frame to hold grouped item totals and used idxmax() and max() to get the values I wanted. returned an array instead:

def get_max_item(startwk, endwk):
  df1 = pd.read_csv("data.csv")
  df_grouped = df1.groupby(["Item"]).sum()
  df_date_extract = df_grouped.loc[:,startwk:endwk]
  df_totals = df_date_extract.sum(numeric_only=True, axis=1)
  most_popular_item = df_totals.idxmax()
  most_popular_item_total = df_totals.max()
  return [most_popular_item,most_popular_item_total]
Neil White
  • 111
  • 4
  • 18