Pandas - Create new column - if another column value is in list (correct way)

Question

I've been struggling with making a new column stating weekend or not based on 'Day of Week' column. I am using the following code based off a previous Stack Overflow question.

weekday_classification = {
    'Weekday': ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'],
    'Weekend': ['Saturday', 'Sunday']
    }
# Invert the mapping so each day name points to its label
weekday_classification = {day: label for label, days in weekday_classification.items() for day in days}
df["Weekend"] = df['Day of Week'].map(weekday_classification)
df.head()

Although the above code produces the desired result, I get a warning that states:

ipython-input-21-e273917f31f9:6: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

What is a way to get around this? I have read the documentation on creating a new column, but it seems to cover only simpler cases.

I'm still just dipping my toes in the sand with Python and data analysis, so I'm happy to receive general feedback.

Does this answer your question? [How to deal with SettingWithCopyWarning in Pandas](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas) — Chris, Mar 01 '21 at 01:30
Does this answer your question? [How to check if a value is in the list in selection from pandas data frame?](https://stackoverflow.com/questions/18250298/how-to-check-if-a-value-is-in-the-list-in-selection-from-pandas-data-frame) — deadvoid, Mar 01 '21 at 01:30

score 1 · Accepted Answer · answered Mar 01 '21 at 02:21

Reverse your dictionary so that each day maps to its category, like this:

weekday_classification = {
                            'Monday': 'Weekday',
                            'Tuesday': 'Weekday',
                            'Wednesday': 'Weekday',
                            'Thursday': 'Weekday',
                            'Friday': 'Weekday',
                            'Saturday': 'Weekend',
                            'Sunday': 'Weekend'
                         }

Then construct a new DataFrame from that weekday_classification dict to join with your existing df:

In []: days = pd.DataFrame(data=weekday_classification.values(), index=weekday_classification.keys(), columns=['Weekday/end'])
       days
Out[]:
                Weekday/end
        Monday      Weekday
       Tuesday      Weekday
     Wednesday      Weekday
      Thursday      Weekday
        Friday      Weekday
      Saturday      Weekend
        Sunday      Weekend

In []: df.join(days, on='Day of Week')
Out[]:
        Day of Week     Weekday/end
    0        Monday         Weekday
    1       Tuesday         Weekday
    2     Wednesday         Weekday
    3      Thursday         Weekday
    4        Friday         Weekday
    5      Saturday         Weekend
    6        Sunday         Weekend
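
For reference, the join approach above can be sketched as a self-contained script. The sample df here is hypothetical and stands in for the asker's data. Note that join returns a new DataFrame instead of modifying df in place, so the result has to be assigned to a name.

```python
import pandas as pd

weekday_classification = {
    'Monday': 'Weekday',
    'Tuesday': 'Weekday',
    'Wednesday': 'Weekday',
    'Thursday': 'Weekday',
    'Friday': 'Weekday',
    'Saturday': 'Weekend',
    'Sunday': 'Weekend',
}

# Hypothetical sample data standing in for the asker's df
df = pd.DataFrame({'Day of Week': ['Monday', 'Saturday', 'Friday', 'Sunday']})

# One-column lookup table indexed by day name
days = pd.DataFrame(
    {'Weekday/end': list(weekday_classification.values())},
    index=list(weekday_classification.keys()),
)

# Match df's 'Day of Week' values against the index of days.
# join returns a new DataFrame and leaves df itself unchanged.
result = df.join(days, on='Day of Week')
print(result)
```

Because the column is added to a new object, this route does not assign into df at all, which is presumably why the warning did not appear for the asker.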

Wow, I need to research .join more than creating the column direct. Thank you so much for this. It worked like a charm. Appreciate the help! — withayk, Mar 01 '21 at 03:32

score 1 · Answer 2 · answered Mar 01 '21 at 02:40

The warning appears because your df is a subset of another DataFrame. You may have filtered on another DataFrame's column to generate this df, like:

df = df_p[df_p['some_col'].isin(some_set)]

Pandas may treat df as a reference to parts of df_p rather than as an independent DataFrame. In that situation df behaves like a slice of df_p, and modifying df triggers the warning because the change might also affect df_p (or might be silently lost). This is what the warning message describes. Make sure df owns its data when it is created, by filtering the other DataFrame like this:

df = df_p[df_p['some_col'].isin(some_set)].copy()

or use copy.deepcopy() for complicated data.
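
Putting this together with the question's map approach, a minimal sketch might look like the following. The contents of df_p, some_col and some_set are made up for illustration, since the asker's real data is not shown.

```python
import pandas as pd

# Hypothetical parent frame; the names mirror the snippets above
df_p = pd.DataFrame({
    'some_col': ['a', 'b', 'a', 'c'],
    'Day of Week': ['Monday', 'Saturday', 'Sunday', 'Friday'],
})
some_set = {'a', 'c'}

# .copy() gives df its own data, independent of df_p
df = df_p[df_p['some_col'].isin(some_set)].copy()

groups = {
    'Weekday': ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'],
    'Weekend': ['Saturday', 'Sunday'],
}
# Invert the mapping: day name -> label
day_to_label = {day: label for label, days in groups.items() for day in days}

# Plain column assignment on the independent copy
df['Weekend'] = df['Day of Week'].map(day_to_label)
print(df)
```

With the explicit copy, the original assignment from the question can stay as it is, and df_p is left untouched.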

Thank you so much for this insight. You are totally right - I had created a subset higher up in my notebook. I'll keep in mind to use copy() going forward. — withayk, Mar 01 '21 at 03:32
