0

I'm struggling to figure out a way iterate through a dataframe, and use an if statement to determine whether a day is a weekday or weekend (and replicate, using the holidays package to find holidays)

Test Data

|    Date    |col1 |col2 |col3 |----| week | <- Add
|------------|-----|-----|-----|----|------|
| 2019-10-27 | 1.5 | 2.4 | 1.9 |----| wknd |
| 2019-10-28 | 2.7 | 2.1 | 1.9 |----| wkdy |
| 2019-10-29 | 1.4 | 2.7 | 1.8 |----| wkdy |

My example df has a date column, and 3 columns of data. The actual dataset has around 5 years of data. I want to add some conditional that checks to see if that date is a weekend or weekday, and adds that as a new column. I have no issues iterating and checking the dates, but I can't find anything that allows me to append or write to the dataframe on a line by line basis.

I had something like this so far:

for i in df.date:
    if i.weekday() < 5:
        df['period2'] = ('wkdy')
    else:
        df['period2'] = ('wknd')

Obviously this doesn't meet my needs, as it just sets all columns in the DF to the last iterated value.

What is the best way to iterate through and apply this logic on a line by line basis? Am I missing something simple?

Mac
  • 123
  • 9

3 Answers3

2

You can use np.where

import numpy as np
df['period2'] = np.where(df['date'].dt.weekday < 5,'wkdy','wknd')

If you really want to use a for loop you should loop over the rows by doing

for i in range(df.shape[0]):
    ....
fmarm
  • 4,209
  • 1
  • 17
  • 29
  • Worked wonderfully after adding a column of the day count with ```python for i in range(len(df.date)): df['week'][i] = (df['date'][i]).weekday() ``` Thanks! – Mac Oct 29 '19 at 22:45
0

I'd try to use the loc function and apply the functions to entire columns at once so the operation is vectorized and much quicker to execute:

df['week'] = df['date'].weekday()
df['week'].loc[df['week'] < 5] = 'wkday'
df['week'].loc[df['week'] >= 5] = 'wknd'
Wavy
  • 188
  • 7
  • Binary choices should mentally shout `np.where()` in 99% of cases (and I'm saving that 1% for cases I can't think of but possibly exist). This approach takes 2 runs at the data. – roganjosh Oct 29 '19 at 22:26
  • 1
    @roganjosh I agree, I forgot about that one for a second. – Wavy Oct 29 '19 at 22:28
0

you can use apply. it return a series or a DataFrame,

df['week'] = df['date'].apply(lambda x: "wkdy" if x.weekday() < 5 else "wked")

there are some other function can do it, like df.map, df.aggreagte, because aggreagte if basic of apply.

libin
  • 420
  • 3
  • 7