Python Pandas sum a constant value in Columns If date between 2 dates

Question

Let's assume a dataframe using datetimes as index, where we have a column named 'Score', initialy set to 10:

            score
2016-01-01  10
2016-01-02  10
2016-01-03  10
2016-01-04  10
2016-01-05  10
2016-01-06  10
2016-01-07  10
2016-01-08  10

I want to substract a fixed value (let's say 1) from the score, but only when the index is between certain dates (for example between the 3rd and the 6th):

            score
2016-01-01  10
2016-01-02  10
2016-01-03  9
2016-01-04  9
2016-01-05  9
2016-01-06  9
2016-01-07  10
2016-01-08  10

Since my real dataframe is big, and I will be doing this for different dateranges and different fixed values N for each one of them, I'd like to achieve this without requiring to create a new column set to -N for each case.

Something like numpy's where function, but for a certain range, and allowing me to sum/substract to current value if the condition is met, and do nothing otherwise. Is there something like that?

check this answer https://stackoverflow.com/questions/29370057/select-dataframe-rows-between-two-dates and also check https://pandas.pydata.org/pandas-docs/version/0.20.3/generated/pandas.DataFrame.between_time.html — Yuca, Aug 21 '19 at 12:19
@Yuca the problem is not getting the range, but editing the dataframe only within that range — Roman Rdgz, Aug 21 '19 at 12:28
there's no distinction in what you're saying. Editing the dataframe requires getting the range, so what's the problem then? — Yuca, Aug 21 '19 at 12:34

score 5 · Accepted Answer · answered Aug 21 '19 at 12:36

5

Use index slicing:

df.loc['2016-01-03':'2016-01-06', 'score'] -= 1

answered Aug 21 '19 at 12:36

Code Different

90,614
16
144
163

manwithfewneeds · Answer 2 · 2019-08-21T12:39:14.177

2

Assuming dates are datetime dtype:

#if date is own column:    
df.loc[df['date'].dt.day.between(3,6), 'score'] = df['score'] - 1

#if date is index:    
df.loc[df.index.day.isin(range(3,7)), 'score'] = df['score'] - 1

edited Aug 21 '19 at 12:39

answered Aug 21 '19 at 12:33

manwithfewneeds

1,137
1
7
10

Nathan Furnal · Answer 3 · 2019-08-21T13:02:45.730

I would do something like that using query :

import pandas as pd

df = pd.DataFrame({"score":pd.np.random.randint(1,10,100)}, 
    index=pd.date_range(start="2018-01-01", periods=100))

start = "2018-01-05"
stop = "2018-04-08"

df.query('@start <= index <= @stop ') - 1

Edit : note that something using eval which goes to boolean, can be used but in a different manner because pandas where acts on the False values.

df.where(~df.eval('@start <= index <=  @stop '), 
         df['score'] - 1, axis=0, inplace=True)

See how I inverted the comparison operators (with ~), in order to get what I wanted. It's efficient but not really clear. Of course, you can also use pd.np.where and all is good in the world.

Python Pandas sum a constant value in Columns If date between 2 dates

3 Answers3