2

I have a pd like below:

"Date" is the index, not column

Date Value
2020-01-01 1
2020-01-02 2
2020-01-03 3
2020-01-04 4
2020-01-05 5

and I want to do something like if date > 2020-01-02 and date < 2020-01-05

then Value = 0

Date Value
2020-01-01 1
2020-01-02 2
2020-01-03 0
2020-01-04 0
2020-01-05 5

.loc can only trim the whole data set and I don't want that. May I know what can I do to achieve this? Thanks

  • What is pd and what is .loc? What have you done so far? – cup Sep 13 '21 at 05:53
  • 1
    Does this answer your question? [Select DataFrame rows between two dates](https://stackoverflow.com/questions/29370057/select-dataframe-rows-between-two-dates) – Tzane Sep 13 '21 at 05:58

2 Answers2

1

In case it could be useful, here's an option using df.loc:

df['Date'] = pd.to_datetime(df['Date'])
df['Value'].loc[(df['Date'] > "2020-01-02") & (df['Date'] < "2020-01-05")] = 0
marsolmos
  • 748
  • 9
  • 24
1

If you want if...else Try this:

import pandas as pd

df = pd.DataFrame({
    'Date': ['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04', '2020-01-05'],
    'Value': [1,2,3,4,5]})

df['Value'] = [
    0 if (x > '2020-01-02') & (x <= '2020-01-04') else  df['Value'].iloc[idx]
    for idx, x in enumerate(df['Date'])
]
print(df)

Output:

         Date    Value
0  2020-01-01      1
1  2020-01-02      2
2  2020-01-03      0
3  2020-01-04      0
4  2020-01-05      5

Or you can use .loc like below:

import pandas as pd

df = pd.DataFrame({
    'Date': ['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04', '2020-01-05'],
    'Value': [1,2,3,4,5]})

df['Date'] = pd.to_datetime(df['Date'])  
df.loc[(df['Date'] > '2020-01-02') & (df['Date'] <= '2020-01-04'), 'Value'] = 0

Edit Base On Your Comment (if Date is index):

import pandas as pd

df = pd.DataFrame({
    'Date': ['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04', '2020-01-05'],
    'Value': [1,2,3,4,5]})

df.set_index('Date', inplace=True)
df

Output:

enter image description here

For the above DataFrame you can use df.reset_index() and like below:

df = df.reset_index()
df['Date'] = pd.to_datetime(df['Date'])  
df.loc[(df['Date'] > '2020-01-02') & (df['Date'] <= '2020-01-04'), 'Value'] = 0

With if...else:

df = df.reset_index()
df['Value'] = [
    0 if (x > '2020-01-02') & (x <= '2020-01-04') else  df['Value'].iloc[idx]
    for idx, x in enumerate(df['Date'])
]
I'mahdi
  • 23,382
  • 5
  • 22
  • 30
  • Hi, sorry the "Date" is the index, not the column. so I think I cannot use the method as you. – user3239756 Sep 13 '21 at 14:37
  • @user3239756 edited answer , for this problem you can use `df.reset_index()` – I'mahdi Sep 13 '21 at 15:55
  • Thanks I will try to use reset_index(), however, can I use if statement with it? I actually want to do a series of functions/validation more than just put a value. – user3239756 Sep 13 '21 at 18:38
  • @user3239756 edited answer again and add part `if..else` in the end of answer – I'mahdi Sep 14 '21 at 02:49
  • Thanks, I got this error : '>' not supported between instances of 'Timestamp' and 'str' but I think the reset index is the first start – user3239756 Sep 16 '21 at 19:14
  • @user3239756 if you write :`for idx, x in enumerate(df['Date']) : if (x > '2020-01-02')` don't write `df['Date'] = pd.to_datetime(df['Date'])` because in `if` you check `string` with `df['Date']` and you need data in `df['Date']` will be `string` – I'mahdi Sep 17 '21 at 02:43