How to choose date interval from pandas dataframe column?

Question

How can I create a new dataframe based on a date interval from an existing dataframe like this one:

import pandas as pd
df = pd.DataFrame([["01.01.18",32],
    ["01.01.18",8],
    ["01.01.18",12],
    ["01.01.18",77],
    ["02.01.18",99],
    ["03.01.18",78],
    ["04.01.18",89],
    ["02.02.18",85],
    ["10.03.18",35],
    ["13.04.18",81],
    ["03.02.18",97],
    ["29.03.18",90],
    ["08.04.18",7]],columns=["date","payment"])

How do I create a dataframe with date values between 01.01.18 and 31.01.18, so the new df would look like:

date      payment
01.01.18  32
01.01.18  8
01.01.18  12
01.01.18  77
02.01.18  99
03.01.18  78
04.01.18  89

What would you want as `payment` for days which are not in your existing dataframe? It might help if you can show a sample of your expected output. – jpp, May 30 '18 at 17:03
You can look at the marked duplicate. Note that `.ix` has been deprecated, so use `.loc`. – jpp, May 30 '18 at 17:09
@jpp Well, every single answer in that question has an `.ix` solution (quite confusing, since it was deprecated). Maybe it's worth updating that question or creating another one, like this. – user40, May 30 '18 at 17:15
@user40, when I have some time, I'll go through them and stick a banner on top :). But for now, when I mark something as a duplicate I always add a comment so users don't get confused. I've also added another recent duplicate. – jpp, May 30 '18 at 17:16

harvpan · Answer 1 · 2018-05-30T17:35:37.737

0

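You need to parse the dates with an explicit day-first format and sort the index before slicing. Without the format, `pd.to_datetime()` may read `02.01.18` as 1 February, and slicing by date strings is only reliable on a sorted index:

df.set_index(pd.to_datetime(df['date'], format='%d.%m.%y')).sort_index(kind='mergesort').loc['2018-01-01':'2018-01-31'].reset_index(drop=True)

Output:

       date  payment
0  01.01.18       32
1  01.01.18        8
2  01.01.18       12
3  01.01.18       77
4  02.01.18       99
5  03.01.18       78
6  04.01.18       89

You can also keep using your original date format for the bounds with the code below and achieve the same output:

import datetime as dt
start = dt.datetime.strptime('01.01.18', '%d.%m.%y').strftime('%Y-%m-%d')
end = dt.datetime.strptime('31.01.18', '%d.%m.%y').strftime('%Y-%m-%d')
df.set_index(pd.to_datetime(df['date'], format='%d.%m.%y')).sort_index(kind='mergesort').loc[start:end].reset_index(drop=True)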
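
As an alternative to indexing by date, the same interval can be selected with a boolean mask, which does not depend on the rows being sorted. This is only a minimal sketch using the sample data from the question; the names `parsed`, `mask` and `january` are illustrative, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    [["01.01.18", 32], ["01.01.18", 8], ["01.01.18", 12], ["01.01.18", 77],
     ["02.01.18", 99], ["03.01.18", 78], ["04.01.18", 89], ["02.02.18", 85],
     ["10.03.18", 35], ["13.04.18", 81], ["03.02.18", 97], ["29.03.18", 90],
     ["08.04.18", 7]],
    columns=["date", "payment"],
)

# Parse with an explicit day-first format so "02.01.18" means 2 January.
parsed = pd.to_datetime(df["date"], format="%d.%m.%y")

# Boolean mask over the rows; between() includes both endpoints by default.
mask = parsed.between(pd.Timestamp("2018-01-01"), pd.Timestamp("2018-01-31"))

# Keep the matching rows in their original order and renumber them.
january = df.loc[mask].reset_index(drop=True)
print(january)
```

The `date` column keeps its original string form here, because the parsed dates are used only to build the mask.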

edited May 30 '18 at 17:35

answered May 30 '18 at 17:07


Thanks. Why doesn't the output have values for the 02.01.18, 03.01.18 and 04.01.18 dates? – user40 May 30 '18 at 17:10
@user40 that's because of `pd.to_datetime()` – harvpan May 30 '18 at 17:12
So how do I include those values as well? They are within the range '2018-01-01':'2018-01-31' – user40 May 30 '18 at 17:17
@user40, good point. See the edit. – harvpan May 30 '18 at 17:23
What is `dt`? Thanks – user40 May 30 '18 at 17:34
It's the alias for the `datetime` library. I have included the `import` statement. I hope that clears up your doubt. – harvpan May 30 '18 at 17:37
I get a `KeyError: 'date'` error – user40 May 30 '18 at 17:44
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/172100/discussion-between-harv-ipan-and-user40). – harvpan May 30 '18 at 17:46
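
The question in the comments about the missing 02.01.18, 03.01.18 and 04.01.18 rows likely comes down to whether a string such as "02.01.18" is read day-first or month-first. A small sketch that makes the difference visible by passing both formats explicitly (the variable names are illustrative):

```python
import pandas as pd

# Day-first: day 02, month 01 -> 2 January 2018.
day_first = pd.to_datetime("02.01.18", format="%d.%m.%y")

# Month-first: month 02, day 01 -> 1 February 2018, outside the January range.
month_first = pd.to_datetime("02.01.18", format="%m.%d.%y")

print(day_first.date())    # 2018-01-02
print(month_first.date())  # 2018-02-01
```

Passing `format` explicitly avoids relying on how a given pandas version guesses the order.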
