
Is there an easy way to keep only a given interval of dates, stored in YYYY-MM-DD format, in a data frame? For example, keeping only the dates from 2005 through 2014 and removing the rest from the data frame.

Is there a way to integrate regex into this, or is that too hard?

Corralien
Kevin
    Does this answer your question? [Filtering Pandas DataFrames on dates](https://stackoverflow.com/questions/22898824/filtering-pandas-dataframes-on-dates) – shoaib30 Jul 27 '21 at 07:05
  • Dates have no format, they're binary values. Formats apply only when formatting dates to strings or parsing strings into dates. If you load strings instead of dates, you need to convert them to dates before filtering the data. Once you do that, filtering is as easy as with any other kind of data – Panagiotis Kanavos Jul 27 '21 at 07:06

2 Answers


Yes, it's possible. If the column has a datetime dtype, you can filter on the year:

>>> df
         date
0  2000-12-31
1  2001-12-31
2  2002-12-31
3  2003-12-31
4  2004-12-31
5  2005-12-31
6  2006-12-31
7  2007-12-31
8  2008-12-31
9  2009-12-31
10 2010-12-31
11 2011-12-31
12 2012-12-31
13 2013-12-31
14 2014-12-31
15 2015-12-31
16 2016-12-31
17 2017-12-31
18 2018-12-31
19 2019-12-31
20 2020-12-31
>>> df[(df['date'].dt.year >= 2005) & (df['date'].dt.year <= 2014)]
         date
5  2005-12-31
6  2006-12-31
7  2007-12-31
8  2008-12-31
9  2009-12-31
10 2010-12-31
11 2011-12-31
12 2012-12-31
13 2013-12-31
14 2014-12-31

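Or compare against date strings. Spell out the full bounds: a bare year such as '2014' is treated as the start of that year, so it would drop 2014-12-31, and a lower bound of '2004' would let 2004-12-31 through:

>>> df[(df['date'] >= '2005-01-01') & (df['date'] <= '2014-12-31')]

Or:

>>> df[df['date'].between('2005-01-01', '2014-12-31')]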
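If the column was loaded as strings, the `.dt` accessor used in the first snippet is not available until the values are parsed. A minimal sketch of that conversion step, assuming a column named `date` that holds `YYYY-MM-DD` strings (the sample frame here is made up for illustration):

```python
import pandas as pd

# Hypothetical sample data: one date per year, stored as plain strings.
df = pd.DataFrame({"date": [f"{year}-12-31" for year in range(2000, 2021)]})

# Parse the strings into datetime64 values; the explicit format avoids guessing.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")

# Keep rows from 2005-01-01 through 2014-12-31; between() includes both bounds by default.
start = pd.Timestamp("2005-01-01")
end = pd.Timestamp("2014-12-31")
filtered = df[df["date"].between(start, end)]
```

Using explicit `pd.Timestamp` bounds makes it clear that the comparison happens on dates rather than on text.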
Corralien

You can use the between method (Series.between), which works on dates as well as on numbers and strings. See the following example:

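from datetime import datetime
import pandas as pd

# Column A holds datetime values; column B holds the same dates as strings.
df = pd.DataFrame({
    "A": [datetime(2020, 1, 1), datetime(2019, 1, 1), datetime(2018, 1, 1)],
    "B": ["2020-01-01", "2019-01-01", "2018-01-01"],
})
# String comparison works here because YYYY-MM-DD sorts in chronological order.
df[df['B'].between('2018-06-01', '2021-01-01')]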
           A           B
0 2020-01-01  2020-01-01
1 2019-01-01  2019-01-01

df[df['A'].between(datetime(2018,6,1), datetime(2021,1,1))]
           A           B
0 2020-01-01  2020-01-01
1 2019-01-01  2019-01-01
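On the regex part of the question: if the column stays as strings, a pattern on the year prefix can also select the range, although comparing parsed dates is usually easier to read and to adjust. A rough sketch, assuming a string column named `date` and made-up sample values:

```python
import pandas as pd

# Hypothetical sample data with dates stored as YYYY-MM-DD strings.
df = pd.DataFrame(
    {"date": ["2004-12-31", "2005-01-01", "2010-06-15", "2014-12-31", "2015-01-01"]}
)

# str.match anchors at the start of each string; the pattern accepts years 2005-2014.
year_in_range = df["date"].str.match(r"(?:200[5-9]|201[0-4])-\d{2}-\d{2}$")
filtered = df[year_in_range]
```

The pattern has to be rewritten whenever the year range changes, which is the main reason the date-based filters are usually preferred.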
Tom Ron