
I have a dataframe like the following,

+-----------+-------+----------+
| Date      | OPP   | Result   |
+-----------+-------+----------+
| Sat 11/16 | @DAL  | L110-102 |
| Wed 11/13 | @POR  | W114-106 |
| Mon 11/11 | @LAC  | L98-88   |
| Sun 11/10 | @LAL  | W113-104 |
| Fri 11/8  | @NO   | W122-104 |
| Wed 11/6  | vsSAC | W124-120 |
| Sat 11/2  | @MIL  | L115-105 |
+-----------+-------+----------+

I am trying to filter for dates > _____ .

I.e. here is what I have tried, but it does not return all the dates greater than Sun 11/10:

d1 = d1[(d1['Date'] > 'Sun 11/10')]

Update

My column now looks like this. I need to be able to filter for New_Date > _____ and exclude "NaT". I am trying d1[(d1['New_Date'] > '2019-11-01')] but it's not working.

0    2019-11-20
1    2019-11-18
2    2019-11-16
3    2019-11-13
4    2019-11-11
5    2019-11-10
6    2019-11-08
7    2019-11-06
8    2019-11-02
9    2019-11-01
10   2019-10-30
11   2019-10-28
12   2019-10-26
13   2019-10-01
14          NaT
15          NaT
16   2019-10-18
17   2019-10-13
18   2019-10-10
19   2019-10-08
20          NaT
21          NaT

Any help would be appreciated.

excelguy
  • try converting to datetime first – ansev Nov 17 '19 at 19:05
  • Please don't post your dataframe as images. Post it as a dictionary so it's easy for others to copy and test. You can do `df.to_dict` to get a dictionary of the data frame. – Poojan Nov 17 '19 at 19:05
  • Does this answer your question? [Filtering Pandas DataFrames on dates](https://stackoverflow.com/questions/22898824/filtering-pandas-dataframes-on-dates) – gosuto Nov 17 '19 at 19:55
  • slightly different based on the date format. Will need to convert the dates somehow. – excelguy Nov 17 '19 at 20:00
  • You can parse dates using pandas `to_datetime` or the datetime library. This website http://strftime.org should be helpful in figuring out which format to use – Zephyrus Nov 17 '19 at 20:22
  • With my answer you could change the boolean to: newdf[(newdf['Date'] > '2019-11-10') & (newdf['Date'] != 'NaT')] If you want, I can come up with a solution that drops those before the search, or create an apply map that calls a search function to drop them – oppressionslayer Nov 21 '19 at 04:00

3 Answers


First you need to convert your date into a proper datetime object, providing the input format (which I assumed is <weekday> <month>/<day>; you can tweak it as per the datetime documentation: https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior). You can do it in the following way:

df["Date_2"]=pd.to_datetime(df["Date"].astype(str), format="%a %m/%d")

Then apply the analogous transformation to the filter criterion, using Python's datetime.strptime() method:

from datetime import datetime

df.loc[df["Date_2"]>datetime.strptime("Thu 11/12", "%a %m/%d")]

So the whole example:

import pandas as pd
from datetime import datetime

df=pd.DataFrame({"Date": ["Sat 11/16", "Fri 11/8", "Wed 11/13"], "x": [4,3,7]})


df["Date_2"]=pd.to_datetime(df["Date"].astype(str), format="%a %m/%d")
print("\nExample filtered: ")
print(df.loc[df["Date_2"]>datetime.strptime("Thu 11/12", "%a %m/%d")])
print("\nThe whole thing:")
print(df)

And output:

Example filtered:
        Date  x     Date_2
0  Sat 11/16  4 1900-11-16
2  Wed 11/13  7 1900-11-13

The whole thing:
        Date  x     Date_2
0  Sat 11/16  4 1900-11-16
1   Fri 11/8  3 1900-11-08
2  Wed 11/13  7 1900-11-13

(Since no year is provided it's assumed to be 1900)
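If you need the parsed dates to line up with the real calendar (like the 2019-11-… values in the update), one option is to move the placeholder year onto the actual one. A minimal sketch, assuming every game is from 2019:

# DateOffset with a singular keyword replaces that component, so this sets
# the year of every parsed date to 2019 instead of the 1900 placeholder.
df["Date_2"] = df["Date_2"] + pd.offsets.DateOffset(year=2019)

# The filter can then use a plain date string:
print(df.loc[df["Date_2"] > "2019-11-12"])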

Grzegorz Skibinski
  • Hi, I've updated my question. I was able to convert my date to a yyyy-mm-dd format; now I need to filter for > _____ . – excelguy Nov 21 '19 at 03:09

Are you open to changing the format to another one, or to having a separate column with a date that is searchable by your criteria? I wrote this solution showing that route; maybe it will help you with ideas, and I'm open to suggestions for changes:

newdf = pd.read_csv('testdata2.csv', parse_dates=["Date"],
                    date_parser=lambda x: pd.to_datetime(x, format="%a %m/%d"),
                    index_col="Date")

newdf = newdf.reset_index()   
newdf['Date'] = newdf['Date'].mask(newdf['Date'].dt.year == 1900, newdf['Date'] + pd.offsets.DateOffset(year=2019))

newdf[(newdf['Date'] > '2019-11-10')]
#Out[63]: 
#        Date   OPP    Result
#0 2019-11-16  @DAL  L110-102
#1 2019-11-13  @POR  W114-106
#2 2019-11-11  @LAC    L98-88
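
If the parsed column also contains NaT values (as in the updated question), the same filter can be combined with notna() to drop them explicitly. A small sketch on top of the frame built above:

# NaT rows are False under notna(), so they are excluded from the result.
newdf[(newdf['Date'] > '2019-11-10') & (newdf['Date'].notna())]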
oppressionslayer

From what I understand, are you trying to filter dates just to remove NaTs? Then that's not how you do it. In fact, pandas has several functions that can check for NaTs, e.g. pandas.isnull(), pandas.notna(), pandas.DataFrame.notna(), etc.

Here's a quick example of how to use them:

    >>> import numpy as np
    >>> import pandas as pd
    >>> pd.isnull(np.datetime64('NaT'))
    True
    >>> d1[d1['New_Date'].notna()]
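
To also apply the date condition from the updated question, the NaT check can be combined with the comparison in a single boolean mask (a sketch, assuming New_Date has already been converted to a datetime64 column):

    >>> # rows where New_Date is NaT or on/before 2019-11-01 are dropped
    >>> d1[d1['New_Date'].notna() & (d1['New_Date'] > '2019-11-01')]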
exan