I prefer a pandas-only approach that lets you use all the features of read_csv(). This approach envisions a situation where you may need to filter on different dates at different times, so it is worth a little overhead to create a date registry that can be saved to disk and re-used.
First, create a registry holding just the date data for your csv:
my_date_registry = pd.read_csv('data.csv', usecols=['Date'], engine='c')
(Note: in newer versions of pandas, you can use engine='pyarrow', which will be faster.)
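As a minimal sketch of the registry idea, the snippet below writes a small hypothetical data.csv (the column names and contents are assumptions for illustration), builds the registry, and persists it so later filtering runs can reload it instead of rescanning the full file:

```python
import pandas as pd

# Hypothetical sample file standing in for your real data.csv.
with open("data.csv", "w") as f:
    f.write(
        "Date,Value\n"
        "2017-03-08,1\n"
        "2017-03-09,2\n"
        "2017-03-09,3\n"
        "2017-03-10,4\n"
    )

# Build the registry once: one row per data row, Date column only.
my_date_registry = pd.read_csv("data.csv", usecols=["Date"], engine="c")

# Persist it; later runs can reload this instead of rescanning data.csv.
my_date_registry.to_csv("date_registry.csv", index=False)
my_date_registry = pd.read_csv("date_registry.csv")
```

The registry's integer index lines up with the data rows of the CSV, which is what makes the skiprows tricks below work.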
There are two ways of using this registry with the skiprows parameter to filter out the rows you don't want. You may wish to experiment to see which one is faster for your specific data.
Option 1: Build a list of integer indexes
filter_date = '2017-03-09'
my_rows = my_date_registry['Date'] == filter_date
skip_rows = ~my_rows
my_skip_indexes = my_date_registry[skip_rows].index
my_skip_list = [x + 1 for x in my_skip_indexes]
my_selected_rows = pd.read_csv('data.csv', engine='c', skiprows=my_skip_list)
N.B. Since your data has a header row, you must add 1 to every index in my_skip_indexes to make up for the header row.
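Putting Option 1 together, here is a self-contained sketch (the sample data.csv and its columns are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sample file standing in for your real data.csv.
with open("data.csv", "w") as f:
    f.write("Date,Value\n2017-03-08,1\n2017-03-09,2\n2017-03-09,3\n2017-03-10,4\n")

my_date_registry = pd.read_csv("data.csv", usecols=["Date"], engine="c")

filter_date = "2017-03-09"
my_rows = my_date_registry["Date"] == filter_date   # True for rows to keep
skip_rows = ~my_rows                                # True for rows to drop
my_skip_indexes = my_date_registry[skip_rows].index
# +1 shifts each zero-based data index past the header line of the file.
my_skip_list = [x + 1 for x in my_skip_indexes]
my_selected_rows = pd.read_csv("data.csv", engine="c", skiprows=my_skip_list)
```

Here my_selected_rows contains only the two 2017-03-09 rows, and pandas never materializes the rest of the file.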
Option 2: Create a Callable function
filter_date = '2017-03-09'
my_rows = my_date_registry[my_date_registry['Date'] == filter_date]
my_row_indexes = my_rows.index
my_row_indexes = set([0] + [x + 1 for x in my_row_indexes])
my_filter = lambda x: x not in my_row_indexes
my_selected_rows = pd.read_csv('data.csv', engine='c', skiprows=my_filter)
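And the same for Option 2 as a runnable sketch (again, the sample data.csv is an assumption). The callable is invoked once per file line number and returns True for lines to skip, so keeping line 0 preserves the header:

```python
import pandas as pd

# Hypothetical sample file standing in for your real data.csv.
with open("data.csv", "w") as f:
    f.write("Date,Value\n2017-03-08,1\n2017-03-09,2\n2017-03-09,3\n2017-03-10,4\n")

my_date_registry = pd.read_csv("data.csv", usecols=["Date"], engine="c")

filter_date = "2017-03-09"
my_rows = my_date_registry[my_date_registry["Date"] == filter_date]
# Keep file line 0 (the header) plus each matching data row shifted past it.
my_row_indexes = set([0] + [x + 1 for x in my_rows.index])
my_filter = lambda x: x not in my_row_indexes   # skip everything not kept
my_selected_rows = pd.read_csv("data.csv", engine="c", skiprows=my_filter)
```

A set is used for my_row_indexes because the callable does a membership test for every line in the file, and set lookups are O(1).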