I am extracting a data frame in pandas and want to only extract rows where the date is after a variable.
I can do this in multiple steps but would like to know if it is possible to apply all logic in one call for best practice.
Here is my code
import pandas as pd
self.min_date = "2020-05-01"
#Extract DF from URL
self.df = pd.read_html("https://webgate.ec.europa.eu/rasff-window/portal/index.cfm?event=notificationsList")[0]
#Here is where the error lies, I want to extract the columns ["Subject","Reference","Date of case"] but where the date is after min_date.
self.df = self.df.loc[["Date of case" < self.min_date], ["Subject","Reference","Date of case"]]
return(self.df)
I keep getting the error: "IndexError: Boolean index has wrong length: 1 instead of 100"
I cannot find the solution online because every answer is too specific to the scenario of the person that asked the question.
e.g. this solution only works for if you are calling one column: How to select rows from a DataFrame based on column values?
I appreciate any help.