I have a dataframe similar to the following one:
import pandas as pd
df = pd.DataFrame({'Text': [
    'Hello I would like to get only the date which is 12-13 December 2018 amid this text.',
    'Ciao, what I would like to do is to keep dates, e.g. 11-14 October 2019, and remove all the rest.',
    'Hi, SO can you help me delete everything but 10 January 2011. I found it hard doing it myself.',
]})
I would like to extract only the dates from the text. The problem is that the surrounding text follows no consistent pattern. The only rule I can find is: keep the two or three items that come right before a four-digit number (i.e. the year), together with the year itself.
I have tried many convoluted solutions, but none of them gives me what I need.
The result should look like this:
["12-13 December 2018",
"11-14 October 2019",
"10 January 2011"]
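For reference, the rule above could be sketched with a regular expression. This is only a rough attempt, and it assumes every date is written as a day (or a day range such as 12-13), followed by an English month name and a four-digit year. The names `pattern` and `Date` are my own and purely illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Text': [
    'Hello I would like to get only the date which is 12-13 December 2018 amid this text.',
    'Ciao, what I would like to do is to keep dates, e.g. 11-14 October 2019, and remove all the rest.',
    'Hi, SO can you help me delete everything but 10 January 2011. I found it hard doing it myself.',
]})

# Assumed date shape: day or day range, a month word, then a four-digit year.
pattern = r'\b(\d{1,2}(?:-\d{1,2})?\s+[A-Za-z]+\s+\d{4})\b'

# str.extract returns the first match of the capture group per row (NaN if none).
df['Date'] = df['Text'].str.extract(pattern, expand=False)

print(df['Date'].tolist())
```

This covers the three sample rows, but I am not sure it holds up for other formats (abbreviated months, different separators, several dates in one row).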
Can anyone help me?
Thanks!