I'm new to Python but have to write a regex that picks up dates in dd-mm-yyyy format from text. I wrote something like this:
format1 = re.findall('[0-2][0-9]-02-(\d){4}|(([0-2][0-9]|30)-(04|06|09|11)-(\d){4})|(([0-2][0-9]|30|31)-(01|03|05|07|08|10|12)-(\d){4})',article)
It also checks whether the date is valid (for example, no 31st in a 30-day month). I tested it at pythex.org. It returns the right dates, but unfortunately also some empty matches and random numbers:
Match 1
1. None
2. None
3. None
4. None
5. None
6. 21-10-2005
7. 21
8. 10
9. 5
Match 2
1. None
2. None
3. None
4. None
5. None
6. 31-12-1993
7. 31
8. 12
9. 3
How can I improve the regex to return only dates or drop everything that isn't a date?
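For reference, the extra entries in the output above are the capture groups: every `(...)` in the pattern is reported separately, groups in alternatives that did not take part show up as None, and a repeated group such as `(\d){4}` keeps only its last repetition (hence the lone 5 and 3). A minimal sketch of one possible fix, assuming the same day and month rules as the original pattern, is to make every group non-capturing with `(?:...)` so that `re.findall` returns only the whole match. The `article` string here is a made-up sample, not the real input.

```python
import re

# Same day/month rules as the pattern in the question, but every group is
# non-capturing (?:...), so re.findall returns each whole match as a string.
DATE_RE = re.compile(
    r"(?:[0-2][0-9]-02"                                # February: days 00-29
    r"|(?:[0-2][0-9]|30)-(?:04|06|09|11)"              # 30-day months
    r"|(?:[0-2][0-9]|3[01])-(?:01|03|05|07|08|10|12))" # 31-day months
    r"-\d{4}"                                          # four-digit year
)

# Hypothetical sample text standing in for `article`.
article = "Published 21-10-2005, updated 31-12-1993, ignore 31-04-2001."

dates = DATE_RE.findall(article)
print(dates)  # ['21-10-2005', '31-12-1993']
```

Like the original pattern, this sketch still accepts day 00 and 29-02 in any year. If stricter validation is needed, one option is to pass each candidate to `datetime.datetime.strptime(candidate, "%d-%m-%Y")` and discard those that raise `ValueError`.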