I am working on a large batch of text strings, trying to match dates and convert them to MM-DD-YYYY format using the strptime() function.
However, some 5-digit serial numbers appear in the texts (e.g., 90481), and they have misled my .findall() call into treating them as dates. How can I avoid them by including a ^() type of condition to exclude them?
What they have in common is that they are all 5 digits long, so I have tried ^(?!\d{5}), but it didn't turn out well. What's the best way to tackle this set of numbers?
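To illustrate the problem, here is a minimal sketch. The sample text and the bare `\d{4}` year pattern are simplified stand-ins (assumptions for illustration), not my real data or my full pattern.

```python
import re

# Hypothetical sample text containing a 5-digit serial number and two real dates
text = "Serial 90481 was logged on 5/10/09 and again in 2001."

# Simplified stand-in for the "year only" part of the pattern
naive_year = r"\d{4}"
naive_hits = re.findall(naive_year, text)
print(naive_hits)  # ['9048', '2001'] -> the serial number leaks in as '9048'

# The attempted fix: ^ anchors at the start of the string, so the negative
# lookahead is only checked there, not at each candidate match position
anchored = r"^(?!\d{5})\d{4}"
anchored_hits = re.findall(anchored, text)
print(anchored_hits)  # [] -> nothing matches, including the real year
```

So the lookahead does not reject the serial number where it occurs; it only constrains the very beginning of the string.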
Thank you.
Note1: I have read this post, but can't seem to get it.
Note2: someone asked about the date formats in the comment section.
There are many date formats in the data frame I am working on, for example:
05/10/2001; 05/10/01; 5/10/09; 6/2/01
May-10-2001; May 10, 2010; March 25, 2001; Mar. 25, 2001; Mar 25 2001;
25 Mar 2001; 25 March 2001; 25 Mar. 2001; 25 March, 2001
Mar 25th, 2001; Mar 25th, 2001; Mar 12nd, 2001
Feb 2001; Sep 2001; Oct 2001
5/2001; 11/2001
2001; 2015
So I have a rather long .findall(r' ') pattern, but the main point is to keep those 5-digit serial numbers from being selected.
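One possible direction, shown only as a sketch: wrap the numeric parts of the pattern in digit-boundary lookarounds, so that a 4-digit run is accepted only when no other digit touches it. The `guarded_year` pattern and the sample text are illustrative assumptions, and this has not been checked against the full pattern.

```python
import re

# Hypothetical sample text: one serial number plus several of the date formats
text = "Serial 90481 was logged on 5/10/09, then Mar 25, 2001 and 11/2001."

# (?<!\d) rejects a match that has a digit immediately before it;
# (?!\d) rejects a match that has a digit immediately after it.
guarded_year = r"(?<!\d)\d{4}(?!\d)"
guarded_hits = re.findall(guarded_year, text)
print(guarded_hits)  # ['2001', '2001'] -> no fragment of 90481 is picked up
```

Unlike ^, these lookarounds are evaluated at every position where a match is attempted, which is why the serial number is rejected wherever it appears.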
Sincerely,