Try PL/SQL instead of a regular expression. It will be significantly slower, but will be safer and easier to maintain and extend.
You should rely on the Oracle format models to do this correctly. I've seen lots of attempts to validate this information using a regular expression, but
I rarely see it done correctly.
If you really care about performance, the real answer is to fix your data model.
Code and Test Cases:
--Function to convert a string to a date, or return null if the format is wrong.
create or replace function validate_date(p_string in string) return date is
begin
return to_date(p_string, 'MONTH DD, YYYY');
exception when others then
begin
return to_date(p_string, 'MM/DD/YYYY');
exception when others then
begin
return to_date(p_string, 'DD-MON-RR');
exception when others then
return null;
end;
end;
end;
/
--Test individual values
select validate_date('JULY 31, 2009') from dual;
2009-07-31
select validate_date('7/31/2009') from dual;
2009-07-31
select validate_date('31-JUL-09') from dual;
2009-07-31
select validate_date('2009-07-31') from dual;
<null>
Simple Performance Test:
--Create table to hold test data
create table test1(a_date varchar2(1000)) nologging;
--Insert 10 million rows
begin
for i in 1 .. 100 loop
insert /*+ append */ into test1
select to_char(sysdate+level, 'MM/DD/YYYY') from dual connect by level <= 100000;
commit;
end loop;
end;
/
--"Warm up" the database, run this a few times, see how long a count takes.
--Best case time to count: 2.3 seconds
select count(*) from test1;
--How long does it take to convert all those strings?
--6 minutes... ouch
select count(*)
from test1
where validate_date(a_date) is not null;