Let's say I have a particular date like January 10, 2013.
I'd like to be able to search a text or html document to see if it contains a reference to that date. I'd like to account for the date being in any of a number of formats, for instance:
1/10/2013
01/10/13
2013-01-10
10-Jan-2013
January 10, 2013
Jan 10, 2013
... should all produce a positive match for January 10, 2013.
I recognize that swapping the day-month order could be problematic, but I would be willing to accept a false positive in this case, meaning:
01-10-2013
10-01-2013
... would both be acceptable matches for January 10, 2013 in my case.
Is there an established algorithm, implemented in any language, that performs this sort of generalized, but non-trivial, search? My preference would be something in Ruby or JavaScript, but I'd be interested in any well-considered example.
ADDENDUM #1
I came across this code:
    require 'time' # Time.strptime lives in the 'time' standard library

    def validate_date(date_str)
      valid_formats = ["%m/%d/%Y", "%m/%d/%Y %I:%M %P"]
      # see http://www.ruby-doc.org/core-1.9.3/Time.html#method-i-strftime for more
      valid_formats.each do |format|
        # strptime raises ArgumentError on a mismatch; treat that as "not valid"
        valid = Time.strptime(date_str, format) rescue false
        return true if valid
      end
      return false
    end
... which would be a good way to handle numerical representations of dates. This leaves month names unaccounted for. With 1, 01, Jan, and January all representing the first month of the year, I am wondering whether the large number of permutations has already been handled well somewhere else.
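To make the question concrete, here is a minimal sketch of one possible approach: instead of parsing every candidate string, build a single regular expression from the target date that covers the permutations listed above. The helper name `date_reference_regex` is hypothetical, and the sketch assumes English month names, separators limited to `/`, `.`, and `-`, and that day/month order is interchangeable in numeric forms (the false positive accepted above). It is not an established algorithm, only an illustration of the idea.

```ruby
require 'date'

# Hypothetical helper: returns a Regexp matching common renderings of `date`.
# Assumptions: English month names; numeric day/month order is ambiguous on purpose.
def date_reference_regex(date)
  d    = "0?#{date.day}"                       # 10 (or 010 is rejected by the digit guards below only at the edges)
  m    = "0?#{date.month}"                     # 1 or 01
  yyyy = date.year.to_s
  y    = "(?:#{yyyy}|#{yyyy[-2, 2]})"          # 2013 or 13
  full = Date::MONTHNAMES[date.month]          # "January"
  abbr = Date::ABBR_MONTHNAMES[date.month]     # "Jan"
  mon  = "(?:#{full}|#{abbr}\\.?)"             # January, Jan, Jan.
  sep  = "[/.-]"

  patterns = [
    "#{m}#{sep}#{d}#{sep}#{y}",                # 1/10/2013, 01/10/13, 01-10-2013
    "#{d}#{sep}#{m}#{sep}#{y}",                # 10-01-2013 (accepted false positive)
    "#{yyyy}#{sep}#{m}#{sep}#{d}",             # 2013-01-10
    "#{d}[ -]#{mon}[ -]#{y}",                  # 10-Jan-2013, 10 January 2013
    "#{mon} #{d}(?:st|nd|rd|th)?,? #{y}"       # January 10, 2013; Jan 10, 2013
  ]

  # The digit guards keep "11/10/2013" or "1/10/20134" from matching.
  Regexp.new("(?<!\\d)(?:#{patterns.join('|')})(?!\\d)", Regexp::IGNORECASE)
end

re = date_reference_regex(Date.new(2013, 1, 10))
puts "match" if re.match?("Invoice issued on 10-Jan-2013 in Boston")
```

For an HTML document, the markup would presumably need to be reduced to plain text first, since tags or entities between the date parts would defeat the pattern. Formats outside the list above (for example, weekday prefixes or non-English month names) are not covered by this sketch.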