
I need to test a specific array of dates to ensure that they are in the correct format; however, I cannot use 'parse' because, if I do, the incorrect dates are silently corrected. For instance, if I have an incorrect date with a month "13", it adds another year and sets the month to 1.

My code pulls in the dates from an SQL query:

table_birth_dates = self.class.connection.execute("SELECT birth_date FROM #{temp_table_name}").values.flatten 

[
    [0] "1980-30-54",
    [1] "1980-30-54",
    [2] "2020-09-10",
    [3] "1890-10-30"
]
yr = 1900  
year_test = table_birth_dates.select{|d| Date.parse(d).year < yr}

This now gives me an ArgumentError: invalid date.

I thought of using:

splitted_birth_date = table_birth_dates.first.split("-")
splitted_birth_date.first.to_i > 1900

but if I try to loop through all of my dates, I'm not able to manipulate anything via splitting:

table_birth_dates.each do |birth_date|
  birth_date.split("-")
end

What can I do with this?

the Tin Man
kdweber89

2 Answers


I need to test a specific array of dates to ensure that they are in the correct format...

If you get an error it means that the date is incorrect. You could rescue that error and then handle that date however you need, for example by correcting or discarding it.

table_birth_dates.each do |birth_date|
  begin
    if Date.parse(birth_date).year < yr
      # the date parsed and is earlier than the cutoff year
    end
  rescue ArgumentError
    # birth_date could not be parsed; handle the invalid value here
  end
end
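
A stricter variant of the same rescue idea, as a rough sketch: it assumes every date is supposed to be in YYYY-MM-DD form and uses Date.strptime with an explicit format instead of Date.parse, so nothing is left to format guessing. The helper name parse_iso_date is hypothetical, and the array is copied from the question.

```ruby
require 'date'

# Hypothetical helper: returns a Date when the string is a real calendar
# date in YYYY-MM-DD form, or nil when it is not.
def parse_iso_date(str)
  Date.strptime(str, '%Y-%m-%d')
rescue ArgumentError, TypeError
  nil
end

# Sample data copied from the question.
table_birth_dates = ["1980-30-54", "1980-30-54", "2020-09-10", "1890-10-30"]
yr = 1900

# Split the input into strings that parse and strings that do not.
valid, invalid = table_birth_dates.partition { |d| parse_iso_date(d) }

# Only the valid strings are compared against the cutoff year.
before_cutoff = valid.select { |d| parse_iso_date(d).year < yr }
```

With this arrangement the invalid strings end up in their own array, so they can be reported or fixed separately rather than aborting the whole loop.
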
Adam Zapaśnik
  • "If you get an error it means that the date is incorrect". Not necessarily. It could also mean the date parser picked the wrong format. There are plenty of ambiguous dates that appear to be the right format, but the values don't fit. That's why we don't use `parse` unless we're sure every date will make sense to it. If we can't guarantee that then we have to use other tests to try to determine the format, and reject any dates we can't figure out. `parse` is convenient, but it's also slow and not always right and having it silently return an invalid parsed date is hard to track. – the Tin Man Apr 10 '17 at 19:18
  • @theTinMan I see. Would it be smarter to test every date against the regexp and then do whatever they want with it? For instance: `(\d{4})-(\d{2})-(\d{2})` Or just use `#split` like they tried to... – Adam Zapaśnik Apr 10 '17 at 19:48
  • Neither approach is capable of telling you whether a field is a month or a day. You can fairly safely assume that a date in year, month, day format is going to be parsable, but the delimiters could vary. And, unfortunately, the internet is chock-full of "programmers" who ignore, or are ignorant of, the date standards, so you have to defensively check the source. If you can trust them you can write more lenient code. If you can't then be defensive and expect failures in those cases that you didn't foresee. You could rescan the dates looking for outliers to determine the right fit... – the Tin Man Apr 10 '17 at 21:14
  • ...but if that particular feed didn't have any unambiguous dates then your code still won't know. Sometimes you just can't be sure and have to put on your detective hat and learn more about the source. See https://en.wikipedia.org/wiki/Calendar_date#List_of_the_world_locations_by_date_format_in_use for more info. http://stackoverflow.com/q/2955830/128421 is a good read too. – the Tin Man Apr 10 '17 at 21:23

You could combine your select and split approaches together:

table_birth_dates.select { |d| d.split('-').first.to_i < 1900 }
#=> ["1890-10-30"]
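
Note that the one-liner above only looks at the year, so a malformed string such as "1890-30-54" would still pass. A possible extension, sketched under the assumption that the dates are meant to be YYYY-MM-DD: check the shape with a regexp, then confirm the parts with Date.valid_date? before comparing years. The variable names well_formed and before_1900 are illustrative only.

```ruby
require 'date'

# Sample data copied from the question.
table_birth_dates = ["1980-30-54", "1980-30-54", "2020-09-10", "1890-10-30"]

# Keep only strings shaped like YYYY-MM-DD whose parts form a real calendar date.
well_formed = table_birth_dates.select do |d|
  next false unless d.match?(/\A\d{4}-\d{2}-\d{2}\z/)

  year, month, day = d.split('-').map(&:to_i)
  Date.valid_date?(year, month, day)
end

# Apply the year test from the answer to the validated strings only.
before_1900 = well_formed.select { |d| d.split('-').first.to_i < 1900 }
```
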
gwcodes