4

I have the following code in a function to check if string 'datestr' is in the correct format (dd/mm/yyyy):

if (sscanf(datestr, "%d/%d/%d", &day, &month, &year) != 3) return NULL;

While it accepts a correctly formatted string like "02/10/2015", it also accepts a string like "2/10/2015", which is not correctly formatted: day and month must each be 2 digits long and the year 4 digits long. Is there a way I can check this within the sscanf function? Or do I have to check it beforehand with an if condition like the following?

if (!(strlen(datestr) == 10 && isdigit(datestr[0]) && isdigit(datestr[1]) && ...)) return NULL;

Thank you!

Kaladin11
  • 1
    or use a regex, such as [PCRE](http://www.pcre.org/) – t0mm13b Oct 23 '15 at 23:33
  • 1
    Substantially, there isn't going to be a way to get `sscanf()` to do the checking you want. You'll have to do it yourself. Of course, `sscanf()` will allow `"99/ 7/-201"` through; you have to validate that the numbers are in the desired range anyway. – Jonathan Leffler Oct 23 '15 at 23:41
  • What makes you believe that "dd/mm/yyyy" is "the correct format"? – Mike Nakis Oct 23 '15 at 23:43
  • 2
    America uses mm/dd/yyyy, Japan uses yyyy-mm-dd, Europe (not all countries!) uses dd/mm/yyyy. It would be better to stick to the ISO 8601 format for compatibility. – t0mm13b Oct 23 '15 at 23:47
  • Well, it is part of an assignment for university and I know that the dates will be in that format :) – Kaladin11 Oct 23 '15 at 23:53
  • Using [`strptime()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/strptime.html) isn't going to be sufficient help either. Most of the Unix tools allow for flexible input formats; you get to control the output format. Using `strptime()` would ensure that the various fields are within their orthodox ranges; getting good error reporting out will be very hard, though. – Jonathan Leffler Oct 24 '15 at 00:03
  • Regarding format: in the 'real world', you have to deal (somehow) with the vagaries of the different formats. However, for this issue, dealing with one format is sufficient to show the problems, and generalizing to handle multiple formats need not be very hard. – Jonathan Leffler Oct 24 '15 at 00:05

3 Answers

2

To do a pedantic check with sscanf(), use "%[]" and "%n".

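// if (sscanf(datestr, "%d/%d/%d", &day, &month, &year) != 3) return NULL;
int n[3] = { 0 };
// %*[0-9] matches a run of digits without storing it; %n records the offset reached so far.
sscanf(datestr, "%*[0-9]%n/%*[0-9]%n/%*[0-9]%n", &n[0], &n[1], &n[2]);
// Offsets 2, 5 and 10 mean 2, 2 and 4 digits; the last test rejects trailing characters.
if (n[0] != 2 || n[1] != 5 || n[2] != 10 || datestr[10] != '\0') return NULL;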

// Good To Go
sscanf(datestr, "%d/%d/%d", &day, &month, &year);

if (!ValidDate(year, month, day)) return NULL;

There are many possible validity tests for dates. Modern dates are easy; allowing historic dates is tricky. How about Feb 30, 1712?

One option is to let the code rely on the dates the computer itself understands:

int ValidDate(int year, int month, int day) {
  struct tm tm1 = { 0 };
  tm1.tm_year = year - 1900;
  tm1.tm_mon = month - 1;  // tm_mon counts months from 0 (January)
  tm1.tm_mday = day;
  struct tm tm2 = tm1;
  if (mktime(&tm1) == -1) return 0; // failed conversion.
  // Did mktime() adjust fields?
  if (tm1.tm_year != tm2.tm_year) return 0;
  if (tm1.tm_mon != tm2.tm_mon) return 0;
  return tm1.tm_mday == tm2.tm_mday;
}
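
As a rough, self-contained illustration of the "%[]" plus "%n" idea above, the format check can be wrapped in a small helper. The name `has_date_shape` is hypothetical and not part of the answer, and the test on `s[10]` is an added assumption that trailing characters should be rejected.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if s has exactly the shape dd/mm/yyyy
   (digits and slashes only), 0 otherwise. It does not check ranges. */
int has_date_shape(const char *s) {
    int n[3] = { 0, 0, 0 };
    /* %*[0-9] matches a non-empty run of digits without storing it;
       %n records how many characters have been consumed so far.
       If matching stops early, the remaining n[] entries stay 0. */
    sscanf(s, "%*[0-9]%n/%*[0-9]%n/%*[0-9]%n", &n[0], &n[1], &n[2]);
    if (n[0] != 2 || n[1] != 5 || n[2] != 10) return 0;
    /* Reading s[10] is safe here because ten characters were consumed. */
    return s[10] == '\0';
}
```

A call such as `has_date_shape("2/10/2015")` fails on the first offset, since only one digit precedes the first slash. Range validation (for example with `ValidDate()`) is still needed afterwards.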
chux - Reinstate Monica
0

I don't believe you can do what you are asking for inside of sscanf. The format specifiers specify a maximum width, but not a minimum width. So for example "%2d/%2d/%4d" will disallow "100/10/2014" but will not disallow "1/10/2015".
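
The maximum-width behaviour can be observed with a small sketch like the one below. The helper name `fields_converted` is made up for illustration; it simply reports how many fields `sscanf()` converted.

```c
#include <stdio.h>

/* Hypothetical helper: number of fields sscanf() converts when each
   %d carries a maximum field width of 2, 2 and 4 respectively. */
int fields_converted(const char *s) {
    int day, month, year;
    return sscanf(s, "%2d/%2d/%4d", &day, &month, &year);
}
```

With "100/10/2014" the first `%2d` stops after "10", the next input character is '0' rather than '/', and only one field is converted. With "1/10/2015" all three fields are converted, because a width is only an upper limit.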

As you mentioned in your post, you can do a digit count. That would eliminate many spurious inputs. If you wanted a single statement and didn't need bounds checking (i.e. checking that the values form a valid date, so the day isn't 32 or something like that), you could use a regular expression. That would mean using something like std::regex_match (C++11 or later). You can find more information on the right regular expression here.

Otherwise you are stuck doing your own checking after the parse.
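
A minimal sketch of such a hand-written check, assuming the fixed dd/mm/yyyy layout from the question, might look like this. The function name `is_ddmmyyyy` is hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: returns 1 if s is ten characters long with
   '/' at positions 2 and 5 and decimal digits everywhere else. */
int is_ddmmyyyy(const char *s) {
    if (strlen(s) != 10) return 0;
    for (int i = 0; i < 10; i++) {
        if (i == 2 || i == 5) {
            if (s[i] != '/') return 0;
        } else if (!isdigit((unsigned char)s[i])) {
            /* The cast avoids undefined behaviour for negative char values. */
            return 0;
        }
    }
    return 1;
}
```

This only checks the layout. A string such as "99/99/9999" passes, so the numeric ranges still have to be validated after parsing.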

  • Edit: I was unclear on something above. There is no standard regex library for C that I am aware of, but there are several available. When I pointed to std::regex_match from C++, it was more to give you an example of what the regex library function you would need looks like. Regex libraries (like POSIX regex) are usually available, but they don't tend to be portable.
Community
  • 1
    Why are you mentioning C++ when the OP's question is clearly tagged as C and the code is same? – t0mm13b Oct 23 '15 at 23:49
  • Sorry - I'll edit to be more clear. I meant to point out originally that he would need a different library, but could see std::regex_match for an example of what he would need. There are available libraries like posix regex that would work. – Samuel Adam Blake Oct 23 '15 at 23:51
  • [POSIX](http://stackoverflow.com/questions/1780599/i-never-really-understood-what-is-posix) is more of a standard definition rather than a library. While the standard is designed to be portable, implementations are not always available on various operating systems and certainly do not come installed by default. On a linux system, he can probably assume that posix regex is available. I would not make that assumption on a Windows box. – Samuel Adam Blake Oct 24 '15 at 00:00
  • To quote from the top-marked answer "you can be pretty sure to be able to port them easily among a large family of Unix derivatives (including Linux, but not limited to it!); if and when you use some Linux API that's not standardized as part of Posix" I have not assumed any platform either. – t0mm13b Oct 24 '15 at 00:05
0

As suggested in the comments, a regular expression may be your best bet. You could use if conditions instead, but that would be less concise.

I don't code in C, so I don't know if this will be exactly correct, but a regex for this should be something like:

^\d{2}\/\d{2}\/\d{4}$

If you don't know how to use regex in C, see this link for how to compile one. You can also find many tutorials on C regex on the web for more complex patterns.
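
For illustration, here is a rough sketch using the POSIX `<regex.h>` interface, assuming a system that provides it (it is not part of standard C). POSIX extended syntax has no `\d` shorthand, so the sketch uses `[0-9]`, and it anchors the pattern with `^` and `$` so the whole string must match. The function name `matches_date_pattern` is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if s matches dd/mm/yyyy in shape,
   0 if it does not, and -1 if the pattern fails to compile. */
int matches_date_pattern(const char *s) {
    regex_t re;
    /* REG_EXTENDED enables the {n} repetition syntax used here;
       REG_NOSUB says that no sub-match positions are needed. */
    if (regcomp(&re, "^[0-9]{2}/[0-9]{2}/[0-9]{4}$",
                REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int rc = regexec(&re, s, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

Like the other format-only checks, this accepts out-of-range values such as "99/99/9999", so the parsed numbers still need validating.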

Community
Jonathan Lam