Validating timestamp string for embedded application

Question

I need to validate a timestamp string for one of my embedded applications. The SDK does not provide regex.h so I need to come up with another solution.

I've been googling and found some lightweight regex alternatives on GitHub, but I wanted to see if there is a better/simpler alternative before I start integrating one into the build.

Any suggestions on how to write such a function in C? The string will have the format: YYYY-MM-DD HH:MM:SS. I control this format too, so if another one is better I can adapt to that.

If you know the string is in `YYYY-MM-DD HH:MM:SS` format, why do you need to pattern match it? Probably you mean extracting the `Y`,`M`,`D`,`H`,`M`,`S` values? — nice_dev, Nov 14 '18 at 19:07
You're right. I worded the question badly; it should have said validate instead of pattern matching. I have changed the title now. — Fever, Nov 15 '18 at 07:05
It is a very narrow and simple requirement; unless you will be validating other differently formatted strings, a general purpose matching/validating library will add a prohibitively large amount of code. Just read the delimited tokens and validate them - you will write perhaps more code, but that code will be smaller than any general purpose library code you might otherwise import. — Clifford, Nov 15 '18 at 20:03

Swordfish · Accepted Answer · 2018-11-15T08:02:54.000

By "pattern-match" I assume you want to know if such a string is valid.

#include <stdbool.h>
#include <stdio.h>   // sscanf
#include <string.h>  // strlen, strncmp

bool is_leap_year(int year)
{
    return (year & 3) == 0 && ((year % 25) != 0 || (year & 15) == 0); // *)
}

bool in_range(int min, int value, int max)
{
    return min <= value && value <= max;
}

bool is_valid_timestamp(char const *datetime)
{
    int const days_per_month[] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
    int y, m, d, h, min, sec;
    char separators[5];

    return strlen(datetime) == 19   // exact length of "YYYY-MM-DD HH:MM:SS"
        && sscanf(datetime, "%d%c%d%c%d%c%d%c%d%c%d", &y, &separators[0],
                  &m, &separators[1], &d, &separators[2], &h, &separators[3],
                  &min, &separators[4], &sec) == 11   // 6 numbers + 5 separators
        && in_range(0, y, 9999) && in_range(1, m, 12)   // month is checked before indexing days_per_month
        && in_range(1, d, m == 2 && is_leap_year(y) ? 29 : days_per_month[m - 1])
        && in_range(0, h, 23) && in_range(0, min, 59) && in_range(0, sec, 59)
        && strncmp(separators, "-- ::", 5) == 0;
}

Replace `in_range(0, y, 9999)` with whatever range you consider a "valid" year.

*) https://stackoverflow.com/a/11595914/3975177
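
One caveat with the `sscanf` approach, as far as I can tell: `%d` skips leading whitespace and accepts an optional sign, so inputs such as `2018- 1-15 08:02:54` or `+018-11-15 08:02:54` still have length 19 and would pass. If that matters, a fixed-position check is a stricter alternative that also avoids pulling in `sscanf` on a constrained target. The following is only a sketch under that assumption; `is_valid_timestamp_strict`, `to_int` and the `layout` string are hypothetical names, not part of the answer above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: converts `len` characters that are already known
   to be decimal digits into an int. */
static int to_int(char const *s, int len)
{
    int value = 0;
    while (len-- > 0)
        value = value * 10 + (*s++ - '0');
    return value;
}

/* Hypothetical strict variant: accepts only "YYYY-MM-DD HH:MM:SS". */
bool is_valid_timestamp_strict(char const *ts)
{
    static int const days_per_month[] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
    static char const layout[] = "dddd-dd-dd dd:dd:dd";  /* 'd' marks a digit position */
    size_t i;

    /* Walk layout and input together. A short input fails at its '\0',
       so nothing past the terminator is ever read. */
    for (i = 0; layout[i] != '\0'; ++i) {
        char c = ts[i];
        if (layout[i] == 'd' ? (c < '0' || c > '9') : (c != layout[i]))
            return false;
    }
    if (ts[i] != '\0')  /* trailing characters make the input too long */
        return false;

    int y = to_int(ts, 4), m = to_int(ts + 5, 2), d = to_int(ts + 8, 2);
    int h = to_int(ts + 11, 2), min = to_int(ts + 14, 2), sec = to_int(ts + 17, 2);

    if (m < 1 || m > 12)  /* must hold before indexing days_per_month */
        return false;

    bool leap = (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
    int max_day = (m == 2 && leap) ? 29 : days_per_month[m - 1];

    return d >= 1 && d <= max_day && h <= 23 && min <= 59 && sec <= 59;
}
```

Because every digit position is verified first, the numeric fields cannot be negative, which is why only the upper bounds are checked for hours, minutes and seconds.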

edited Nov 15 '18 at 08:02

answered Nov 14 '18 at 19:09

Swordfish

I must say that I'm impressed. Works like a charm! Thank you very much for your time. – Fever Nov 15 '18 at 07:06
Why `year % 25` instead of `year % 100`? They both end up working the same, but `% 100` uses the actual value the formula is based on (Not years divisible by 100, except years divisible by 400). Just a preference to use the smallest number possible, or is there an efficiency gain I'm missing? I can certainly understand why you would avoid another division with `& 15` instead of `% 400` though. – Sam Skuce Nov 16 '18 at 18:34
Please have a look at the link in my answer: "The 100th year test utilizes modulo 25 instead of modulo 100. We can do this because 100 factors out to 2 x 2 x 5 x 5. Because the 4th year test already checks for factors of 4 we can eliminate that factor from 100, leaving 25. This optimization is probably insignificant to nearly every CPU implementation (as both 100 and 25 fit in 8-bits)." – Swordfish Nov 16 '18 at 18:45
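
For anyone who wants to convince themselves that the `% 25` / `& 15` form matches the usual rule, a brute-force comparison is easy to write. This is just a sketch: `is_leap_year_textbook` and `count_leap_year_mismatches` are hypothetical names introduced here, and the check only covers non-negative years.

```c
#include <stdbool.h>

/* The bit-twiddled form from the answer. */
bool is_leap_year(int year)
{
    return (year & 3) == 0 && ((year % 25) != 0 || (year & 15) == 0);
}

/* The textbook Gregorian rule: divisible by 4, except centuries,
   unless the century is divisible by 400. */
bool is_leap_year_textbook(int year)
{
    return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
}

/* Counts the years in [first, last] where the two forms disagree. */
int count_leap_year_mismatches(int first, int last)
{
    int mismatches = 0;
    for (int y = first; y <= last; ++y)
        if (is_leap_year(y) != is_leap_year_textbook(y))
            ++mismatches;
    return mismatches;
}
```

Calling `count_leap_year_mismatches(0, 9999)` returns 0. Once a year is known to be divisible by 4, divisibility by 25 is the same as divisibility by 100, and for a year divisible by 25, divisibility by 16 is the same as divisibility by 400.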
