regex for 24h unix timestamp

Question

I would like to create a regex that matches every Unix timestamp within a 24-hour window, say from 01/01/2015 00:00:00 **(1420066800)** to 01/01/2015 23:59:59 **(1420153199)**, which is a span of 86399 seconds in Unix timestamp format.

I'm using the range_regex Python library, but it is buggy for such large ranges. Calling range_to_pattern(1420066800, 1420153199) produces the regex 1420[0-1][5-6][3-6][1-8]\d{2}. This matches the two bounds themselves, but it fails for values like 1420159111, because the 7th digit from the left (9) is not in the third range group ([3-6]).
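To make the failure concrete, here is a minimal sketch that checks the pattern quoted in the question against every timestamp in the requested window. It uses only the standard `re` module; the sample value 1420070000 is chosen here for illustration and does not come from the original post.

```python
import re

# Pattern quoted in the question (one character class per digit position).
pattern = re.compile(r"1420[0-1][5-6][3-6][1-8]\d{2}")

lo, hi = 1420066800, 1420153199

# Count how many in-range timestamps the pattern accepts.
matched = sum(1 for ts in range(lo, hi + 1) if pattern.fullmatch(str(ts)))
print(f"{matched} of {hi - lo + 1} in-range timestamps match")
# prints: 200 of 86400 in-range timestamps match

# An in-range value the pattern rejects: its 6th digit (7) is outside [5-6].
print(pattern.fullmatch("1420070000") is None)  # True
```

A per-digit character class can only describe a "box" of digit combinations, which is why a numeric range generally needs an alternation of several sub-patterns.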

Can someone recommend a better Python 3 library, or a workaround, for creating a regex that covers the 86400 seconds of a day?

Possible duplicate of [Using regular expressions to validate a numeric range](https://stackoverflow.com/questions/22130429/using-regular-expressions-to-validate-a-numeric-range) — Sebastian Simon, Jul 26 '17 at 12:32
This kind of regex is unreadable. FYI, the first element of it is: `1420(?:066[89][0-9]|06[7-9][0-9]{2}|0[7-9][0-9]{3}` — Toto, Jul 26 '17 at 12:35
@Phylogenesis: It is a regex problem. I have a thousand files where the filename contains a Unix timestamp. What I want is an efficient way to collect all the files within 24h and put them into an archive. The fastest way I can think of to find these files is using a regex. — Ralph Lo, Jul 26 '17 at 12:41
@Toto I know it is unreadable. Since I wouldn't like to reinvent the wheel, do you know a Python lib which creates the regex you mentioned for me? — Ralph Lo, Jul 26 '17 at 12:43
@RalphHeerich [this library](https://github.com/dimka665/range-regex) claims to do what you want, but I haven't tested it at all. — Phylogenesis, Jul 26 '17 at 12:43
@Phylogenesis: It is the lib I'm already using. As I said, it's buggy for ranges like what I have. But anyway, thanks for the suggestion. — Ralph Lo, Jul 26 '17 at 12:46
Sorry, I don't know of another lib. But I'd write a script that reads the directory, converts each timestamp to a date, and then compares it with a reference date. — Toto, Jul 26 '17 at 12:49
You can use a regex to identify the pattern, but if you capture the timestamp as a group, you can use datetime to perform the comparison on the extracted value, as @Toto suggested. In addition to likely being faster (as a simpler regex), it will make your intent much clearer. — K. Nielson, Jul 26 '17 at 12:59
Looking at the library, you should be using `range_to_regex()` rather than `range_to_pattern()`. — Phylogenesis, Jul 26 '17 at 13:06
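The extract-then-compare approach suggested in the comments by Toto and K. Nielson could look roughly like the sketch below. It compares the raw integers instead of converting to datetime, which is a simplification; the filenames, the `in_day` helper, and the assumption of exactly one 10-digit timestamp per filename are hypothetical and only serve as illustration.

```python
import re

DAY_START = 1420066800        # lower bound from the question
DAY_END = DAY_START + 86400   # exclusive upper bound (first second of the next day)

# Assumption: each filename contains one 10-digit Unix timestamp.
# The lookarounds keep the pattern from matching part of a longer digit run.
TS_RE = re.compile(r"(?<!\d)(\d{10})(?!\d)")

def in_day(filename, start=DAY_START, end=DAY_END):
    """Return True if the filename's timestamp lies in [start, end)."""
    m = TS_RE.search(filename)
    return m is not None and start <= int(m.group(1)) < end

# Hypothetical filenames, for illustration only.
names = [
    "log_1420066799.txt",
    "log_1420066800.txt",
    "log_1420153199.txt",
    "log_1420153200.txt",
    "readme.txt",
]
selected = [n for n in names if in_day(n)]
print(selected)  # ['log_1420066800.txt', 'log_1420153199.txt']
```

Changing the day only requires changing `DAY_START`; no regex has to be regenerated.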

score 1 · Accepted Answer · answered Jul 26 '17 at 13:10

As per my comment above, you are using the wrong function from that library.

You should use the following:

range_to_regex(1420066800, 1420153199)

This returns the correct regex:

142006680\d|14200668[1-9]\d|14200669\d{2}|142006[7-9]\d{3}|14200[7-9]\d{4}|14201[0-4]\d{4}|142015[0-2]\d{3}|1420153[0-1]\d{2}
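A quick way to gain confidence in a range regex like this one is to test it exhaustively against the window plus a margin on both sides. The sketch below does that with the standard `re` module; the margin of 100000 is an arbitrary choice made here.

```python
import re

lo, hi = 1420066800, 1420153199

# The regex from this answer, split across two lines for readability.
day_re = re.compile(
    r"142006680\d|14200668[1-9]\d|14200669\d{2}|142006[7-9]\d{3}"
    r"|14200[7-9]\d{4}|14201[0-4]\d{4}|142015[0-2]\d{3}|1420153[0-1]\d{2}"
)

# Check every value in the window plus a margin below and above it.
margin = 100_000
accepted = {
    ts for ts in range(lo - margin, hi + margin + 1)
    if day_re.fullmatch(str(ts))
}
print(accepted == set(range(lo, hi + 1)))  # True
```

Note that `fullmatch` is used on purpose: the pattern has no anchors, so `search` would also accept a longer digit string that merely contains a valid timestamp.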

Answered by Phylogenesis

I just saw that the source code on GitHub is more recent than the version I installed with pip. I might just clone the GitHub repository and work with that. Thank you very much! — Ralph Lo, Jul 26 '17 at 13:13
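If installing a newer version of the library is not an option, a hand-rolled generator is one possible workaround. The sketch below is not the library's algorithm: `int_range_regex` is a hypothetical name, and the function assumes positive integers with `lo <= hi`. It walks through the range greedily, emitting one alternative per block of values that share a prefix, followed by a digit class and free trailing digits.

```python
import re

def int_range_regex(lo, hi):
    """Return a regex alternation matching the decimal integers in [lo, hi].

    Hypothetical helper: assumes 1 <= lo <= hi and no leading zeros.
    """
    if not 1 <= lo <= hi:
        raise ValueError("expected 1 <= lo <= hi")
    parts = []
    start = lo
    while start <= hi:
        # k = number of trailing digits that may vary freely (0-9)
        # while the block stays aligned and inside the range.
        k = 0
        while start % 10 ** (k + 1) == 0 and start + 10 ** (k + 1) - 1 <= hi:
            k += 1
        block = 10 ** k
        # Widen the digit just left of the free digits as far as hi allows.
        first = (start // block) % 10
        last = first
        while last < 9 and start + (last - first + 2) * block - 1 <= hi:
            last += 1
        digits = str(start)
        prefix = digits[: len(digits) - k - 1]
        digit = str(first) if first == last else f"[{first}-{last}]"
        tail = "" if k == 0 else r"\d" if k == 1 else rf"\d{{{k}}}"
        parts.append(prefix + digit + tail)
        start += (last - first + 1) * block
    return "|".join(parts)

pattern = int_range_regex(1420066800, 1420153199)
print(pattern)
day_re = re.compile(pattern)
print(bool(day_re.fullmatch("1420066810")), bool(day_re.fullmatch("1420153200")))
# prints: True False
```

As with any unanchored alternation, use `fullmatch`, or wrap the result in a group with lookarounds such as `(?<!\d)(?:...)(?!\d)` when searching inside filenames.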

score 1 · Answer 2 · answered Jul 26 '17 at 13:13

import re

# Matches 1420066800..1420153199 inclusive, branching digit by digit after "1420".
# The 066 branch must accept both 8xx and 9xx so that 1420066810-1420066899 are covered.
# The lookarounds keep the pattern from matching part of a longer digit run.
regex = r"(?<!\d)1420(?:0(?:6(?:6[89]\d{2}|[7-9]\d{3})|[7-9]\d{4})|1(?:5(?:3[01]\d{2}|[0-2]\d{3})|[0-4]\d{4}))(?!\d)"

test_str = ("01/01/2015 00:00:00 (1420066800) to 01/01/2015 23:59:59 (1420153199)\n\n"
    "1420016799     -no\n"
    "1420066799     -no\n"
    "1420066800     -yes\n"
    "1420066801     -yes\n"
    "1420066810     -yes\n"
    "1420067820     -yes\n"
    "1420067920     -yes\n"
    "1420073199     -yes\n"
    "1420103199     -yes\n"
    "1420152191     -yes\n"
    "1420153181     -yes\n"
    "1420153199     -yes\n"
    "1420153200     -no\n"
    "1420163199     -no")

# Print every match together with its position in the test string.
for match_num, match in enumerate(re.finditer(regex, test_str), start=1):
    print("Match {} was found at {}-{}: {}".format(
        match_num, match.start(), match.end(), match.group()))

Online: https://regex101.com/r/blnST4/1
