
I am trying to test that a timestamp (let's use HH:MM:ss as an example) does not have any numeric characters surrounding it. Put another way, I would like to check for the presence of a non-numeric character before and after my timestamp. The non-numeric character does not need to exist, but no numeric character should exist directly before nor directly after. I do not want to capture this non-numeric character. How can I do this? Should I use "look-arounds" or non-capturing groups?

Fill in the blank + (2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9]) + Fill in the blank

Thanks!

WebWanderer
  • Use a simple `\D` pattern. – Wiktor Stribiżew Jan 12 '16 at 15:11
  • This isn't a duplicate, I didn't think about `\D`. I was wondering how I was going to use `^` in a `(?:...)`. The question that mine was marked as a duplicate of has nothing to do with non-capturing groups, just about matching a non-numeric character, which I already knew how to do. Thanks a lot @stribizhev – WebWanderer Jan 12 '16 at 15:31
  • Does it mean you wanted to *check for the presence of a non-numeric character* before and after the timestamp, so that a match fails if there are no non-numeric characters before and after the timestamp? Are you looking for lookarounds? `(?<=\D)PATTERN(?=\D)`? Please edit your question with sample input and the expected output/result you are looking for. Without that, the question looks like an exact duplicate. – Wiktor Stribiżew Jan 12 '16 at 15:45
  • That is what I am looking for @stribizhev , but I am very new to regex. I am afraid if I re-word it to say that I will really show that I do not know what I am talking about, but I'll try. – WebWanderer Jan 12 '16 at 15:49
  • @stribizhev Just edited my question, and I still don't agree that it ever looked like an **exact** duplicate. The question linked here only states how to match a non-numeric, which was only a portion of my question (_the portion I knew how to do_). This is about, as you said, look-arounds, or as I said, non-capturing groups. I already found regex that works for me: `(?:\D*)(2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9])(?:\D*)` – WebWanderer Jan 12 '16 at 15:55
  • So, in fact, you need `(?<!\d)PATTERN(?!\d)`. OK, I will reopen since it does not look that basic now. – Wiktor Stribiżew Jan 12 '16 at 15:58
  • @stribizhev Hmm, I just tested that regex and it works great. Thank you. I like that much more than what I had before. Here it is: `(?<!\d)(2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9])(?!\d)` – WebWanderer Jan 12 '16 at 16:01

2 Answers


The regex class for "anything that is not numeric" is:

\D

This is equivalent to:

[^\d]

So you would use:

\D*(2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9])\D*

You don't need to surround it with a non-capturing group (?:).
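To see why the extra `(?:)` is unnecessary, here is a minimal sketch in JavaScript (the language the asker later mentions using). The sample string and variable names are assumptions made up for illustration. The text matched by `\D*` ends up in the overall match but not in capture group 1:

```javascript
// Consuming pattern from this answer: \D* on both sides, timestamp in group 1.
const consuming = /\D*(2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9])\D*/;

// Hypothetical sample input, chosen only for illustration.
const sample = "at 13:26:45 sharp";
const result = sample.match(consuming);

const fullMatch = result[0]; // whole match, including the text consumed by \D*
const timestamp = result[1]; // capture group 1: the timestamp only

console.log(fullMatch); // at 13:26:45 sharp
console.log(timestamp); // 13:26:45
```

Reading `result[1]` rather than `result[0]` is what keeps the surrounding characters out of the extracted value.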

nickb
  • I do need to use a `(?:)`. I don't want to capture the character before or after my timestamp, I just want the timestamp and I don't want any numbers surrounding it, as I thought I stated well above. – WebWanderer Jan 12 '16 at 15:32
  • Oh wait, I guess I wouldn't need the non-capturing group because it was never in a capturing group and wouldn't be captured anyway? Ugh, I'm so new to this... – WebWanderer Jan 12 '16 at 15:50
  • @WebWanderer - Exactly. If it's not in a capturing group, it won't be captured, it'll just be used to match against. – nickb Jan 12 '16 at 15:57

> I would like to check for the presence of a non-numeric character before and after my timestamp. The non-numeric character does not need to exist, but no numeric character should exist directly before nor directly after. I do not want to capture this non-numeric character.

The best way to match such a timestamp is using lookarounds:

(?<!\d)(2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9])(?!\d)

The (?<!\d) fails a match if there is a digit before the timestamp and (?!\d) fails a match if there is a digit after the timestamp.
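As a rough illustration of the two lookarounds, the sketch below runs the pattern in JavaScript. It assumes an engine with lookbehind support (added to the language in ES2018); the test strings and the `checks` object are made up for this example.

```javascript
// Requires lookbehind support (ES2018); older engines throw a SyntaxError here.
const isolated = /(?<!\d)(2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9])(?!\d)/;

// Hypothetical inputs for illustration.
const checks = {
  noNeighbours: isolated.test("23:56:43"),    // nothing around the timestamp
  letters: isolated.test("Some23:56:43text"), // non-digits on both sides
  digitBefore: isolated.test("123:56:43"),    // digit directly before
  digitAfter: isolated.test("23:56:431"),     // digit directly after
};

console.log(checks);
// { noNeighbours: true, letters: true, digitBefore: false, digitAfter: false }
```

Because both assertions are zero-width, the match itself contains only the timestamp.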

If you use

\D*(2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9])\D*

(note that wrapping \D* in a (?:...) non-capturing group adds nothing but extra work for the regex engine: the pattern inside still matches and consumes characters), the text around the timestamp becomes part of the match, so you won't get overlapping matches (for example, when one timestamp comes right after another). I believe this is a rare scenario, though, so you can still use your regex and grab the value from capture group 1.
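One practical difference is worth checking before choosing between the two forms: `\D*` is allowed to match zero characters, so on its own it does not reject a digit sitting right next to the timestamp. The sketch below compares both patterns on a made-up input; the variable names are hypothetical, and the lookbehind assumes an ES2018-capable JavaScript engine.

```javascript
const timestampCore = "(2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9])";
const withConsuming = new RegExp("\\D*" + timestampCore + "\\D*");
const withLookarounds = new RegExp("(?<!\\d)" + timestampCore + "(?!\\d)");

// Hypothetical input: a digit sits directly before the timestamp.
const digitPrefixed = "123:56:43";

const consumingHit = digitPrefixed.match(withConsuming);
const lookaroundHit = digitPrefixed.match(withLookarounds);

console.log(consumingHit && consumingHit[1]); // 23:56:43 (\D* matched nothing)
console.log(lookaroundHit);                   // null (the lookbehind sees the "1")
```

If inputs like this can occur, the lookaround form (or an explicit boundary alternative such as start-of-string or a single `\D`) is the safer choice.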

Also, see my answer on How Negative Lookahead Works. A negative lookbehind works similarly, but with the text before the matching (consuming) pattern.

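A JS solution that avoids the lookbehind is to consume the preceding non-digit (or the start of the string) in a non-capturing group and read the timestamp from capture group 1: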

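var re = /(?:^|\D)(2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9])(?=\D|$)/g;
var text = "Some23:56:43text here Some13:26:45text there and here is a date 10/30/89T11:19:00am";
var m; // declared explicitly so it is not an implicit global
// With the g flag, each exec() call resumes at re.lastIndex; group 1 holds the timestamp.
while ((m = re.exec(text)) !== null) {
  document.body.innerHTML += m[1] + "<br/>";
}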
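For use outside a browser page (where `document` is not available), the same loop can be wrapped in a small function that returns the timestamps as an array. This is only a sketch: `findTimestamps` is a hypothetical helper name, not something from the original answer.

```javascript
// Hypothetical helper: collects every isolated HH:MM:ss timestamp in a string.
function findTimestamps(text) {
  // Same pattern as above: a non-digit (or start of string) is consumed before
  // the timestamp, and a lookahead checks for a non-digit (or end) after it.
  const re = /(?:^|\D)(2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9])(?=\D|$)/g;
  const found = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    found.push(m[1]); // group 1 excludes the consumed leading character
  }
  return found;
}

const input = "Some23:56:43text here Some13:26:45text there and here is a date 10/30/89T11:19:00am";
console.log(findTimestamps(input)); // [ '23:56:43', '13:26:45', '11:19:00' ]
```

Creating the regex inside the function means its `lastIndex` state is never shared between calls.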
Wiktor Stribiżew
  • Ouch, I should have mentioned that I am doing this in Javascript. Took me a while to realize that Javascript doesn't work with `(?<!...` Any suggestions? – WebWanderer Jan 15 '16 at 15:44
  • I added a solution to match multiple timestamps. – Wiktor Stribiżew Jan 15 '16 at 15:49
  • OMG! You are fantastic sir! I added a UTC date to your code example to test separating the timestamp from the dateString (which is exactly what I am trying to do) and it works great! Thank you! This will help lay some better stepping stones for my DateTime to Date Format filter that I am writing for AngularJS! – WebWanderer Jan 15 '16 at 18:34