regex to match strings not ending with a pattern?

Question

I am trying to write a regular expression that matches strings that do NOT end with a DOT FOLLOWED BY A NUMBER.

e.g.

abcd1
abcdf12
abcdf124
abcd1.0
abcd1.134
abcdf12.13
abcdf124.2
abcdf124.21

I want to match only the first three.
I tried adapting the approach from this post, but it didn't work for me because the number can have variable length.

Can someone help?

Simon Whitehead · Answer 1 · 2012-08-09T02:35:30.687

4

You can use something like this:

^((?!\.[\d]+)[\w.])+$

It is anchored at the start and end of the line. It basically says:

Anchor at the start of the line
At each position, DO NOT continue if what follows is the pattern .NUMBERS (a dot, then digits)
Otherwise take one word character (letter, digit, underscore) or a dot, and repeat
Anchor at the end of the line

So, this pattern matches this (no dot then number):

This.Is.Your.Pattern or This.Is.Your.Pattern2012

However it won't match this (dot before the number):

This.Is.Your.Pattern.2012

EDIT: In response to Wiseguy's comment, you can use this:

^((?!\.[\d]+$)[\w.])+$ - which adds an end anchor after the number inside the lookahead. The rejected pattern must therefore be a dot followed only by digits at the very end of the string... not that you specified that in your question.
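
To check the two patterns above against the strings from the question, here is a minimal sketch using Python's `re` module. The names `ANYWHERE`, `AT_END`, and `samples` are made up for this illustration, and other regex flavours may behave slightly differently.

```python
import re

# First version: rejects a dot followed by digits anywhere in the string.
ANYWHERE = re.compile(r"^((?!\.[\d]+)[\w.])+$")
# Edited version: rejects a dot followed by digits only at the very end.
AT_END = re.compile(r"^((?!\.[\d]+$)[\w.])+$")

# The eight example strings from the question.
samples = [
    "abcd1", "abcdf12", "abcdf124",
    "abcd1.0", "abcd1.134", "abcdf12.13", "abcdf124.2", "abcdf124.21",
]

matched = [s for s in samples if AT_END.match(s)]
print(matched)  # ['abcd1', 'abcdf12', 'abcdf124']

# The string from the comments shows the difference between the two versions.
print(bool(ANYWHERE.match("abcdf12.12abc")))  # False
print(bool(AT_END.match("abcdf12.12abc")))    # True
```

Note that `[\w.]` restricts matches to word characters and dots, so a string containing, say, a space or a hyphen would be rejected by either version.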

edited Aug 09 '12 at 02:35

answered Aug 09 '12 at 02:06

Doesn't match `abcdf12.12abc`. – Wiseguy Aug 09 '12 at 02:20
Updated my answer to accommodate that. – Simon Whitehead Aug 09 '12 at 02:35
Thanks. Side note - you could probably simplify a tad: `^((?!\.\d+$).)+$`. – Wiseguy Aug 09 '12 at 02:42
Thanks. I was wondering whether a solution is possible without anchor symbols, or whether it's impossible due to some inherent structure of the problem? – Vinay Aug 09 '12 at 20:12
@Vinay Why do you need to avoid anchors? And do you need to avoid _any_ anchor, or just `$` (i.e., could you still use a word boundary `\b` or something)? – Wiseguy Aug 10 '12 at 16:17

score 2 · Answer 2 · answered Aug 09 '12 at 02:07

If you can relax your restrictions a bit, you may try using this (extended) regular expression: ^[^.]*\.?[^0-9]*$

You may omit the anchoring metasymbols ^ and $ if you're using a function or tool that matches against the whole string.

Explanation: This regex allows any characters except a dot until an (optional) dot is found, after which only non-digit characters are allowed. It won't work for numbers in an improper format, as in the strings abcd1...3 or abcd1.fdfd2. It also won't work correctly for some strings with multiple dots, like abcd.ab123cd.a (the problem description is a bit ambiguous).
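
The behaviour described above can be checked with a small Python sketch. It assumes the optional dot is meant literally (written as `\.?`), and the name `RELAXED` is made up for this illustration.

```python
import re

# Assumption: the optional dot is escaped so that it matches a literal '.'.
RELAXED = re.compile(r"^[^.]*\.?[^0-9]*$")

# Strings from the question: the first two should match, the last two should not.
for s in ["abcd1", "abcdf124", "abcd1.0", "abcdf12.13"]:
    print(s, bool(RELAXED.match(s)))

# Limitations: these do not end with a dot followed by a number,
# yet the relaxed pattern rejects them anyway.
for s in ["abcd1.fdfd2", "abcd.ab123cd.a"]:
    print(s, bool(RELAXED.match(s)))  # False for both
```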

Philosophical explanation: When using regular expressions, you often don't need to solve exactly the task as stated, so even a simple regex will do the job. An abstract example: you have a file whose lines are either numbers or some complicated names (without digits), and you want to filter out all the numbers. Then a simple filter on [^0-9], e.g. grep '^[^0-9]', will do the job.

But if your task is more complex and requires validating the format and doing other fancy stuff with the data, why not use a simple script (say, in awk, Python, Perl or another language)? Or a short "hand-written" function, if you're implementing a stand-alone application. Regexes are cool, but they are often not the right tool for the job.
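
As one possible reading of the "hand-written function" suggestion, here is a short Python sketch that avoids regular expressions entirely. The function name `ends_with_dot_number` is hypothetical, and it assumes "number" means one or more ASCII digits.

```python
def ends_with_dot_number(s: str) -> bool:
    """Return True if s ends with a dot followed by one or more ASCII digits."""
    # rpartition splits at the LAST dot; if there is no dot, `dot` is empty.
    _head, dot, tail = s.rpartition(".")
    # isdigit() is False for an empty tail; isascii() excludes non-ASCII digits.
    return dot == "." and tail.isascii() and tail.isdigit()


samples = [
    "abcd1", "abcdf12", "abcdf124",
    "abcd1.0", "abcd1.134", "abcdf12.13", "abcdf124.2", "abcdf124.21",
]

kept = [s for s in samples if not ends_with_dot_number(s)]
print(kept)  # ['abcd1', 'abcdf12', 'abcdf124']
```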

Thanks for the answer. Can you explain how I can avoid the anchor symbols? — Vinay, Aug 09 '12 at 20:10

score -1 · Answer 3 · answered Aug 09 '12 at 02:41

I would just use a simple negative look-behind anchored at the end:

.*(?<!\\.\\d+)$

Bohemian

Almost no regex engines support variable length lookbehinds. Seems that asker may have attempted this per the linked question but could not employ this method because of variable length. – Wiseguy Aug 09 '12 at 02:46
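
Python's built-in `re` module is one engine where this limitation shows up. The sketch below first shows the lookbehind being rejected at compile time, then uses a different technique: searching for the unwanted ending and negating the result. The names `ENDS_WITH_DOT_NUMBER` and `keep` are made up for this illustration.

```python
import re

# Python's re module refuses variable-length lookbehinds at compile time.
try:
    re.compile(r".*(?<!\.\d+)$")
    lookbehind_supported = True
except re.error as exc:
    lookbehind_supported = False
    print("rejected:", exc)

# Alternative: look for the unwanted ending and invert the result in code.
ENDS_WITH_DOT_NUMBER = re.compile(r"\.\d+$")


def keep(s: str) -> bool:
    """Return True if s does NOT end with a dot followed by digits."""
    return ENDS_WITH_DOT_NUMBER.search(s) is None


print(keep("abcd1"), keep("abcd1.0"), keep("abcdf12.12abc"))  # True False True
```

Inverting a positive match in code is often easier to read than a negative assertion inside the pattern, though it only helps when the caller controls how the match result is used.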
