I'm trying to match strings that do not have .jsp/.jspx extensions in Java, and I'm having a lot of difficulty with the negative lookahead pattern.

Given a bunch of strings:

String string1 = "templateName";
String string2 = "some/path";
String string3 = "basic/filename/no/extension";
String string4 = "some/path/to/file.jsp";
String string5 = "alternative/path/to/file.jspx";

I'm trying to find a regex that matches the first 3 and not the last 2.

I would have thought a regex with a negative lookahead would work.

Ex:

Pattern p = Pattern.compile("(.+)(?!\\.jsp[x]?)");

But that pattern seems to match all the above strings. I initially thought that group 1 might be too greedy, so I've tried (.+?), but that does not help either.

This SO post does a very good job of explaining negative lookahead, but unfortunately it isn't helping me find the right combination.

Am I missing something obvious?

Eric B.
  • Your pattern says, "If you can find _any_ non-empty sequence of characters that is not immediately followed by .jsp or .jspx, then it's a match." So of course it will match `some/path/to/file.jsp` because it could match `s`, `so`, or anything as long as it doesn't include the last slash. See anubhava's answers. The second one uses negative lookahead to fail if the beginning of the string is followed by a non-empty sequence of characters followed by `.jsp[x]` (if the `.jsp[x]` is at the end of the string). – ajb Oct 28 '13 at 19:32
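To see the behaviour described in the comment above, here is a minimal sketch that prints what group 1 captures for the greedy and the lazy variant of the original pattern. The class name `LookaheadPitfall` and the helper `firstGroup` are made up for illustration; only `java.util.regex` is assumed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadPitfall {
    // Returns what group 1 captured on the first find(), or null if nothing matched.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "some/path/to/file.jsp";

        // Greedy: .+ swallows the whole string; nothing follows the end,
        // so the negative lookahead trivially succeeds.
        System.out.println(firstGroup("(.+)(?!\\.jspx?)", input)); // prints some/path/to/file.jsp

        // Lazy: .+? stops after one character, which is followed by "ome/...",
        // not ".jsp", so the lookahead succeeds here too.
        System.out.println(firstGroup("(.+?)(?!\\.jspx?)", input)); // prints s
    }
}
```

Either way the lookahead is tested at a position where `.jsp` does not start, which is why switching between greedy and lazy makes no difference.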

2 Answers

You can use negative lookbehind as:

Pattern p = Pattern.compile("^(.+)(?<!\\.jspx?)$");

Or you can use negative lookahead as:

Pattern p = Pattern.compile("^(?!.+?\\.jspx?$)(.+)$");
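A quick way to check both patterns is to run them against the five strings from the question. This is only a sketch: the class name `NoJspExtension` and the helper `accepts` are hypothetical, and `Matcher.matches()` is used so the whole string has to match.

```java
import java.util.regex.Pattern;

public class NoJspExtension {
    // Lookbehind version: at the end of the string, the text just before it must not be .jsp/.jspx.
    static final Pattern LOOKBEHIND = Pattern.compile("^(.+)(?<!\\.jspx?)$");
    // Lookahead version: from the start, the string must not be "something" followed by .jsp/.jspx.
    static final Pattern LOOKAHEAD = Pattern.compile("^(?!.+?\\.jspx?$)(.+)$");

    // True if the entire input matches the given pattern.
    static boolean accepts(Pattern p, String input) {
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "templateName",
            "some/path",
            "basic/filename/no/extension",
            "some/path/to/file.jsp",
            "alternative/path/to/file.jspx"
        };
        for (String s : inputs) {
            System.out.println(s + " -> lookbehind=" + accepts(LOOKBEHIND, s)
                    + ", lookahead=" + accepts(LOOKAHEAD, s));
        }
    }
}
```

Both patterns should report `true` for the first three strings and `false` for the last two.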
anubhava
  • Thanks for the answer. Unfortunately, I do not understand how either pattern works. Are you able to provide a little explanation as to how those patterns match? I've never been able to properly grasp the constructs of the lookahead or lookbehind expressions. – Eric B. Oct 28 '13 at 19:35
  • @EricB.: For the best explanation of lookarounds, please visit this simple doc: http://www.regular-expressions.info/lookaround.html – anubhava Oct 28 '13 at 19:37
  • Once you're satisfied with my answer, please consider marking it as "accepted", so users facing a similar problem in the future will be able to see it easily. – anubhava Oct 28 '13 at 20:07
Here's another negative lookbehind:

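Pattern p = Pattern.compile(".*(?<!\\.jspx?)$");

(?<!\.jspx?) is a negative lookbehind assertion: it requires that the text immediately before the end of the string is not .jsp or .jspx. The dot is escaped so that it matches a literal "." rather than any character.

In other words, you are looking behind from the end of the string ($).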
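One detail worth checking in a lookbehind like this is whether the dot is escaped. Unescaped, `.` matches any character, so a name that merely ends in the letters `jsp` gets rejected as well. A small sketch of the difference, using a hypothetical class name and a made-up input `myjsp`:

```java
import java.util.regex.Pattern;

public class EscapedDotDemo {
    // Unescaped dot: "any character followed by jsp or jspx" before the end of the string.
    static final Pattern UNESCAPED = Pattern.compile(".*(?<!.jspx?)$");
    // Escaped dot: a literal ".jsp" or ".jspx" before the end of the string.
    static final Pattern ESCAPED = Pattern.compile(".*(?<!\\.jspx?)$");

    // True if the entire input matches the given pattern.
    static boolean accepts(Pattern p, String input) {
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        // "myjsp" has no extension, yet the unescaped version rejects it
        // because "yjsp" satisfies ".jsp" inside the lookbehind.
        System.out.println(accepts(UNESCAPED, "myjsp")); // prints false
        System.out.println(accepts(ESCAPED, "myjsp"));   // prints true
    }
}
```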

Reference:

http://www.regular-expressions.info/lookaround.html

Regex not ending with

Indu Devanath