160

Here is a regex that works fine in most regex implementations:

(?<!filename)\.js$

This matches the .js at the end of a string, unless the string ends with filename.js

JavaScript doesn't have regex lookbehind. Is anyone able to put together an alternative regex which achieves the same result and works in JavaScript?

Here are some thoughts, but they need helper functions. I was hoping to achieve it with just a regex: http://blog.stevenlevithan.com/archives/mimic-lookbehind-javascript

James
daniel
  • 4
    if you just need to check a specific filename or list of filenames, why not just use two checks? check if it ends in .js and then if it does, check that it doesn't match filename.js or vice versa. – si28719e Sep 11 '11 at 04:10
  • 4
    Update: The latest public Chrome version (v62) includes (presumably experimental) lookbehinds out of the box :D Note however that lookbehinds are still in proposal stage 3: https://github.com/tc39/proposal-regexp-lookbehind . So, it may take a while until JavaScript everywhere supports it. Better be careful about using in production! – Eirik Birkeland Nov 07 '17 at 14:08
  • 3
    Just use **`(?<=thingy)thingy`** for _positive lookbehind_ and **`(?<!thingy)thingy`** for _negative lookbehind_. **Now it supports them.** – Константин Ван Feb 08 '18 at 16:51
  • 8
    @K._ As of Feb 2018 **that's not true** yet!! And it will need some time because browsers and engines must implement the specification (current in draft). – Andre Figueiredo Feb 22 '18 at 14:23
  • 2
    @AndreFigueiredo Yes, you're right. The **proposal** is currently on _Stage 4_. Maybe I was thinking of only _Chrome_, I guess. – Константин Ван Mar 07 '18 at 12:05
  • 3
    # Update: ES2018 includes [lookbehind assertions](https://github.com/tc39/proposal-regexp-lookbehind) [Plus](https://mathiasbynens.be/notes/es-regexp-proposals): - dotAll mode (the s flag) - Lookbehind assertions - Named capture groups - Unicode property escapes – Ashley Coolman Jan 26 '18 at 11:29
  • nodejs http://kangax.github.io/compat-table/es2016plus/ supports it – Muhammad Umer May 17 '19 at 02:55
  • 1
    Firefox still hasn't implemented the 2018 specification which prescribes support for look-behinds. Here's the [bug](https://bugzilla.mozilla.org/show_bug.cgi?id=1225665). – Lonnie Best Dec 08 '19 at 00:27
  • 1
    @LonnieBest meanwhile fixed for FF ([5 days ago](https://bugzilla.mozilla.org/show_bug.cgi?id=1225665#c35)) :-) – Wolf May 19 '20 at 13:58
  • @Wolf : That's fantastic news. When will it land? Version 77? – Lonnie Best May 19 '20 at 14:42
  • looks [like 78](https://bugzilla.mozilla.org/show_bug.cgi?id=1634135) (see ` Milestone: mozilla78`) – Wolf May 19 '20 at 16:15
  • Still not supported for safari @ 2022 – Kavinda Jayakody Sep 24 '22 at 04:28
  • As of Jan 12, The latest Safari Technology Preview release 161 (https://bugs.webkit.org/show_bug.cgi?id=174931#c56) supports lookbehind. – Hrvoje Zlatar Jan 24 '23 at 17:13

8 Answers

168

EDIT: From ECMAScript 2018 onwards, lookbehind assertions (even unbounded) are supported natively.
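Since some engines gained lookbehind later than others, one option is to detect support at runtime. The following is only a sketch: the names `supportsLookbehind` and `jsButNotFilenameJs` are made up for illustration. It builds the lookbehind pattern through the `RegExp` constructor, because a regex literal containing a lookbehind would be a syntax error at parse time in an engine that lacks support.

```javascript
// Hypothetical helper: returns true if this engine can compile a lookbehind.
function supportsLookbehind() {
  try {
    new RegExp("(?<!a)b"); // throws SyntaxError where lookbehind is unsupported
    return true;
  } catch (err) {
    return false;
  }
}

// Use the native lookbehind when available, otherwise a lookahead-only pattern.
const jsButNotFilenameJs = supportsLookbehind()
  ? new RegExp("(?<!filename)\\.js$")
  : /^(?!.*filename\.js$).*\.js$/;

console.log(jsButNotFilenameJs.test("main.js"));
console.log(jsButNotFilenameJs.test("filename.js"));
```

Both branches give the same answer for `test()`, but they differ in what they match: the lookbehind version matches only `.js`, while the fallback matches the whole string.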

In previous versions, you can do this:

^(?:(?!filename\.js$).)*\.js$

This does explicitly what the lookbehind expression is doing implicitly: at each character of the string, it checks that the lookbehind expression plus the regex after it will not match from there, and only then allows that character to match.

^                 # Start of string
(?:               # Try to match the following:
 (?!              # First assert that we can't match the following:
  filename\.js    # filename.js 
  $               # and end-of-string
 )                # End of negative lookahead
 .                # Match any character
)*                # Repeat as needed
\.js              # Match .js
$                 # End of string

Another edit:

It pains me to say (especially since this answer has been upvoted so much) that there is a far easier way to accomplish this goal. There is no need to check the lookahead at every character:

^(?!.*filename\.js$).*\.js$

works just as well:

^                 # Start of string
(?!               # Assert that we can't match the following:
 .*               # any string, 
  filename\.js    # followed by filename.js
  $               # and end-of-string
)                 # End of negative lookahead
.*                # Match any string
\.js              # Match .js
$                 # End of string
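
To see that the two patterns above behave the same, here is a small sketch that runs both against a few sample names. The variable names and the sample list are chosen for illustration only.

```javascript
// The two lookahead-only patterns from this answer.
const perCharacter = /^(?:(?!filename\.js$).)*\.js$/;
const singleLookahead = /^(?!.*filename\.js$).*\.js$/;

// Sample inputs, including the edge cases discussed in the comments.
const samples = ["filename.js", "2filename.js", "filename2.js", "blah.js", "blah.json"];

const results = samples.map((name) => ({
  name,
  perCharacter: perCharacter.test(name),
  singleLookahead: singleLookahead.test(name),
}));

console.table(results);
// Both patterns accept filename2.js and blah.js, and reject
// filename.js, 2filename.js and blah.json.
```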
Tim Pietzcker
  • Works on lots of cases except where there are preceeding characters, for example: filename.js (works-nomatch) filename2.js (works-match) blah.js (works - match) 2filename.js (doesn't work - nomatch) --- having said that, the lookbehind has the same limitation which I didn't realise until now... – daniel Sep 11 '11 at 07:40
  • 10
    @daniel: Well, your regex (with lookbehind) also doesn't match `2filename.js`. My regex matches in exactly the same cases as your example regex. – Tim Pietzcker Sep 11 '11 at 17:51
  • Forgive my naivety but is there a use for the non capturing group here? I've always known that to be only useful when trying to glean back reference for replacement in a string. As far as I know, this too will work ^(?!filename\.js$).*\.js$ – I Want Answers Mar 28 '17 at 06:46
  • 1
    Not quite, that regex checks for "filename.js" only at the start of the string. But `^(?!.*filename\.js$).*\.js$` would work. Trying to think of situations where the ncgroup might still be necessary... – Tim Pietzcker Mar 28 '17 at 06:53
  • This approach can be summarized as: instead of looking behind X, look ahead at every character that comes before X? – Sarsaparilla Jun 29 '18 at 07:27
  • @HaiPhan: Yes, but re-reading my answer I just noticed that there is a vastly less complicated solution that I had completely overlooked. Will update my answer X-) – Tim Pietzcker Jun 29 '18 at 08:26
  • [Firefox](https://bugzilla.mozilla.org/show_bug.cgi?id=1225665) still doesn't support look-behinds as prescribed by the 2018 specification. I understand they're actively working on it though. – Lonnie Best Jan 17 '20 at 07:08
  • BTW, I really like how you break the RegEx apart and describe it so concisely. – Lonnie Best Jan 17 '20 at 07:10
  • Really appreciate the breakdown of what each component does. Thanks! – Kamal Jun 07 '21 at 17:46
68

^(?!filename).+\.js works for me

tested against:

  • test.js match
  • blabla.js match
  • filename.js no match

A proper explanation for this regex can be found at Regular expression to match string not containing a word?

Lookahead has been available since JavaScript 1.5 and is supported by all major browsers.

Updated to match filename2.js and 2filename.js but not filename.js

(^(?!filename\.js$).).+\.js

Ben
  • 8
    That question you linked to talks about a slightly different problem: matching a string that doesn't contain the target word *anywhere*. This one is much simpler: matching a string that doesn't *start with* the target word. – Alan Moore Sep 11 '11 at 05:50
  • Thats really nice, it only misses out on cases like: filename2.js or filenameddk.js or similar. This is a no match, but should be a match. – daniel Sep 11 '11 at 07:18
  • 10
    @daniel You asked for a look-behind, not a look-ahead, why did you accepted this answer? – hek2mgl May 28 '15 at 08:56
  • I'm grave-digging here, but the updated one has a broken/useless capture group and matches "filename.js" in the string "filename.json". It should be `^(?!filename\.js$).+\.js$` – Domino Jul 16 '15 at 13:30
  • 1
    the given one does not match on `a.js` – inetphantom Mar 17 '16 at 12:35
  • 1
    The original regex with lookbehind doesn't match `2filename.js`, but the regex given here does. A more appropriate one would be `^(?!.*filename\.js$).*\.js$`. This means, match any `*.js` *except* `*filename.js`. – weibeld May 16 '17 at 05:16
26

Let's suppose you want to find all int not preceded by unsigned:

With support for negative look-behind:

(?<!unsigned )int

Without support for negative look-behind:

((?!unsigned ).{9}|^.{0,8})int

The basic idea is to grab the n preceding characters and exclude the unwanted match with a negative lookahead, while also matching the cases where there are fewer than n preceding characters (where n is the length of the lookbehind).

So the regex in question:

(?<!filename)\.js$

would translate to:

((?!filename).{8}|^.{0,7})\.js$

You might need to play with capturing groups to find the exact spot in the string that interests you, or if you want to replace a specific part with something else.
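
As a rough sketch of how the capturing group can be used (the function name `replaceBareInt` and the sample input are made up for illustration), the consumed prefix can be captured and put back during a replacement:

```javascript
// Lookbehind-free version of (?<!unsigned )int; group 1 holds the consumed prefix.
const withoutLookbehind = /((?!unsigned ).{9}|^.{0,8})int/g;

// Hypothetical helper: replace every "int" that is not preceded by "unsigned ",
// re-inserting the prefix that the pattern had to consume.
function replaceBareInt(source, replacement) {
  return source.replace(withoutLookbehind, (whole, prefix) => prefix + replacement);
}

const out = replaceBareInt("int a; unsigned int b; long int c", "short");
console.log(out); // "short a; unsigned int b; long short c"
```

One caveat of this approach: because each match consumes its prefix, two occurrences that are closer together than the prefix length (for example `int int`) may not both be found in a single global pass.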

Kamil Szot
  • I just converted this: `(?<!barna)(?<!ene)(?<!en)(?<!erne) (?:sin|vår)e?(?:$| (?!egen|egne))` to `(?!barna).(?!erne).(?!ene).(?!en).. (?:sin|vår)e?(?:$| (?!egen|egne))` which does the trick for my needs. Just providing this as another "real-world" scenario. See [link](https://regex101.com/r/wL8nS7/5) – Eirik Birkeland Mar 16 '16 at 13:23
  • I think you meant: `((?!unsigned ).{9}|^.{0,8})int` – pansay Feb 04 '17 at 10:07
  • @pansay Yes. Thank you. I just corrected my response. – Kamil Szot Feb 13 '17 at 20:32
  • 2
    Thanks for the more generalized answer which works even where there is a need to match deep within the text (where initial ^ would be impractical)! – Milos Mrdovic Aug 17 '17 at 09:28
3

If you can look ahead but not back, you could reverse the string first and then do a lookahead. Some more work will need to be done, of course.
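
A minimal sketch of the reversal idea, assuming the goal is still the `(?<!filename)\.js$` check from the question; the helper names are hypothetical. Both the string and the pattern are reversed, so the lookbehind becomes a lookahead:

```javascript
// Hypothetical helper: reverse a string by code points.
function reverse(s) {
  return Array.from(s).reverse().join("");
}

// (?<!filename)\.js$ read backwards: the string must start with "sj."
// and must not continue with "emanelif" ("filename" reversed).
const reversedPattern = /^sj\.(?!emanelif)/;

function endsWithJsButNotFilenameJs(name) {
  return reversedPattern.test(reverse(name));
}

console.log(endsWithJsButNotFilenameJs("test.js"));      // true
console.log(endsWithJsButNotFilenameJs("filename.js"));  // false
console.log(endsWithJsButNotFilenameJs("filename2.js")); // true
```

The extra work mentioned above shows up as soon as match positions or replacements are needed, since offsets found in the reversed string have to be mapped back to the original.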

Albert Friend
3

This is an equivalent solution to Tim Pietzcker's answer (see also comments of same answer):

^(?!.*filename\.js$).*\.js$

It means, match *.js except *filename.js.

To get to this solution, you can check which patterns the negative lookbehind excludes, and then exclude exactly these patterns with a negative lookahead.

weibeld
2

Thanks to Tim Pietzcker and the others for their answers; I was inspired by their work. However, I think there is no ideal solution for mimicking lookbehind. For example, Pietzcker's solution relies on $ as the end of the line; without $ it gives an unexpected result:

let str="filename.js  main.js  2022.07.01"
console.log( /^(?!.*filename\.js).*\.js/g.exec(str) ) //null

Another limitation is that it is hard to translate multiple lookbehinds, such as:

let reg=/(?<!exP0)exp0 \d (?<!exP1)exp1 \d (?<!exP2)exp2/

How can we build a more generic and flexible substitute for lookbehind assertions? Below is my solution.

The core pattern of the alternative is:

(?:(?!ExpB)....|^.{0,3})ExpA <= (?<!ExpB)ExpA

Detailed explanation:

(?:         # Start a non-capturing group:
 (?!ExpB)   # Assert a position where ExpB can't match
 ....       # Any string of the same length as ExpB
 |^.{0,3}   # Or, from the start, any string shorter than ExpB
)           # End of non-capturing group
ExpA        # Match ExpA

For instance:

var str="file.js  main.js  2022.07.01"
var reg=/(?:(?!file)....|^.{0,3})\.js/g // <= (?<!file)\.js
console.log( reg.exec(str)[0] )  // main.js

Here is an implementation that wraps the above pattern in syntactic sugar:

var str="file.js  main.js  2022.07.01"
var reg=newReg("﹤4?!file﹥\\.js","g") //pattern sugar
console.log(reg.exec(str)[0]) // main.js

function newReg(sReg,flags){
  flags=flags||""
  sReg=sReg.replace(/(^|[^\\])\\﹤/g,"$1<_sl_>").replace(/(^|[^\\])\\﹥/g,"$1<_sr_>")
  if (/﹤\?<?([=!])(.+?)﹥/.test(sReg)){
    throw "invalid format of string for lookbehind regExp"
  }
  var reg=/﹤(\d+)\?<?([=!])(.+?)﹥/g
  if (sReg.match(reg)){
    sReg=sReg.replace(reg, function(p0,p1,p2,p3){
      return "(?:(?"+p2+p3+")"+".".repeat(parseInt(p1))+"|^.{0,"+(parseInt(p1)-1)+"})"
    })
  }
  sReg=sReg.replace(/<_sl_>/g,"﹤").replace(/<_sr_>/g,"﹥")
  var rr=new RegExp(sReg,flags)
  return rr
}

Two special characters, ﹤ (\uFE64 or &#65124;) and ﹥ (\uFE65 or &#65125;), are used to enclose the lookbehind expression, and a number N giving the length of the lookbehind expression must follow the ﹤. That is, the syntax of the lookbehind is:

﹤N?!ExpB﹥ExpA <= (?<!ExpB)ExpA
﹤N?=ExpB﹥ExpA <= (?<=ExpB)ExpA

To make the pattern above more ES5-like, you can replace ﹤ or ﹥ with parentheses and remove N by writing more code into the newReg() function.

Jeremy Lee
1

I know this answer doesn't really tackle how to rewrite a regex to simulate lookbehinds, but I managed to handle some very simple situations like this one by replacing the unwanted match in the string beforehand, as in:

  let string = originalString.replace("filename.js", "filename_js")
  string.match(/.*\.js/)
stashdjian
-1

Below is a positive lookbehind JavaScript alternative showing how to capture the last name of people with 'Michael' as their first name.

1) Given this text:

const exampleText = "Michael, how are you? - Cool, how is John Williamns and Michael Jordan? I don't know but Michael Johnson is fine. Michael do you still score points with LeBron James, Michael Green Miller and Michael Wood?";

get an array of last names of people named Michael. The result should be: ["Jordan","Johnson","Green","Wood"]

2) Solution:

function getMichaelLastName2(text) {
  return text
    .match(/(?:Michael )([A-Z][a-z]+)/g)
    .map(person => person.slice(person.indexOf(' ')+1));
}

// or even
    .map(person => person.slice(8)); // since we know the length of "Michael "

3) Check solution

console.log(JSON.stringify(    getMichaelLastName2(exampleText)    ));
// ["Jordan","Johnson","Green","Wood"]

Demo here: http://codepen.io/PiotrBerebecki/pen/GjwRoo

You can also try it out by running the snippet below.

const inputText = "Michael, how are you? - Cool, how is John Williamns and Michael Jordan? I don't know but Michael Johnson is fine. Michael do you still score points with LeBron James, Michael Green Miller and Michael Wood?";



function getMichaelLastName(text) {
  return text
    .match(/(?:Michael )([A-Z][a-z]+)/g)
    .map(person => person.slice(8));
}

console.log(JSON.stringify(    getMichaelLastName(inputText)    ));
Piotr Berebecki