I am terrible at regex, so I will phrase my question a bit unconventionally to better describe my problem.

var TheBadPattern = /(\d{2}:\d{2}:\d{2},\d{3})/;
var TheGoodPattern = /([a-zA-Z0-9\-,.;:'"])(?:\r\n?|\n)([a-zA-Z0-9\-])/gi;

// My goal is to then do this
inputString = inputString.replace(TheGoodPattern, '$1 $2');

Question: I want to match every occurrence of the good pattern and do the subsequent find/replace UNLESS it is preceded by the bad pattern. Any ideas on how? I was able to accomplish this in other languages that support lookbehind, but I am at a loss without it. (PS: from what I understand, JS does not support lookahead/lookbehind or, if you prefer, '?<!', '?<='.)

Jason

2 Answers

JavaScript does support lookaheads. And since you only need a lookbehind (and not a lookahead, too), there is a workaround (which doesn't really aid the readability of your code, but it works!). So what you can do is reverse both the string and the pattern.

inputString = inputString.split("").reverse().join("");
// The reversed good pattern, followed by a negative lookahead for the reversed bad pattern.
var pattern = /([a-z0-9\-])(?:\n\r?|\r)([a-z0-9\-,.;:'"])(?!\d{3},\d{2}:\d{2}:\d{2})/gi;
inputString = inputString.replace(pattern, '$1 $2');
inputString = inputString.split("").reverse().join("");

Note that you had redundantly used the upper case letters (they are already taken care of by the i modifier).
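As a rough, self-contained sketch of the reversal steps above, the whole thing can be wrapped in a function. The names `reverseString` and `joinLinesUnlessPreceded` and the sample strings are made up for illustration. Note that, read literally, the lookahead only rejects a match when the timestamp sits directly before the first captured character, which is what the second sample assumes.

```javascript
// Hypothetical helper; reverses by UTF-16 code unit, so it assumes the
// input contains no characters outside the Basic Multilingual Plane.
function reverseString(s) {
    return s.split("").reverse().join("");
}

// Hypothetical wrapper around the reverse / replace / reverse steps.
function joinLinesUnlessPreceded(input) {
    // Reversed good pattern plus a negative lookahead for the reversed bad pattern.
    var pattern = /([a-z0-9\-])(?:\n\r?|\r)([a-z0-9\-,.;:'"])(?!\d{3},\d{2}:\d{2}:\d{2})/gi;
    var reversed = reverseString(input);
    return reverseString(reversed.replace(pattern, '$1 $2'));
}

// Made-up sample input.
var joined = joinLinesUnlessPreceded("hello\nworld");        // line break becomes a space
var skipped = joinLinesUnlessPreceded("12:34:56,789a\nb");   // left untouched
```

The groups swap roles in the reversed string, but since the replacement keeps them in the same order around a single space, `'$1 $2'` still works unchanged.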

I would actually test it for you if you supplied some example input.
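For comparison, and only as a sketch: engines that implement ECMAScript 2018 accept lookbehind assertions directly, in which case the reversal is not needed. The following assumes such an engine; the function name `joinLinesWithLookbehind` and the sample strings are made up.

```javascript
// Assumes an engine with ES2018 lookbehind support; older engines throw a
// SyntaxError when parsing this regex literal.
var lookbehindPattern = /(?<!\d{2}:\d{2}:\d{2},\d{3})([a-z0-9\-,.;:'"])(?:\r\n?|\n)([a-z0-9\-])/gi;

// Hypothetical wrapper: joins two lines unless the bad pattern comes
// directly before the first captured character.
function joinLinesWithLookbehind(input) {
    return input.replace(lookbehindPattern, '$1 $2');
}

// Made-up sample input.
var joinedDirect = joinLinesWithLookbehind("hello\nworld");
var skippedDirect = joinLinesWithLookbehind("12:34:56,789a\nb");
```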

Martin Ender

I have also used the reverse methodology recommended by m.buettner, and it can get pretty tricky depending on your patterns. I find that workaround works well if you are matching simple patterns or strings.

With that said I thought I would go a bit outside the box just for fun. This solution is not without its own foibles, but it also works and it should be easy to adapt to existing code with medium to complicated regular expressions.

http://jsfiddle.net/52QBx/

js:

function negativeLookBehind(lookBehindRegExp, matchRegExp, modifiers)
{
    var content = $('#content');
    var text = content.html();
    // Matches the lookbehind pattern followed directly by the wanted pattern.
    var badGoodRegex = regexMerge(lookBehindRegExp, matchRegExp, modifiers);
    // match() returns null when nothing matches, so fall back to an empty array.
    var badGoodMatches = text.match(badGoodRegex) || [];
    var placeHolderMap = {};
    for(var i = 0;i<badGoodMatches.length;i++)
    {
        var match = badGoodMatches[i];
        var placeHolder = "${item"+i+"}";
        placeHolderMap[placeHolder] = match;
        // Temporarily hide each unwanted match behind a unique placeholder.
        content.html(content.html().replace(match, placeHolder));
    }

    // Only wanted matches that were not preceded by the lookbehind pattern remain visible.
    var goodMatches = content.html().match(matchRegExp) || [];

    // Swap the hidden text back in.
    for(var prop in placeHolderMap)
    {
        content.html(content.html().replace(prop, placeHolderMap[prop]));
    }
    return goodMatches;
}
function regexMerge(regex1, regex2, modifiers)
{
    /*this whole concept could be its own beast, so I just asked to have modifiers for the combined expression passed in rather than determined from the two regexes passed in.*/
    return new RegExp(regex1.source + regex2.source, modifiers);
}
var result = negativeLookBehind(/(bad )/gi, /(good\d)/gi, "gi");
alert(result);

html:

<div id="content">Some random text trying to find good1 text but only when that good2 text is not preceded by bad text so bad good3 should not be found bad good4 is a bad oxymoron anyway.</div>

The main idea is to find every occurrence of the combined pattern (the lookbehind followed by the real match) and temporarily remove those from the text being searched. I used a map because the hidden values can vary, so each replacement has to be reversible. Then we can run just the regex for the items we really want to find, without the ones that would have matched the lookbehind getting in the way. After the results are determined, we swap the original items back in and return the results. It is a quirky, yet functional, workaround.
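Since the original question was about a find/replace, here is a hedged, DOM-free sketch of the same hide, process, restore idea applied to a plain string. The function name `replaceUnlessPreceded`, the NUL-delimited placeholder format, and the sample text are all assumptions made for illustration. It assumes the input contains no NUL characters, that the wanted pattern cannot match the placeholder digits, and that `flags` includes "g".

```javascript
// Hypothetical helper: replaces matches of goodRegExp unless they are
// directly preceded by a match of badRegExp.
function replaceUnlessPreceded(text, badRegExp, goodRegExp, flags, replacement) {
    // Combined pattern: the unwanted prefix followed directly by the wanted match.
    var combined = new RegExp(badRegExp.source + goodRegExp.source, flags);
    var hidden = [];

    // Hide every "prefix + match" occurrence behind an indexed placeholder.
    var masked = text.replace(combined, function (whole) {
        hidden.push(whole);
        return "\u0000" + (hidden.length - 1) + "\u0000";
    });

    // Only matches that were not preceded by the prefix are still visible.
    var replaced = masked.replace(goodRegExp, replacement);

    // Swap the hidden text back in, looking each item up by its index.
    return replaced.replace(/\u0000(\d+)\u0000/g, function (whole, index) {
        return hidden[Number(index)];
    });
}

// Made-up sample input: wrap each visible match in brackets.
var sample = "find good1 but not bad good3 here";
var out = replaceUnlessPreceded(sample, /bad /, /good\d/g, "g", "[$&]");
```

The restore step uses a function as the replacement so that a `$` inside the hidden text is inserted literally rather than interpreted as a replacement pattern.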

purgatory101