
I have a string like so:

<p>Year: ={year}</p>\
<p>Director: ={director}</p>\
<ul>@{actors}<li class="#{class}">={actor}</li>{actors}</ul>\

I want to extract every ={match} that is NOT inside an @{word}...{word} block, so in this case I want to match ={year} and ={director} but not ={actor}. This is what I have so far, but it's not working:

/(?!@.*)=\{([^{}]+)\}/g

Any ideas?

Edit: My current solution is to find every ={match} inside @{}...{} and flag it by replacing the = with something like =&. Then I grab the ones that are outside, and finally I go back and restore the flagged ones to their original state.
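A minimal sketch of that flag-and-restore workaround, for reference. The function name `flagAndGrab` and the `=&` marker handling are illustrative assumptions, and it assumes each block name is unique, the template contains no literal `=&{`, and the template has no newlines (`.` does not match them):

```javascript
// Hypothetical helper illustrating the flag / grab / restore steps.
function flagAndGrab(template) {
  // 1. Flag: inside every @{name}...{name} block, turn "={" into "=&{".
  var flagged = template.replace(/@\{([^{}]+)\}(.*?)\{\1\}/g, function (block) {
    return block.replace(/=\{/g, '=&{');
  });

  // 2. Grab: the remaining "={...}" placeholders are the ones outside the blocks.
  var names = [];
  var placeholderRe = /=\{([^{}]+)\}/g;
  var m;
  while ((m = placeholderRe.exec(flagged)) !== null) {
    names.push(m[1]);
  }

  // 3. Restore: put the flagged placeholders back to their original form.
  var restored = flagged.replace(/=&\{/g, '={');

  return { names: names, restored: restored };
}

var sampleTemplate = '<p>Year: ={year}</p>' +
  '<p>Director: ={director}</p>' +
  '<ul>@{actors}<li class="#{class}">={actor}</li>{actors}</ul>';

var grabbed = flagAndGrab(sampleTemplate);
console.log(grabbed.names); // ["year", "director"]
```

It works for the sample above, but it makes three passes over the string, which is why I'm looking for something more direct.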

elclanrs
  • Can these `@{word}...{word}` constructs be nested? Also, there appear to be some "special characters" like `@`, `=` and `#` that appear before a `{...}` to distinguish it from a closing tag. Are those all, or are there others we need to take into account? – Tim Pietzcker Feb 05 '13 at 07:08
  • in javascript you want to use regex vs. DOM ? – CSᵠ Feb 05 '13 at 07:09
  • @TimPietzcker: Yes they can be nested, that's the whole reason to ask this question. I'm trying to loop those constructs and replace the inside match with something. So actually `={year}` and `={director}` are all inside another construct like that. `@` means "loop", `=` means match and `#` is to access the key directly. This is for a templating system. – elclanrs Feb 05 '13 at 07:10
  • http://stackoverflow.com/a/1732454/297114 – Oscar Mederos Feb 05 '13 at 07:15
  • @OscarMederos: I'm well aware of that answer, this is not the case though. As I said, this is for a templating library that I'm working on, and I've been using regex just fine. My current solution works but is not ideal. – elclanrs Feb 05 '13 at 07:16
  • If you can have `@{tag1}...={word}...@{tag2}...{tag2}...{tag1}`, then you can't do it with a regex alone. JavaScript regexes don't support any kind of recursion. – Tim Pietzcker Feb 05 '13 at 07:25
  • does your @{.*} starts and ends with
      always?
      – Michael Feb 05 '13 at 07:29
    • @TimPietzcker: I'm using backreferences for the loops tho, `/@\{([^{}]+)\}(.+)\{\1\}/`, so `@{tag1}` will only match the closing `{tag1}` and inside that match I could run that same regex again... – elclanrs Feb 05 '13 at 07:29
    • OK, and one document can neither contain `@{tag}...@{tag}...{tag}...{tag}` nor `@{tag}...{tag}...@{tag}...{tag}`? – Tim Pietzcker Feb 05 '13 at 07:33
    • Exactly, because `tag` is actually a key from an object and keys can't be duplicated. Seems like I wasn't very clear sorry bout that. – elclanrs Feb 05 '13 at 07:35
    • @elclanrs: But a key inside a nested object could appear on different levels. I think regex is just not the best tool for this, have you tried a simple stack-based parser? – Bergi Feb 05 '13 at 07:36
    • @Bergi: Arrg, right, didn't consider that case... I think in the end imma have to make a proper parser but for the time being I'd still like to know how I would grab what I need in this case **assuming** there can't be duplicated vars so my backreferences will work. – elclanrs Feb 05 '13 at 07:41

    2 Answers


    You can use regular expressions to break down the string into segments, like so:

    var s = '<p>Year: ={year}</p> \
    <p>Director: ={director}</p> \
    <ul>@{actors}<li class="#{class}">={actor}</li>{actors}</ul>',
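      re = /@\{([^}]+)\}(.*?)\{\1\}/g, // @{name} ... {name}; \1 ties the close to the open
      start = 0,
      segments = [],
      match; // declared here so the loop doesn't create an implicit global
    
    while ((match = re.exec(s)) !== null) {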
      if (match.index > start) {
        segments.push([start, match.index - start, true]);
      }
      segments.push([match.index, re.lastIndex - match.index, false]);
      start = re.lastIndex;
    }
    
    if (start < s.length) {
      segments.push([start, s.length - start, true]);
    }
    
    console.log(segments);
    

    Based on your example, you would get these segments:

    [
        [0, 54, true], 
        [54, 51, false], 
        [105, 5, true]
    ]
    

    Each segment is [start, length, outside]. The boolean is true when the segment lies outside any @{}...{} block and false when it is one. The regex uses a back-reference (\1) to match the closing tag against the opening one.

    Then, based on the segments, you can perform the replacements as usual, touching only the segments flagged true.
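As a rough sketch of that last step, the same loop can feed only the outside segments to a second regex that collects the ={...} names. The helper name `outsidePlaceholders` is made up for illustration, and the sketch assumes unique block names and a template without newlines (`.` does not match them):

```javascript
// Hypothetical helper: collect the names of ={...} placeholders that sit
// outside every top-level @{name}...{name} block.
function outsidePlaceholders(template) {
  var loopRe = /@\{([^}]+)\}(.*?)\{\1\}/g; // same pattern as in the answer
  var placeholderRe = /=\{([^{}]+)\}/g;
  var names = [];
  var start = 0;
  var match;

  // Push every ={name} found in one outside segment.
  function collect(text) {
    var m;
    placeholderRe.lastIndex = 0; // reset the global regex before each segment
    while ((m = placeholderRe.exec(text)) !== null) {
      names.push(m[1]);
    }
  }

  while ((match = loopRe.exec(template)) !== null) {
    collect(template.slice(start, match.index)); // text before this block
    start = loopRe.lastIndex;                    // skip over the block itself
  }
  collect(template.slice(start)); // trailing text after the last block

  return names;
}

var exampleTemplate = '<p>Year: ={year}</p> ' +
  '<p>Director: ={director}</p> ' +
  '<ul>@{actors}<li class="#{class}">={actor}</li>{actors}</ul>';

console.log(outsidePlaceholders(exampleTemplate)); // ["year", "director"]
```

Because the inside of each block is skipped rather than parsed, nested blocks never need to be understood at this level; they can be handled by running the same function on the captured inner text when the loop is expanded.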

    Ja͢ck

    Here you go. The closing {whatever} also needs a negative lookahead:

    /(?!@.*)=\{([^{}]+)\}(?!.*\{[^{}]+\})/g

    UPDATE:

    The previous pattern only works with one ={match} per line.

    Hooking on the opening @ would actually require a lookbehind, which is hard to use here because a lookbehind needs to know exactly how many characters to look back over.

    So let's use a lookahead instead and hook on the closing {tag}: =\{([^{}]+)\}(?![^@]*[^=@]{)

    Edit: demo http://regex101.com/r/zH7uY6
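A quick way to try the lookahead pattern against the sample from the question. The helper name `matchOutside` is only for illustration, and the trailing `{` is escaped here, which does not change what the pattern matches:

```javascript
// Hypothetical helper wrapping the lookahead pattern from this answer.
// The lookahead rejects a ={name} when a bare closing {tag} (a "{" not
// preceded by "=" or "@") appears before the next "@".
function matchOutside(template) {
  var re = /=\{([^{}]+)\}(?![^@]*[^=@]\{)/g;
  var names = [];
  var m;
  while ((m = re.exec(template)) !== null) {
    names.push(m[1]);
  }
  return names;
}

var questionTemplate = '<p>Year: ={year}</p>' +
  '<p>Director: ={director}</p>' +
  '<ul>@{actors}<li class="#{class}">={actor}</li>{actors}</ul>';

console.log(matchOutside(questionTemplate)); // ["year", "director"]
```

One caveat: a `#{key}` that follows a placeholder before the next `@` also looks like a bare `{` to the lookahead, so for example the `={a}` in `={a} #{b}` is not matched. The pattern is best treated as a fit for templates shaped like the sample, not a general solution.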

    CSᵠ