-2

I'm trying, with no success, to get the number inside the last occurrence of a pattern in some HTML code. The pattern is data\[\d{1,3}\]. How can I get the number 03 in the example below?

<body>
<h2>JavaScript Regular Expressions</h2>
<p>TEST</p>
<p>data[01]</p>
<button onclick="myFunction()">Try it</button>
<p>data[02]</p>
<p>TEST</p>
<p id="demo" test=data[03]></p>
</body>

I tried many combinations with $, but I could not make it work.

Maheer Ali
Mark F

2 Answers

-1

You can use a negative lookahead assertion.

/data\[(\d{1,3})\](?![\s\S]*data\[\d{1,3}\])/
  1. data\[(\d{1,3})\] matches your pattern; the parentheses (...) capture the digits.
  2. (?![\s\S]*data\[\d{1,3}\]) is a negative lookahead assertion. Here it lets the match succeed only for a data[N] token that is not followed by another token of the same format anywhere later in the string. [\s\S] is used to match any character, since . does not match newline characters by default.

const data = `data<body>
<h2>JavaScript Regular Expressions</h2>
<p>TEST</p>
<p>data[01]</p>
<button onclick="myFunction()">Try it</button>
<p>data[02]</p>
<p>TEST</p>
<p id="demo" test=data[03]></p>
</body>`

console.log(data.match(/data\[(\d{1,3})\](?![\s\S]*data\[\d{1,3}\])/)[1])
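
Note that .match() returns null when nothing matches, so indexing the result with [1] directly would throw on input that contains no data[N] token. A minimal sketch of a guarded version, assuming you want null back in that case (the helper name lastDataNumber and the shortened sample string are only for illustration):

```javascript
// Hypothetical helper: returns the digits of the last data[N] token, or null.
function lastDataNumber(text) {
  // The negative lookahead rejects any data[N] that is followed,
  // anywhere later in the string, by another data[N].
  const m = text.match(/data\[(\d{1,3})\](?![\s\S]*data\[\d{1,3}\])/);
  return m ? m[1] : null;
}

const sample = `<p>data[01]</p>
<p>data[02]</p>
<p id="demo" test=data[03]></p>`;

console.log(lastDataNumber(sample));        // prints 03
console.log(lastDataNumber("<p>TEST</p>")); // prints null
```

The captured value is a string, so the leading zero is kept; wrap it in Number(...) if you need a numeric value.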

Detailed Regex explanation and demo here.


UPDATE : If it is always the value of the p tag's test attribute, you can use DOMParser or a dummy DOM element instead, since a regular expression is not a good tool for parsing HTML.

REF : Using regular expressions to parse HTML: why not?

  1. Using DOMParser :

const data = `data<body>
    <h2>JavaScript Regular Expressions</h2>
    <p>TEST</p>
    <p>data[01]</p>
    <button onclick="myFunction()">Try it</button>
    <p>data[02]</p>
    <p>TEST</p>
    <p id="demo" test=data[03]></p>
    </body>`

var parser = new DOMParser();
var doc = parser.parseFromString(data, "text/html");

console.log(doc.querySelector('p#demo').getAttribute('test').match(/\d+/)[0])
  2. Creating a dummy element :

const data = `data<body>
    <h2>JavaScript Regular Expressions</h2>
    <p>TEST</p>
    <p>data[01]</p>
    <button onclick="myFunction()">Try it</button>
    <p>data[02]</p>
    <p>TEST</p>
    <p id="demo" test=data[03]></p>
    </body>`

// create a dummy element and set content
var div = document.createElement('div');
div.innerHTML = data;

console.log(div.querySelector('p#demo').getAttribute('test').match(/\d+/)[0])
Pranav C Balan
  • This is almost what I'm looking for, and I can make it work with this. It returns the whole last pattern "data[03]", but I can use replace to strip all the non-digits. Is it possible to get the number with one regex (just for learning purposes)? Many thanks. – Mark F Mar 11 '19 at 18:12
  • @MarkF : You can get the captured value with `console.log(data.match(/data\[(\d{1,3})\](?![\s\S]*data\[\d{1,3}\])/)[1])` – Pranav C Balan Mar 11 '19 at 18:14
-1

With the g flag, JS .match() returns an array of every match (or null when there is none), so just get the last array item from that: arr[arr.length - 1]

var htmlString = document.body.innerHTML;
var match = htmlString.match(/data\[\d{1,3}\]/g);
console.log("ALL MATCHES", match);
if (match) {
  console.log("LAST MATCH", match[match.length - 1]);
}
<body>
  <h2>JavaScript Regular Expressions</h2>
  <p>TEST</p>
  <p>data[01]</p>
  <button onclick="myFunction()">Try it</button>
  <p>data[02]</p>
  <p>TEST</p>
  <p id="demo" test=data[03]></p>
</body>
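
The snippet above yields the whole token, such as data[03], because .match() with the g flag drops capture groups. As a sketch of one way to get only the digits with the same "take the last item" idea, String.prototype.matchAll (ES2020) keeps the groups for every occurrence. The function name and the shortened sample markup are hypothetical, not part of the original answer:

```javascript
// Hypothetical helper: collect every data[N] occurrence, keep the last one's digits.
function lastDataNumberViaMatchAll(text) {
  // matchAll requires the g flag and yields one match object per occurrence,
  // each carrying its capture groups.
  const matches = [...text.matchAll(/data\[(\d{1,3})\]/g)];
  return matches.length > 0 ? matches[matches.length - 1][1] : null;
}

const markup = `<p>data[01]</p>
<p>data[02]</p>
<p id="demo" test=data[03]></p>`;

console.log(lastDataNumberViaMatchAll(markup)); // prints 03
```

In a browser you could pass document.body.innerHTML instead of the sample string.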
Chris Barr