Your best bet is to use innerText
or textContent
to get at the text without the tags and then just use the regex /\d/g
to get the numbers.
function digitsInText(rootDomNode) {
var text = rootDomNode.textContent || rootDomNode.innerText;
return text.match(/\d/g) || [];
}
For example,
alert(digitsInText(document.body));
If your HTML is not in the DOM, you can try to strip the tags yourself : JavaScript: How to strip HTML tags from string?
Since you need to do a replacement, I would still try to walk the DOM and operate on text nodes individually, but if that is out of the question, try
var HTML_TOKEN = /(?:[^<\d]|<(?!\/?[a-z]|!--))+|<!--[\s\S]*?-->|<\/?[a-z](?:[^">']|"[^"]*"|'[^']*')*>|(\d+)/gi;
function incrementAllNumbersInHtmlTextNodes(html) {
return html.replace(HTML_TOKEN, function (all, digits) {
if ("string" === typeof digits) {
return "" + (+digits + 1);
}
return all;
});
}
then
incrementAllNumbersInHtmlTextNodes(
'<b>123</b>Hello, World!<p>I <3 Ponies</p><div id=123>245</div>')
produces
'<b>124</b>Hello, World!<p>I <4 Ponies</p><div id=123>246</div>'
It will get confused around where special elements like <script>
end and won't recognize digits that are entity encoded, but should work otherwise.