0

I'm trying to get a list of all inline event attributes from an HTML <body> string. How can I do this?

Example:

 <a onclick='foo()'></a>

I'd want to extract onclick='foo()'.

Is this possible with a regex, or are there other alternatives?

Felix Kling
Trevor
  • yes **[take a look here](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454)** – qwertymk Oct 16 '11 at 07:53
  • Yes, I've read that; I'm wondering if there are any alternatives. Inline event attributes are fairly consistent in form, though, so a regex will probably work. – Trevor Oct 16 '11 at 08:12
  • You can iterate over each element and access its `onXXXXX` attribute... – Felix Kling Oct 16 '11 at 08:21
  • I couldn't, because I'm doing this in Node.js, and the JSDOM module ironically strips inline events. – Trevor Oct 16 '11 at 21:58

2 Answers

0

Here's a regex that does it. The event attribute ends up in capture group 1:

<\w+[^>]+(on[a-z]+=["'][^"']+["'])[^>]*>
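Because `[^>]+` is greedy, the pattern above yields at most one handler per tag, and `[^"']+` stops early when a value contains the other kind of quote, as in `onclick="foo('x')"`. As a rough sketch only, assuming attribute values are always quoted and contain no `<`, a global scan could collect every handler. The helper name `extractInlineHandlers` is hypothetical, and any attribute whose name merely starts with "on" would be reported as well:

```javascript
// Sketch: collect every quoted on* attribute from an HTML string.
// extractInlineHandlers is a hypothetical helper name, not part of the answer above.
function extractInlineHandlers(html) {
  // \s before "on" skips names such as data-onclick.
  // The value is either "..." or '...', so the other quote type may appear inside it.
  // The lookahead requires a closing ">" before any new "<", i.e. we are still inside a tag.
  const pattern = /\s(on[a-z]+)\s*=\s*(?:"([^"]*)"|'([^']*)')(?=[^<>]*>)/gi;
  const found = [];
  for (const m of html.matchAll(pattern)) {
    found.push({
      name: m[1].toLowerCase(),
      value: m[2] !== undefined ? m[2] : m[3]
    });
  }
  return found;
}
```

Calling `extractInlineHandlers("<a onclick='foo()'></a>")` returns a one-element array holding the name `onclick` and the value `foo()`. This runs in Node.js without a DOM, but it remains a heuristic rather than a real HTML parser.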
Tetaxa
  • That's weird, works fine for me. Have you tried it with case insensitive search? – Tetaxa Oct 17 '11 at 07:25
  • If the problem is the group, btw, you could use a little more naive regex like `(on[a-z]+=["'][^"']+["'])(?=[^>]*>)` – Tetaxa Oct 17 '11 at 07:59
0

You should let the browser do the parsing, for example like this:

var doc = document.implementation.createHTMLDocument('');
doc.documentElement.innerHTML = '<body onload="alert(1)"></body>'; // your string here

Then get the on* attributes using DOM methods:

var attributes = Array.prototype.slice.call(doc.body.attributes);
for (var i = 0; i < attributes.length; i++) {
    if (/^on/.test(attributes[i].name)) {
        console.log(attributes[i].name, attributes[i].value);
    }
}
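The loop above only inspects the attributes of `doc.body`. To cover every element in the parsed string, one option is to walk all elements with `querySelectorAll('*')`. This is a minimal sketch; `collectInlineHandlers` is a hypothetical helper name, and `root` is assumed to be a Document or Element such as the `doc` created earlier:

```javascript
// Sketch: gather on* attributes from every element under root.
// collectInlineHandlers is a hypothetical helper name, not part of the answer above.
function collectInlineHandlers(root) {
  var found = [];
  var elements = root.querySelectorAll('*');
  for (var i = 0; i < elements.length; i++) {
    var attrs = elements[i].attributes;
    for (var j = 0; j < attrs.length; j++) {
      // Any attribute whose name starts with "on" is treated as an event handler.
      if (/^on/i.test(attrs[j].name)) {
        found.push({
          tag: elements[i].tagName.toLowerCase(),
          name: attrs[j].name,
          value: attrs[j].value
        });
      }
    }
  }
  return found;
}
```

In a browser, `collectInlineHandlers(doc)` would then return one entry per inline handler, including the `onload` on `<body>`.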
user123444555621
  • I agree you should use the DOM or some library if you're actually in a browser. On the other hand, if you're searching through the code for inline event handlers to replace them with something else, a regex is probably easier. – Tetaxa Oct 16 '11 at 20:49
  • The reason why I need this is that I'm doing this in node.js and ironically the jsdom module is ignoring inline events. – Trevor Oct 16 '11 at 21:06