Can anyone suggest an expression to extract only tag names from an HTML string?
Take a look here: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – David Underwood Mar 24 '11 at 17:26
2 Answers
0
<TAG\b[^>]*>(.*?)</TAG>
matches the opening and closing pair of a specific HTML tag. Anything between the tags is captured into the first backreference.
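As a rough illustration of the pattern above, here is a minimal sketch in JavaScript with TAG filled in at runtime. The helper name extractTagContents and the sample string are made up for this example, and the tag name is assumed to be a plain word such as p with no regex metacharacters.

```javascript
// Hypothetical helper: returns the text between every <tagName>...</tagName> pair.
function extractTagContents(html, tagName) {
  // Same shape as <TAG\b[^>]*>(.*?)</TAG>, with TAG supplied by the caller.
  // Assumption: tagName is a plain name such as "p".
  var regex = new RegExp("<" + tagName + "\\b[^>]*>(.*?)</" + tagName + ">", "gi");
  var contents = [];
  var match;
  while ((match = regex.exec(html)) !== null) {
    contents.push(match[1]); // first backreference: whatever sits between the tags
  }
  return contents;
}

var sample = '<div><p class="a">one</p><p>two</p></div>';
console.log(extractTagContents(sample, "p")); // [ 'one', 'two' ]
```

Note that this returns the content between the tags, not the tag names, and that . does not match line breaks, so a pair spread over several lines would be missed.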

cgon
-
Hi, Thank you for your response. I am trying to grab all the tagnames using an expression from a html string and populate them in an array – chaitra Mar 24 '11 at 17:25
-
This may not be the answer you are looking for, but grabbing the rough HTML strings with a simple regex, then reading them character by character and extracting the words between the '<' and '>' characters, can help you construct the array. Then you can edit the array according to your needs. – cgon Mar 24 '11 at 18:02
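The character-by-character idea in the comment above could be sketched roughly as follows. The function name scanTagNames is hypothetical, and the sketch assumes that no attribute value contains a '>' character; it skips closing tags and anything that does not start with a letter, such as comments.

```javascript
// Hypothetical helper: walks the string and collects the word after each '<'.
function scanTagNames(html) {
  var names = [];
  var i = 0;
  while (i < html.length) {
    if (html.charAt(i) === "<") {
      var end = html.indexOf(">", i + 1);
      if (end === -1) break; // unterminated tag: stop scanning
      var inside = html.substring(i + 1, end);
      // The name is everything up to the first whitespace or '/'.
      var name = inside.split(/[\s\/]/)[0];
      // Closing tags yield "" here; comments yield "!--". Keep only real names.
      if (/^[a-z]/i.test(name)) names.push(name);
      i = end + 1;
    } else {
      i++;
    }
  }
  return names;
}

console.log(scanTagNames('<body><p class="x">hello<br/></p></body>'));
// [ 'body', 'p', 'br' ]
```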
0
You can use:
<(?<tagName>[a-z][a-z0-9]*[^<> ]*)
The tagName capture group will contain the names of all opening tags.
If you want to capture closing tags as well, use:
<(?<tagName>/?[a-z][a-z0-9]*[^<> ]*)
Closing tags will then have / as the first character.
Edit -- JS code:
To get the values into an array
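var subject = "<html><head></head><body></body></html>";
var results = [];
var index = 0;
// The g flag makes exec() resume after the previous match on each call.
var regex = /<([a-z][a-z0-9]*[^<> ]*)/g;
var match = regex.exec(subject);
while (match !== null) {
    // match[1] is the first numbered group: the tag name.
    results[index++] = match[1];
    match = regex.exec(subject);
}
alert(results); // shows html,head,body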
PS: Like it's been said elsewhere, do not try to parse HTML using regex. You'll just be asking for pain and misery. But to only strip out tags, this should work.
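For reuse, the loop above could be wrapped in a function along these lines. This is only a sketch: the name extractTagNames and its includeClosing parameter are invented here, and the pattern is a slightly stricter variation that stops at the first character that is not a letter or digit, so that <br/> yields br rather than br/. The trade-off is that hyphenated names would be cut short. The i flag is added on the assumption that upper-case tags such as <DIV> should count too.

```javascript
// Hypothetical helper, not part of the answer: collects tag names from an HTML string.
function extractTagNames(html, includeClosing) {
  // Group 1 holds the name; with includeClosing, closing tags keep their leading "/".
  var regex = includeClosing
    ? /<(\/?[a-z][a-z0-9]*)/gi
    : /<([a-z][a-z0-9]*)/gi;
  var names = [];
  var match;
  while ((match = regex.exec(html)) !== null) {
    names.push(match[1].toLowerCase()); // normalise <DIV> and <div> to the same name
  }
  return names;
}

var html = "<body><p><DIV>hello</DIV><br/></p></body>";
console.log(extractTagNames(html, false)); // [ 'body', 'p', 'div', 'br' ]
console.log(extractTagNames(html, true));
// [ 'body', 'p', 'div', '/div', 'br', '/p', '/body' ]
```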

Nithin Philips
-
PS: If you're using GNU EGrep, named groups are not supported, so just remove `?<tagName>`. – Nithin Philips Mar 24 '11 at 17:27
-
Hi, Thank you for your response. Can you pls advise if I can use <(?<tagName>[a-z][a-z0-9]*[^<> ]*) in JavaScript? – chaitra Mar 24 '11 at 17:29
-
No, no named groups support in Javascript. You have to use numbered groups. – Nithin Philips Mar 24 '11 at 17:37
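Newer JavaScript engines do accept this syntax, since named capture groups were added to the language in ECMAScript 2018. A minimal sketch, assuming such an engine (the variable names are illustrative only):

```javascript
var subject = "<html><head></head><body></body></html>";
// (?<tagName>...) is a named group; its value is exposed on match.groups.tagName.
var regex = /<(?<tagName>[a-z][a-z0-9]*[^<> ]*)/g;
var names = [];
var match;
while ((match = regex.exec(subject)) !== null) {
  names.push(match.groups.tagName);
}
console.log(names); // [ 'html', 'head', 'body' ]
```

On older engines the pattern is a syntax error, so the numbered-group form remains the safer choice there.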
-
Hi, I am new to JavaScript. Can you pls advise on how to code it in JavaScript using numbered groups? All I am trying to do is, for example, given var html = "<body><p><div>hello</div></p></body>", extract all the tag names into an array so that it has the values -> body, p, div. – chaitra Mar 24 '11 at 17:48
-
Hi Nithin, Thank you so much i really appreciate your response. The logic worked for my requirement. – chaitra Mar 24 '11 at 18:28
-
You're welcome. Could you then mark the answer as accepted, it'll help people reading this in the future and I'll earn some points too :) – Nithin Philips Mar 24 '11 at 18:34