1

Let's say I have the following code:

<div class="set">
    <span data-prefix="[1]">Duck</span>
    <span data-prefix="[2]">Dog</span>
    <span data-prefix="[3]">Cat</span>
</div>

I need a regex that strips all of the HTML, keeping only the value of data-prefix and the text of each span.

So the expected output should be:

[1]Duck[2]Dog[3]Cat

I can't figure out how to do this. How can I?

Josh Crozier
kinx

3 Answers

2

Don't use regular expressions to parse HTML. You can simply use JavaScript in this case.

Iterate over the elements with data-prefix attributes and access the attribute value with dataset.prefix. Then concatenate that with the textContent property value:

var elements = document.querySelectorAll('.set > [data-prefix]'),
    result = '';

for (var i = 0; i < elements.length; i++) {
  result += elements[i].dataset.prefix + elements[i].textContent;
}

console.log(result); // [1]Duck[2]Dog[3]Cat

If you absolutely have to use regular expressions, I suppose you could use the following:

/(?:<span data-prefix="([^"]+)">([^<]+)<\/span>)+/g

It would match each span, capturing the prefix and the text as two groups:

1) ([1])(Duck)
2) ([2])(Dog)
3) ([3])(Cat)
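
As a rough sketch of how those captures could be joined into the expected output, assuming the markup is available as a string rather than as live DOM: the helper name `prefixedText` and the `html` variable are made up for illustration, and the pattern drops the outer repeated group because each span is matched on its own anyway. It only handles markup shaped exactly like the example.

```javascript
// Hypothetical helper (not from the answer above): concatenates the
// data-prefix value and the text of every matching span in a markup string.
function prefixedText(html) {
  // Group 1 is the data-prefix value, group 2 is the span's text.
  var pattern = /<span data-prefix="([^"]+)">([^<]+)<\/span>/g;
  var result = '';
  var match;

  // With the g flag, exec() resumes after the previous match on each call.
  while ((match = pattern.exec(html)) !== null) {
    result += match[1] + match[2];
  }
  return result;
}

// Sample input copied from the question.
var html =
  '<div class="set">\n' +
  '    <span data-prefix="[1]">Duck</span>\n' +
  '    <span data-prefix="[2]">Dog</span>\n' +
  '    <span data-prefix="[3]">Cat</span>\n' +
  '</div>';

console.log(prefixedText(html)); // [1]Duck[2]Dog[3]Cat
```

Any change in attribute order, quoting, or nesting would break the pattern, which is the usual argument for preferring the DOM approach when a DOM is available.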
Community
Josh Crozier
  • 1
    +1 for parsing HTML with regex link. That's what comes to mind when any SO browser who's seen that finds somebody attempting to do so.. ;) – Nebula Jan 02 '16 at 22:22
  • Looks nice, however I do have spans that do NOT have a data-prefix attribute. How can I have those included in the results too? (pretty much just a span with a letter) – kinx Jan 02 '16 at 22:26
1
// Get all the DOM nodes that have a data-prefix attribute

var nodes = document.querySelectorAll('[data-prefix]'),
    values = [];

for (var i = 0; i < nodes.length; i++) {
    // Read textContent from the current node, not from the NodeList itself
    values.push(nodes[i].dataset.prefix + nodes[i].textContent);
}

Now you have an array with all the values that you need.

CodeWizard
0

however I do have spans that do NOT have a data-prefix attribute. How can I have those included in the results

This code selects every span within the parent div.set and reads its data-prefix attribute. If the attribute is present, its value is output; otherwise [-] is output instead.

var divSpans = document.querySelectorAll('.set > span');

for (var i = 0; i < divSpans.length; i++) {
  var prefix = divSpans[i].getAttribute('data-prefix'),
    divHTML = divSpans[i].textContent;

  prefix = (prefix) ? prefix : '[-]';

  divSpans[i].innerHTML = prefix + divHTML;

}

<div class="set">
  <span data-prefix="[1]">Duck</span>
  <span data-prefix="[2]">Dog</span>
  <span data-prefix="[3]">Cat</span>
  <span>Falcon</span>
  <span data-prefix="[4]">Parrot</span>
</div>
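
If a single string is wanted, as in the question, rather than rewriting each span in place, the same loop could be sketched as a function that returns the concatenated result. The name `joinWithPrefix` and its `fallback` parameter are assumptions made for this example; the function only relies on `getAttribute` and `textContent`, so it should accept the NodeList returned by `querySelectorAll`.

```javascript
// Hypothetical helper: builds one string from an array-like list of spans.
// `fallback` is used for spans that have no (or an empty) data-prefix.
function joinWithPrefix(spans, fallback) {
  var result = '';

  for (var i = 0; i < spans.length; i++) {
    // getAttribute returns null when the attribute is missing.
    var prefix = spans[i].getAttribute('data-prefix');
    result += (prefix ? prefix : fallback) + spans[i].textContent;
  }
  return result;
}

// In a browser, with the markup above, this could be called as:
// joinWithPrefix(document.querySelectorAll('.set > span'), '[-]');
```

Passing an empty string as `fallback` would include the unprefixed spans as plain text instead of marking them with [-].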
Mi-Creativity