-1

I guess it's trivial but I'm struggling with the following situation:

String (HTML delivered as text):

text<br>text<br><ul><li>text</li></ul><br>

Now I need to replace every text<br> with <div>text</div> except if text is inside <li>/<ul>.

.replace(/(.*?)<br>/g, '<div>$1</div>')

This works fine, but how can I prevent <ul><li>text</li></ul><br> from being replaced?
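
To make the problem concrete, here is a small reproduction (a sketch only; the variable names `input` and `naive` are illustrative). The lazy group captures the whole list as if it were the text in front of the last <br>:

```javascript
// Minimal reproduction of the unwanted replacement.
const input = 'text<br>text<br><ul><li>text</li></ul><br>';

// (.*?) lazily captures everything up to the next <br>, including tags.
const naive = input.replace(/(.*?)<br>/g, '<div>$1</div>');

console.log(naive);
// <div>text</div><div>text</div><div><ul><li>text</li></ul></div>
```

The first two segments are wrapped as intended; the third <div> around the list is the part that should not happen.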

mbe

3 Answers

1

This was my attempt before asking for a (shorter) regex solution:

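// `str` is the element whose text content holds the HTML string.
const dFrag = document.createDocumentFragment();
str.textContent.split('<br>').forEach(substr => {
  const div = document.createElement('div');
  // An empty segment (e.g. after a trailing <br>) becomes a line break.
  if (!substr) {
    substr = '<br>';
  }
  div.innerHTML = substr;
  // Append a parsed <ul> directly; keep the <div> wrapper for plain text.
  const ul = div.querySelector('ul');
  if (ul) {
    dFrag.appendChild(ul);
  } else {
    dFrag.appendChild(div);
  }
});
// Replace the element's content with the rebuilt nodes.
str.innerHTML = '';
str.appendChild(dFrag);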
mbe
0

"You can't parse [HTML] with regex. [...] Have you tried using an [HT]ML parser instead?"

(a terser version can be found in the snippet below)

function replaceTextBrWithDiv(html) {
  // Create an element that acts as a parser
  const parser = document.createElement('div');
  parser.innerHTML = html;

  // Modifying an array-like while iterating over it may cause issues.
  // Copy it first.
  const childNodes = [...parser.childNodes];

  // Index-based iterating
  for (let index = 0; index < childNodes.length; index++) {
    const node = childNodes[index];
    const nextNode = childNodes[index + 1];

    if (node instanceof Text && nextNode instanceof HTMLBRElement) {
      const div = document.createElement('div');

      // Remove text node from parser and append it to div
      div.appendChild(node);
      nextNode.replaceWith(div);

      // Skip next node (i.e. <br>)
      index++;
    }
  }

  return parser.innerHTML;
}

Try it:

console.config({ maximize: true });

function replaceTextBrWithDiv(html) {
  const parser = document.createElement('div');
  parser.innerHTML = html;

  parser.childNodes.forEach((node, index, nodes) => {
    const nextNode = nodes[index + 1];
    
    if (node instanceof Text && nextNode instanceof HTMLBRElement) {
      const div = document.createElement('div');
      div.appendChild(node);
      nextNode.replaceWith(div);
    }
  });
  
  return parser.innerHTML;
}

const content = 'text<br>text<br><ul><li>text</li></ul><br>';

console.log(replaceTextBrWithDiv(content));
<script src="https://gh-canon.github.io/stack-snippet-console/console.min.js"></script>
InSync
    • Thanks a lot for your solution. Actually, I've already tested a similar approach using a container as a parser. I'll post it as an answer, but it gets off topic. I should have asked differently: a regex for replacing a string that ends with <br> and does not start with <ul>. – mbe May 08 '23 at 20:28
    • "While arbitrary HTML with only a regex is impossible, it's sometimes appropriate to use them for parsing a limited, known set of HTML." – mbe May 08 '23 at 20:31
    • Generally, don't. The accepted and most-upvoted answer has its fame for a reason. – InSync May 08 '23 at 21:35
    • there's no accepted and upvoted answer so far ;-) – mbe May 08 '23 at 22:00
    • By "*[t]he accepted and most-upvoted answer*" I mean [this](https://stackoverflow.com/a/1732454). If you decide to ignore it, good for you then, I guess. – InSync May 08 '23 at 22:01
    • Ah, you were referring to the other post. Sorry about that. – mbe May 08 '23 at 22:38
    -1

    If you prefer using a regex, use /(?!<li.*?>)(?!<\/li>)(?!<ul.*?>)(?!<\/ul>)(.*?)<br>/g:

    const html = 'text<br>text<br><ul><li>text</li></ul><br>';
    const regex = /(?!<li.*?>)(?!<\/li>)(?!<ul.*?>)(?!<\/ul>)(.*?)<br>/g;
    const replacedHtml = html.replace(regex, '<div>$1</div>');
    console.log(replacedHtml);
    

    But it would be better to switch to an HTML parser, which lets you easily navigate and modify the structure of the HTML content.
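
The negative lookaheads in the regex above are only checked at the position where a match starts, so the engine can begin one character later (at `ul>` instead of `<ul>`) and still match up to the next <br>. A narrower sketch, assuming that each text segment is plain text with no inline tags or literal `<`/`>` characters, and that no <br> occurs inside a list item, is to match only non-angle-bracket characters directly before <br>. The function name `wrapTextBeforeBr` is hypothetical:

```javascript
// Sketch only. Assumptions: the text before each <br> contains no tags
// or literal angle brackets, and list items contain no <br> themselves.
function wrapTextBeforeBr(html) {
  // [^<>]+ matches a run of plain text and cannot span a tag, so the
  // <br> after </ul> is left alone: no plain text directly precedes it.
  return html.replace(/([^<>]+)<br>/g, '<div>$1</div>');
}

const sample = 'text<br>text<br><ul><li>text</li></ul><br>';
console.log(wrapTextBeforeBr(sample));
// <div>text</div><div>text</div><ul><li>text</li></ul><br>
```

If the input can contain inline markup or a <br> inside a list item, those assumptions no longer hold and a parser-based approach is the safer choice.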

    TaigaHyaga
    • 1
      I was expecting comments like "use an HTML parser". ;-) Since the string is delivered as text and I only need to replace the <br>, a regex solution is short and simple. Anyway, your regex doesn't seem to be working. console.log output: <div>text</div><div>text</div><<div>ul><li>text</li></ul></div> – mbe May 08 '23 at 11:56