I have the following input inside my textarea element:

<quote>hey</quote>

what's up?

I want to extract the text between <quote> and </quote>, so the result would be 'hey' and nothing else in this case.

I tried .replace with the following regular expression, but it does not give the right result and I can't see why:

quoteContent = value.replace(/<quote>|<\/quote>.*/gi, '');

The result is 'hey what's up': it only removes the quote tags and leaves the last part ('what's up') in place.

Does someone know how to solve this?

tilly
  • `/(?<=<quote>).*?(?=<\/quote>)/.exec(yourString)` – Matt Burland Jun 14 '18 at 19:12
  • in ESNext, with the [`s` mode](https://github.com/tc39/proposal-regexp-dotall-flag). Before that, use `[\s\S]` or similar to match everything. – ASDFGerte Jun 14 '18 at 19:12
  • Obligatory: https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – epascarello Jun 14 '18 at 19:15

4 Answers

3

Even if it's only a small HTML snippet, don't use regex to parse HTML. Instead, take the value, parse it with DOM methods, and extract the text from the element. It's a bit more code, but it's the better and safer way to do it:

const el = document.getElementById('foo');
const tmp = document.createElement('template');
tmp.innerHTML = el.value;
console.log(tmp.content.querySelector('quote').innerText);

<textarea id="foo">
<quote>hey</quote>

what's up?
</textarea>
baao
  • I see, but here it is more of a random string I made up to create a forum reply than an actual element. – tilly Jun 14 '18 at 19:17
  • You're absolutely free to use a regex approach for your HTML parsing. Just don't be surprised if your code breaks later; that's why I posted the correct approach @tilly. Changing the selector in getElementsByTagName is also much easier and more dynamic than changing the regex – baao Jun 14 '18 at 19:18
  • And don't use `.innerHTML` on anything other than a ` – Mike Samuel Jun 14 '18 at 19:27
1

You could also try using the match method:

quoteContent = value.match(/<quote>(.+)<\/quote>/)[1];
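
As a hedged variation on the line above: `.+` is greedy and does not cross line breaks, and `[1]` throws when there is no match. A minimal sketch, assuming the quoted text may span several lines and that the input might contain no <quote> at all (the helper name `extractQuote` is hypothetical, not something from the question):

```javascript
// Hypothetical helper: returns the text inside the first <quote>...</quote>
// pair, or null when the input contains no such pair.
function extractQuote(value) {
  // [\s\S] matches any character, including newlines;
  // the lazy *? stops at the first closing </quote>.
  const match = /<quote>([\s\S]*?)<\/quote>/i.exec(value);
  return match ? match[1] : null;
}

const value = "<quote>hey</quote>\n\nwhat's up?";
console.log(extractQuote(value)); // logs: hey
```

This is still a regex over markup, so the caveats from the other answers apply; it only behaves for simple, non-nested <quote> pairs.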

clarmond
1

You should try to avoid parsing HTML using regular expressions.

<quote><!-- parsing HTML is hard when </quote> can appear in a comment -->hey</quote>

You can just use the DOM to do it for you.

// Parse your fragment
let doc = new DOMParser().parseFromString(
    '<quote>hey</quote>\nWhat\'s up?', 'text/html')
// Use DOM lookup to find a <quote> element and get its
// text content.
let { textContent } = doc.getElementsByTagName('quote')[0]
// We get plain text and don't need to worry about "&lt;"s
textContent === 'hey'
Mike Samuel
-1

The dot (.) does not match newlines.

Try this:

// (.|\n)* matches any run of characters, including line breaks
quoteContent = value.replace(/<quote>|<\/quote>(.|\n)*/gi, '');
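
To make the difference concrete, here is a small sketch comparing the original pattern with two newline-aware variants. It assumes the textarea value is the string shown in the question; the variable names are only illustrative, and the `s` (dotAll) flag needs an engine that supports it:

```javascript
const value = "<quote>hey</quote>\n\nwhat's up?";

// Original pattern: "." stops at the first newline, so only the tags go away.
const dotOnly = value.replace(/<quote>|<\/quote>.*/gi, '');

// [\s\S] matches every character, newlines included.
const anyChar = value.replace(/<quote>|<\/quote>[\s\S]*/gi, '');

// With the s (dotAll) flag, "." itself matches newlines.
const dotAll = value.replace(/<quote>|<\/quote>.*/gis, '');

console.log(JSON.stringify(dotOnly)); // "hey\n\nwhat's up?"
console.log(JSON.stringify(anyChar)); // "hey"
console.log(JSON.stringify(dotAll));  // "hey"
```

`[\s\S]*` also avoids the per-character alternation in `(.|\n)*`, which can get slow on long inputs.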
scunliffe