I want to remove img tags from an HTML string, but when an attribute value of the img contains ">", my regexp always matches the wrong substring.

Here is a case:

var str = '<img src="test>test" alt="test > test test" > "some text';
var reg = ???

How can I write reg so that '<img src="test>test" alt="test > test test" >">'.match(reg)[0] === '<img src="test>test" alt="test > test test" >' is true?

I tried the regexp /<\s*(img)[^>]+(>[\r|\n|\s]*<\/\1>|\/?>)/, but it did not work.

phd
Liang
  • Does this answer your question? [Is it possible? Matching the exact same number of opening & closing braces](https://stackoverflow.com/questions/2415872/is-it-possible-matching-the-exact-same-number-of-opening-closing-braces) – cyberbrain Aug 03 '23 at 11:59
  • You don't. Use [DOMParser](https://developer.mozilla.org/en-US/docs/Web/API/DOMParser/parseFromString). – epascarello Aug 03 '23 at 12:25
  • `const parser = new DOMParser(); const doc = parser.parseFromString(str, 'text/html'); console.log(doc.querySelector('img').outerHTML);` – epascarello Aug 03 '23 at 12:31
  • @cyberbrain there's no matching braces in the OP's case – Alexander Nenashev Aug 03 '23 at 14:01
  • @AlexanderNenashev braces, quotes, smaller/greater signs, where is the difference? – cyberbrain Aug 03 '23 at 14:55
  • @epascarello Sorry, I can't use DOM-related API here. – Liang Aug 04 '23 at 01:44
  • So if you are in node, you can use libraries that do it... https://stackoverflow.com/questions/11398419/trying-to-use-the-domparser-with-node-js You might just be tracking down edge cases with your regexp. – epascarello Aug 07 '23 at 01:49

1 Answer

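<img - the start of the tag
[^">]* - any number of characters other than " and >, i.e. everything up to the first "
(?:"[^"]+"[^">]*)* - any number of double-quoted values (which may contain > but not "), each followed by any number of characters other than " and >
> - the closing bracket of the tag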

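// The g flag makes match() return every <img ...> tag found in each test string.
const reg = /<img[^">]*(?:"[^"]+"[^">]*)*>/g;
const tests = [
  '<img src="test>test" alt="test > test test" > "some text',
  '<img> <img>',
  '<img\nsrc="test">', // a tag split across two lines
];
tests.forEach(str => console.log(str.match(reg)));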
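Since the question is about removing the tags, the same idea can be used with replace(). The sketch below is a variation on the pattern above, not the exact regexp from this answer: it assumes attribute values may also be single-quoted or empty (the `[^"]+` above requires at least one character between the quotes, so `alt=""` would not match), and it assumes the tag name match should be case-insensitive. The names `IMG_TAG` and `stripImgTags` are made up for illustration. It is still a regexp, so it does not cover everything an HTML parser handles (comments, script contents, unterminated quotes).

```javascript
// Hypothetical names; a sketch under the assumptions stated above.
// <img            literal start of the tag
// (?=[\s\/>])     next char must end the tag name (so <imgx> is not matched)
// (?: [^"'>]      any char that is not a quote or >
//   | "[^"]*"     a complete double-quoted value, may contain > or be empty
//   | '[^']*'     a complete single-quoted value, may contain > or be empty
// )*
// >               the real end of the tag
const IMG_TAG = /<img(?=[\s\/>])(?:[^"'>]|"[^"]*"|'[^']*')*>/gi;

// Returns the input with every matched img tag removed.
function stripImgTags(html) {
  return html.replace(IMG_TAG, '');
}

console.log(JSON.stringify(
  stripImgTags('<img src="test>test" alt="test > test test" > "some text')
)); // prints " \"some text"
console.log(JSON.stringify(
  stripImgTags('<img alt="" src=\'a>b\'>ok')
)); // prints "ok"
```

The three alternatives inside the group each start with a different character, so the engine never has two ways to consume the same text, which keeps backtracking cheap.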
Alexander Nenashev