I have a string containing HTML code which may contain multiple images. I need to extract all the image references from the string and dynamically replace the src attributes.

I have been able to extract all the image elements using regex, and for each identified image element I can parse the name and src attributes, but I'm not sure how to go about replacing the src attribute and then pushing it back in the original string.

// Define html string with images
const htmlString = `<p>Here's an image of people:</p><p><br></p><img class="projectImage" src="http://localhost:9199/v0/b/myApp-test.appspot.com/o/images%2Fprojects%2FbqIFfaNFV8SqO3rn0GRH%2Fdraft%2Fpeople-3137672_1920.jpeg?alt=media&amp;token=1369544e-abd0-4b53-a37a-bf325013dcb7" name="people-3137672_1920.jpeg"><p><br></p><p><br></p><p>and here's an image of some dogs:</p><p><br></p><img class="projectImage" src="http://localhost:9199/v0/b/myApp-test.appspot.com/o/images%2Fprojects%2FbqIFfaNFV8SqO3rn0GRH%2Fdraft%2Fdogs.webp?alt=media&amp;token=1c93469a-0537-4a43-9387-13f0bf8d64c9" name="dogs.webp"><p><br></p>`

// Generate list of images
// match() returns null when nothing is found, so fall back to an empty array
const imageElements = htmlString.match(/<img[\w\W]+?\/?>/g) ?? []
console.log(`found ${imageElements.length} image element(s):`)

// For each image identify the name and src attributes
imageElements.forEach(imageElement => {
  const regex = /<img[^>]*?src=["\']?((?:.(?!\1|>))*.?)"([^"]*).*name=["\']?((?:.(?!\1|>))*.?)"([^"]*)/
  const match = regex.exec(imageElement)
  const src = match[2]
  const name = match[4]

  console.log({ name, src })
  
  // if (name === 'dogs.webp') { replaceSrc('https://newLink.com') ??? }
})

How can I replace the src attributes and push the updated image component back into the original html string (or into a modified cloned html string)?

Wiktor Stribiżew
Sam
  • Can't you use a proper HTML parser? – Luatic May 28 '22 at 13:49
  • I think I should. I haven't heard about them prior to your comment but having a brief look, it seems like something that would significantly reduce complexity from my original approach of manually parsing html via regex. If you think you can help come up with a specific solution using one, please feel free to put it down as an answer. – Sam May 28 '22 at 13:54

1 Answer

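As @LMD suggested in the comments, just use the DOM for this: put the string into an element, update the src of each image, and read the updated markup back from innerHTML.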

let el = document.createElement('div');
el.innerHTML = `<p>Here's an image of people:</p><p><br></p><img class="projectImage" src="http://localhost:9199/v0/b/myApp-test.appspot.com/o/images%2Fprojects%2FbqIFfaNFV8SqO3rn0GRH%2Fdraft%2Fpeople-3137672_1920.jpeg?alt=media&amp;token=1369544e-abd0-4b53-a37a-bf325013dcb7" name="people-3137672_1920.jpeg"><p><br></p><p><br></p><p>and here's an image of some dogs:</p><p><br></p><img class="projectImage" src="http://localhost:9199/v0/b/myApp-test.appspot.com/o/images%2Fprojects%2FbqIFfaNFV8SqO3rn0GRH%2Fdraft%2Fdogs.webp?alt=media&amp;token=1c93469a-0537-4a43-9387-13f0bf8d64c9" name="dogs.webp"><p><br></p>`

el.querySelectorAll('img').forEach((imgEl) => {
  // Include the scheme; without it the value is treated as a relative path
  imgEl.src = 'https://my.newSource.com/img/1';
});

console.log(el.innerHTML);
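
If there is no DOM available (for example in Node), or you only want to change some of the images, a rough alternative is String.prototype.replace with a replacer function, which hands each matched tag to a callback and puts the returned text back into the string. The sketch below is not part of the approach above. The helper name replaceImageSources and the Map of new sources keyed by the name attribute are made up for illustration. It assumes double-quoted attributes, no ">" inside attribute values (as in the sample string), and new URLs that are already safe to place inside an attribute.

```javascript
// Hypothetical helper: swaps the src of every <img> whose name attribute
// is a key in newSrcByName (a Map), leaving the rest of the string untouched.
function replaceImageSources(html, newSrcByName) {
  // Match each <img ...> tag as a whole; the callback returns its replacement.
  return html.replace(/<img\b[^>]*>/gi, (tag) => {
    const nameMatch = /\sname="([^"]*)"/i.exec(tag);
    if (!nameMatch || !newSrcByName.has(nameMatch[1])) {
      return tag; // no name, or no replacement requested: keep the tag as is
    }
    const newSrc = newSrcByName.get(nameMatch[1]);
    // Rewrite only the src value. A replacer function is used so that any
    // "$" in the new URL is inserted literally.
    return tag.replace(/(\ssrc=")[^"]*"/i, (_, open) => open + newSrc + '"');
  });
}

// Usage: only the dogs image gets a new source.
const sampleHtml =
  '<p>dogs:</p><img class="projectImage" src="http://localhost:9199/old/dogs.webp" name="dogs.webp">' +
  '<p>people:</p><img class="projectImage" src="http://localhost:9199/old/people.jpeg" name="people.jpeg">';
const updatedHtml = replaceImageSources(
  sampleHtml,
  new Map([['dogs.webp', 'https://newLink.com/dogs.webp']])
);
console.log(updatedHtml);
```

For anything less regular than this sample markup (single quotes, unquoted attributes, an img inside a comment), the DOM approach is the more robust choice.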
Laisender