I am cleaning and migrating content to a new website. In some of the existing pages there are embedded images that link to files in a non-standard folder.
I am pulling the records from the database and then running preg_match_all to capture the offending items. My intention is then to clean up each filename, move the file, and update the database entry to reflect the new location.
However, my regex seems to find only one match (where I know there are several potential hits), and it sometimes captures a whole load of other stuff downstream of the string I want.
This is the expression pattern I am using:
(?i)(<img.*src="uploads/RTEmagicC_(.*)")/
This is an example of content from the database that I am matching against:
BLAH BLAH BLAH<img src="uploads/RTEmagicC_Herpes_simpex_virus.jpg.jpg" alt="HSV particles" style="FLOAT: left; WIDTH: 214px; HEIGHT: 198px" title="Electron micrograph of HSV particles©NASA">blah blah blah<img src="uploads/RTEmagicC_Herpes_labialis_01.jpg.jpg" alt="Coldsore" style="FLOAT: right;" title="Cold sore on the lower lip (cluster of fluid-filled blisters = very infectious). These infections may appear on the lips, nose or in surrounding areas.©Metju12" width="238" height="178">blah blah blah
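For reference, here is a stripped-down sketch of how I am calling it. The variable names and the shortened sample string are just for illustration, and I have assumed `#` as the delimiter here because the pattern itself contains a `/`:

```php
<?php
// Minimal reproduction of the problem; $content is a shortened copy of the sample above.
// Assumption: '#' is used as the regex delimiter because the pattern contains '/'.
$content = 'BLAH<img src="uploads/RTEmagicC_Herpes_simpex_virus.jpg.jpg" alt="HSV particles">blah'
         . '<img src="uploads/RTEmagicC_Herpes_labialis_01.jpg.jpg" alt="Coldsore" width="238" height="178">blah';

$pattern = '#(?i)(<img.*src="uploads/RTEmagicC_(.*)")#';

// PREG_SET_ORDER groups the results per match: $m[1] is the full link, $m[2] the filename.
$count = preg_match_all($pattern, $content, $matches, PREG_SET_ORDER);

echo "Matches found: $count\n";
foreach ($matches as $m) {
    echo "Full link: {$m[1]}\n";
    echo "Filename:  {$m[2]}\n";
}
```

With this shortened input I get a single match, and the filename group runs on past the closing quote into the later attributes.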
I am trying to grab:
"Herpes_simpex_virus.jpg.jpg"
and "Herpes_labialis_01.jpg.jpg"
and the respective full links, e.g.:
<img src="uploads/RTEmagicC_Herpes_simpex_virus.jpg.jpg"
But it's matching a heap of downstream stuff too, beyond the " that closes the filename.
Can someone please put me out of my misery? I've tried for a few evenings on this and clearly I'm doing something stupid, but I cannot see what...
Many thanks.