I know I should not use regex on HTML, but I would like to extract image sources from an HTML file. The tags might look like this:

<img src = cid:header width="700" height="93" alt="Logo" />
<img src= cid:header width="700" height="93" alt="Logo" />
<img src =cid:header width="700" height="93" alt="Logo" />
<img src=cid:header width="700" height="93" alt="Logo" />

In each case, I'd like to get "cid:header" as the result.

Since my regex knowledge is basically zero, I turn to you guys. I need a pattern that allows an optional space after "src" and after the "=" character.

src[mightBeSpace]=[mightBeSpace]cid:[mustNotBeSpace]

Thank you!

Dominik Antal

2 Answers

^<img src\s?=\s?([^\s]+).*/>$
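A quick way to check this pattern against the four lines from the question is a short Python sketch. The names `PATTERN` and `samples` are made up for illustration, and the choice of Python is an assumption, since the question does not name a language.

```python
import re

# Pattern from the answer above: at most ONE whitespace character on either
# side of "=", then a run of non-whitespace characters captured as group 1.
PATTERN = re.compile(r'^<img src\s?=\s?([^\s]+).*/>$')

# The four sample tags from the question.
samples = [
    '<img src = cid:header width="700" height="93" alt="Logo" />',
    '<img src= cid:header width="700" height="93" alt="Logo" />',
    '<img src =cid:header width="700" height="93" alt="Logo" />',
    '<img src=cid:header width="700" height="93" alt="Logo" />',
]

# Each sample is matched as a whole string, because of the ^ and $ anchors.
results = [PATTERN.match(line).group(1) for line in samples]
print(results)

# \s? allows only a single space, so two spaces around "=" do not match.
double_space = PATTERN.match('<img src  =  cid:header />')
```

Note that because of `^` and `$`, the tag has to be the entire string being tested, and `\s?` accepts at most one space on each side of the `=`.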
rbedger

In regex, "might be space" is \s*, and "must not be space" translates to \S+

Using this information you should be able to build a regex. If you can't, please show what you've tried.
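For reference, one way those two pieces could be combined is sketched below in Python. The language is an assumption (the question does not name one), and `SRC` and `extract_cid_sources` are hypothetical names used only for this example. It follows the question's template `src[mightBeSpace]=[mightBeSpace]cid:[mustNotBeSpace]` literally.

```python
import re

# src, optional whitespace, "=", optional whitespace, then "cid:" followed by
# one or more non-whitespace characters. The parentheses capture the value.
SRC = re.compile(r'src\s*=\s*(cid:\S+)')

def extract_cid_sources(html):
    """Return every unquoted cid: source found in the given HTML text."""
    # With a single capture group, findall returns a list of the captured strings.
    return SRC.findall(html)

html = "\n".join([
    '<img src = cid:header width="700" height="93" alt="Logo" />',
    '<img src= cid:header width="700" height="93" alt="Logo" />',
    '<img src =cid:header width="700" height="93" alt="Logo" />',
    '<img src=cid:header width="700" height="93" alt="Logo" />',
])

sources = extract_cid_sources(html)
print(sources)
```

This sketch only covers the unquoted form shown in the question. A quoted value such as `src="cid:header"` is not matched, and since `\S+` runs until the next whitespace, a tag written as `<img src=cid:header/>` would capture the trailing `/>` as well.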

Niet the Dark Absol