
I am having a hard time getting the right regex for the following.

What I want: three matches (John Doe, an empty string, Jane Doe)

The difficulties are that the anchor tag is optional and that a cell can be empty.

String to search:

<td class="character">
  <a href=""> John Doe </a>

</td>
<td class="character">

</td>
<td class="character">
  Jane Doe 

</td>

My regex so far: @<td class="character">.*?(?:<a.*?>)?(.*?)(?:</a>)?.*?</td>@gms

Link to regex101 https://regex101.com/r/9NRhjI/1

I know you shouldn't use regex to parse XML/HTML, but as I only use it to dig through a tiny subset of HTML, it should be possible, right?

1 Answer


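You could use this simple regex. It uses * quantifiers throughout, so a cell with no text still matches and the capture group comes back empty:

@<td[^>]*>\s*(?:<a[^>]*>)?\s*([^<]*?)\s*(?:</a>)?\s*</td>@gms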
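If you want to try this outside regex101, here is a minimal sketch in Python (my assumption, since the question does not say which language the pattern will run in). The names HTML, CELL and names are made up for the example, and the sample string is the one from the question. The pattern is written with * quantifiers so that the empty cell produces an empty capture instead of failing to match.

```python
import re

# Sample markup copied from the question.
HTML = """<td class="character">
  <a href=""> John Doe </a>

</td>
<td class="character">

</td>
<td class="character">
  Jane Doe

</td>
"""

# Every piece around the capture group is optional (`*` or `?`), so a cell
# that holds only whitespace still matches and yields an empty string.
# The lazy `[^<]*?` followed by `\s*` trims surrounding whitespace.
CELL = re.compile(r'<td[^>]*>\s*(?:<a[^>]*>)?\s*([^<]*?)\s*(?:</a>)?\s*</td>')

# findall returns the text of the single capture group for each match.
names = CELL.findall(HTML)
print(names)  # ['John Doe', '', 'Jane Doe']
```

No DOTALL flag is needed here because the pattern has no dot in it; \s and [^<] already match newlines.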


Wololo