I have a regex that extracts facts from a reference site. Some of the facts contain links, and for those the match stops at the link instead of capturing it. How can I make the match include the links?

The URL is http://www.nationalpastime.com/

Here is my regex.

(?<=<td width="400">\s)[^<]+
hwnd
    Requisite "parsing HTML with regex" article: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – jwatts1980 Jun 03 '15 at 21:06

1 Answer

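Use ([^<]|<[^\/]|<\/[^t]|<\/t[^d]|<\/td[^>])+ instead of [^<]+. This matches any character other than <, or a < that does not start a closing </td> tag, so the match runs through nested tags such as links and stops at the cell's closing </td>.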
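To see how the two patterns differ, here is a minimal Python sketch. The sample markup in `html` is made up for illustration and may not match the real page, and the variable names are assumptions. It only shows that the original pattern stops at the first tag while the alternation continues up to the closing </td>.

```python
import re

# Hypothetical sample cell; the real markup on the site may differ.
html = '<td width="400">\nBorn in <a href="/players/x.html">Boston</a>, 1901</td>'

# Original pattern from the question: stops at the first "<".
old_pattern = re.compile(r'(?<=<td width="400">\s)[^<]+')

# Suggested pattern: consume "<" as long as it does not begin "</td>".
new_pattern = re.compile(
    r'(?<=<td width="400">\s)'
    r'([^<]|<[^\/]|<\/[^t]|<\/t[^d]|<\/td[^>])+'
)

old = old_pattern.search(html).group(0)
cell = new_pattern.search(html).group(0)

# Once the whole cell is captured, the link targets can be pulled out of it.
links = re.findall(r'href="([^"]+)"', cell)

print(repr(old))
print(repr(cell))
print(links)
```

The lookbehind is fixed-width, so it works with Python's `re` module as written. For anything beyond a quick scrape, an HTML parser is likely to be more robust than this pattern.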

lmcarreiro