I'm trying to grab proxies from a site using Python by fetching the page with urllib and finding the proxies with a regex.
A proxy on the page looks something like this:
<a href="/ip/190.207.169.184/free_Venezuela_proxy_servers_VE_Venezuela">190.207.169.184</a></td><td>8080</td><td>
My code looks like this:
for site in sites:
    content = urllib.urlopen(site).read()
    e = re.findall("\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\<\/\a\>\<\/td\>\<td\>\d+", content)
    #\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d+
    for proxy in e:
        s.append(proxy)
        amount += 1
Regex:
\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\<\/\a\>\<\/td\>\<td\>\d+
I know the rest of the code works, so the problem must be the regex.
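To narrow it down, here is a minimal sketch that runs the same pattern against the sample line above as a hard-coded string, so no network access is involved. The names `sample`, `pattern` and `matches` are only for illustration, and the pattern is written as a raw string on the assumption that this does not change how it is interpreted:

```python
import re

# The sample line from the page, copied verbatim from above.
sample = ('<a href="/ip/190.207.169.184/free_Venezuela_proxy_servers_VE_Venezuela">'
          '190.207.169.184</a></td><td>8080</td><td>')

# Same pattern as in the loop, as a raw string (assumption: equivalent behavior).
pattern = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\<\/\a\>\<\/td\>\<td\>\d+"

matches = re.findall(pattern, sample)
print(matches)  # prints [] even though the sample clearly contains a proxy
```

So the pattern finds nothing even on a line that I know contains a proxy.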
Any idea on how I could fix this?
EDIT: http://www.regexr.com/ seems to think my regex is fine?
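One possible explanation for the discrepancy: regexr appears to test against a JavaScript-style engine, where `\a` seems to be read as a plain `a`, whereas Python treats `\a` as the bell character (0x07), so `\<\/\a\>` would never match `</a>`. A sketch under that assumption, with the unneeded backslashes dropped. The capture groups, the `ip:port` output format (guessed from the commented-out pattern) and the names `PROXY_RE` and `find_proxies` are my own additions, not part of the original code:

```python
import re

# Hypothetical pattern: IP as the link text, port in the next table cell.
# '<', '/', '>' and 'a' need no escaping in a Python regex.
PROXY_RE = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3})</a></td><td>(\d+)")

def find_proxies(html):
    """Return every proxy found in html as an 'ip:port' string."""
    return ["%s:%s" % (ip, port) for ip, port in PROXY_RE.findall(html)]

sample = ('<a href="/ip/190.207.169.184/free_Venezuela_proxy_servers_VE_Venezuela">'
          '190.207.169.184</a></td><td>8080</td><td>')

print(find_proxies(sample))  # ['190.207.169.184:8080']
```

The IP inside the `href` is skipped because it is followed by `/free_...` rather than `</a>`, so only the link text and the port cell are picked up.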