I want to write a data-scraping Python program that takes a zip code as user input and prints the Total population, Housing units, Land area, Density, and Water area reported for that zip code on http://www.uszip.com/zip/ (this site). I am using regular expressions, but I got stuck on some HTML tags. Each time, I need the value contained between the tags, for example 10.13 here:

Land area<br><span class="stype">(sq. miles)</span></dt><dd>10.13</dd><dt>
Density<br><span class="stype">(people per sq. mile)</span></dt><dd>2,146.20<span class="trend trend-up" title="+34 (+1.59% since 2000)">▲</span></dd><dt>
Water area<br><span class="stype">(sq. miles)</span></dt><dd>0.06</dd>

I am thinking of something like this:

Land area<br><span class="stype">(sq. miles)</span></dt><dd>(.*?)</dd><dt>

Any ideas or other ways to implement this?
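For reference, here is a minimal sketch of how the pattern idea above could be generalised to any label, tried only against the HTML fragment quoted in this question. The names `SAMPLE_HTML` and `extract_field` are hypothetical, fetching the page is not shown, and the real page markup may differ from the fragment.

```python
import re

# HTML fragment copied from the question; the live page may differ.
SAMPLE_HTML = (
    'Land area<br><span class="stype">(sq. miles)</span></dt><dd>10.13</dd><dt>\n'
    'Density<br><span class="stype">(people per sq. mile)</span></dt><dd>2,146.20'
    '<span class="trend trend-up" title="+34 (+1.59% since 2000)">▲</span></dd><dt>\n'
    'Water area<br><span class="stype">(sq. miles)</span></dt><dd>0.06</dd>'
)


def extract_field(html, label):
    """Return the text at the start of the <dd> that follows `label`, or None.

    Hypothetical helper: it assumes the label is followed by </dt><dd>value,
    as in the fragment above.
    """
    # [^<]* stops at the first tag, so the trend <span> after the
    # Density value is not captured.
    pattern = re.escape(label) + r'.*?</dt>\s*<dd>([^<]*)'
    match = re.search(pattern, html, re.DOTALL)
    return match.group(1).strip() if match else None


if __name__ == "__main__":
    for name in ("Land area", "Density", "Water area"):
        print(name, "=", extract_field(SAMPLE_HTML, name))
```

Capturing `[^<]*` instead of `(.*?)</dd>` keeps the trend arrow markup out of the Density value. The values come back as strings; something like `float(value.replace(",", ""))` would be needed to turn `2,146.20` into a number.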

Brian Tompsett - 汤莱恩
mikaeloN
  • It's not a good idea to use [regex to parse HTML](http://stackoverflow.com/a/1732454/4014959); you can use something like [Beautiful Soup](http://www.crummy.com/software/BeautifulSoup) instead. Or even better: use the [uszip API](http://www.uszip.com/api.php) – PM 2Ring Oct 14 '15 at 12:19
  • I am almost done, which is why I keep trying with regex! :/ For some weird reason I can't sign up for the uszip API... – mikaeloN Oct 14 '15 at 13:06
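As a rough illustration of the parser-based route suggested in the first comment, the sketch below uses the standard library's `html.parser` rather than Beautiful Soup, only to stay dependency-free. The class name `DefinitionListParser` is hypothetical, and the sample is the question's fragment wrapped in `<dl><dt>` tags, which is an assumption about the surrounding markup.

```python
from html.parser import HTMLParser


class DefinitionListParser(HTMLParser):
    """Collect <dt>label</dt><dd>value</dd> pairs into a dict.

    Hypothetical helper: text inside nested tags (the unit <span>,
    the trend arrow) is ignored.
    """

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._section = None  # "dt", "dd", or None
        self._nested = 0      # depth of tags opened inside the current dt/dd
        self._label = ""
        self._value = ""

    def handle_starttag(self, tag, attrs):
        if tag == "dt":
            self._section, self._nested, self._label = "dt", 0, ""
        elif tag == "dd":
            self._section, self._nested, self._value = "dd", 0, ""
        elif self._section and tag != "br":  # <br> has no closing tag
            self._nested += 1

    def handle_endtag(self, tag):
        if tag == "dd" and self._section == "dd":
            self.fields[self._label.strip()] = self._value.strip()
            self._section = None
        elif tag == "dt":
            self._section = None
        elif self._section and self._nested:
            self._nested -= 1

    def handle_data(self, data):
        if self._nested:
            return  # skip text inside nested tags
        if self._section == "dt":
            self._label += data
        elif self._section == "dd":
            self._value += data


# The question's fragment, wrapped in <dl><dt> ... </dl> (assumed context).
SAMPLE = (
    '<dl><dt>Land area<br><span class="stype">(sq. miles)</span></dt><dd>10.13</dd><dt>\n'
    'Density<br><span class="stype">(people per sq. mile)</span></dt><dd>2,146.20'
    '<span class="trend trend-up" title="+34 (+1.59% since 2000)">▲</span></dd><dt>\n'
    'Water area<br><span class="stype">(sq. miles)</span></dt><dd>0.06</dd></dl>'
)

parser = DefinitionListParser()
parser.feed(SAMPLE)
parser.close()
fields = parser.fields
```

After parsing, `fields` maps each label to its value as a string, so one pass over the page yields every field without a separate pattern per label.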

0 Answers