I am new to Python and fairly new to programming in general. I'm trying to work out a script that uses BeautifulSoup to parse https://www.state.nj.us/mvc/ for any text that's red. The table I'm looking at is relatively simple HTML:
<html>
  <body>
    <div class="alert alert-warning alert-dismissable" role="alert">
      <div class="table-responsive">
        <table class="table table-sm" align="center" cellpadding="0" cellspacing="0">
          <tbody>
            <tr>
              <td width="24%">
                <strong>
                  <font color="red">Bakers Basin</font>
                </strong>
              </td>
              <td width="24%">
                <strong>Oakland</strong>
              </td>
              ...
              ...
              ...
            </tr>
          </tbody>
        </table>
      </div>
    </div>
  </body>
</html>
From the above I want to find Bakers Basin, but not Oakland, for example.
Here's the Python I've written (adapted from Cory Althoff, The Self-Taught Programmer, 2017, Triangle Connection LLC):
import urllib.request
from bs4 import BeautifulSoup

class Scraper:
    def __init__(self, site):
        self.site = site

    def scrape(self):
        r = urllib.request.urlopen(self.site)
        html = r.read()
        parser = "html.parser"
        soup = BeautifulSoup(html, parser)
        tabledmv = soup.find_all("font color=\"red\"")
        for tag in tabledmv:
            print("\n" + tabledmv.get_text())

website = "https://www.state.nj.us/mvc/"
Scraper(website).scrape()
I seem to be missing something, though, because I can't get this to go through the table and return anything useful. The end goal is to add the time module and run this every X minutes, logging a message somewhere each time a site goes red. (This is all so my wife can figure out the least crowded DMV to go to in New Jersey!)
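For the "every X minutes" part, here is a rough sketch of how the polling and logging might be laid out, using only the standard library. The names `log_newly_red`, `poll`, `fetch_red_names`, `interval_seconds` and `max_checks` are made up for illustration, and the demo at the bottom replays canned results rather than contacting the real site.

```python
import logging
import time


def log_newly_red(previous, current, logger):
    """Log each name that is red now but was not red on the last check."""
    newly_red = sorted(set(current) - set(previous))
    for name in newly_red:
        logger.info("%s went red", name)
    return newly_red


def poll(fetch_red_names, interval_seconds, max_checks, logger, sleep=time.sleep):
    """Call fetch_red_names repeatedly, logging names as they turn red.

    fetch_red_names is any zero-argument callable returning a list of names
    (hypothetical here; in practice it would wrap the scraper).
    """
    previous = []
    for check in range(max_checks):
        current = fetch_red_names()
        log_newly_red(previous, current, logger)
        previous = current
        # Wait between checks, but not after the final one.
        if check < max_checks - 1:
            sleep(interval_seconds)
    return previous


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
    # Stand-in for the real scraper: two canned results, no network access.
    canned = iter([["Bakers Basin"], ["Bakers Basin", "Oakland"]])
    # interval_seconds=0 keeps the demo instant; 300 would mean every five minutes.
    poll(lambda: next(canned), interval_seconds=0, max_checks=2,
         logger=logging.getLogger("mvc"))
```

Passing the fetch function in as an argument keeps the timing logic separate from the scraping, so the loop can be tried out without hitting the site on every run.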
Any help or guidance on getting the BeautifulSoup bit working is much appreciated.
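For reference, a minimal sketch of one way the red cells might be selected, run against a trimmed copy of the snippet above rather than the live page. It assumes the red entries are always marked with a `<font color="red">` tag, as in the snippet; `SAMPLE_HTML` and `red_names` are names made up for this example. The idea is that `find_all` takes the tag name as its first argument and attribute filters as keyword arguments, and that `get_text()` is called on each tag in the loop rather than on the whole result list.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the table from the question; the real page has more cells.
SAMPLE_HTML = """
<table class="table table-sm">
  <tr>
    <td width="24%"><strong><font color="red">Bakers Basin</font></strong></td>
    <td width="24%"><strong>Oakland</strong></td>
  </tr>
</table>
"""


def red_names(html):
    """Return the text of every <font color="red"> tag in html."""
    soup = BeautifulSoup(html, "html.parser")
    # Tag name first, then the attribute as a keyword filter.
    red_tags = soup.find_all("font", color="red")
    # get_text() belongs to each individual tag, not to the result list.
    return [tag.get_text(strip=True) for tag in red_tags]


print(red_names(SAMPLE_HTML))  # ['Bakers Basin']
```

If the live page marks red text some other way (a CSS class or inline style, say), the filter would need to change accordingly.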