background: I'm trying to scrape some tables from this pro-football-reference page. I'm a complete newbie to Python, so a lot of the technical jargon is lost on me, and I haven't been able to work out a fix on my own.
specific issue: Because there are multiple tables on the page, I can't figure out how to get Python to target the one I want. I'm trying to get the Defense & Fumbles table. The code below is what I've got so far; it's adapted from this tutorial, which uses a page from the same site but one that has only a single table.
sample code:
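from urllib.request import urlopen
from bs4 import BeautifulSoup

# URL we are scraping
url = "https://www.pro-football-reference.com/teams/nwe/2017.htm"
# fetch the HTML from the given URL
html = urlopen(url)
# make a soup object from the HTML
soup = BeautifulSoup(html, "html.parser")
# confirm that soup is a BeautifulSoup object
type(soup)
# collect the header cells of the table with id "defense"
# (this chained findAll call is the line that raises the error described below)
column_headers = [th.getText() for th in
                  soup.findAll('table', {"id": "defense"}).findAll('th')]
column_headers  # our column headers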
attempts made: I realized that the tutorial's method would not work for me, so I tried changing the soup.findAll portion to target the specific table. But I repeatedly get an error saying:
AttributeError: ResultSet object has no attribute 'findAll'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?
When I change it to find, the error becomes:
AttributeError: 'NoneType' object has no attribute 'find'
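For anyone who wants to reproduce this without hitting the site, here is a minimal sketch of both errors on a made-up HTML string. The sample markup, the "passing" table id, and the idea that the target table might be sitting inside an HTML comment are assumptions for illustration only, not something confirmed from the real page. find, find_all, and Comment are the only BeautifulSoup names used.

```python
from bs4 import BeautifulSoup, Comment

# Made-up HTML for illustration only; not copied from the real page.
# Assumption: the second table is wrapped in an HTML comment.
SAMPLE_HTML = """
<table id="passing"><tr><th>Player</th><th>Yds</th></tr></table>
<!-- <table id="defense"><tr><th>Player</th><th>Int</th></tr></table> -->
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find_all() returns a ResultSet (a list of tags), so chaining another
# find_all() onto it raises the first AttributeError.
tables = soup.find_all("table", {"id": "passing"})
first_error = None
try:
    tables.find_all("th")
except AttributeError as exc:
    first_error = str(exc)

# find() returns a single tag, so chaining works when the table is found.
visible = soup.find("table", {"id": "passing"})
visible_headers = [th.get_text() for th in visible.find_all("th")]

# find() returns None when nothing matches; calling a method on None
# raises the second ('NoneType') AttributeError.
hidden = soup.find("table", {"id": "defense"})

# If the table really is inside a comment, the comment text can be
# parsed again as its own small document.
hidden_headers = []
for comment in soup.find_all(string=lambda text: isinstance(text, Comment)):
    inner = BeautifulSoup(comment, "html.parser").find("table", {"id": "defense"})
    if inner is not None:
        hidden_headers = [th.get_text() for th in inner.find_all("th")]

print(first_error)
print(visible_headers, hidden, hidden_headers)
```

If the same pattern shows up on the real page (find returning None for an id that is visible in the browser), that would at least be consistent with the table living inside a comment, but that would need checking against the page source.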
I'll be absolutely honest that I have no idea what I'm doing or what these errors mean. I'd appreciate any help in figuring out how to target that table and then scrape it.
Thank you,