I am trying to scrape data from the American Kennel Club (https://www.akc.org/reg/dogreg_stats.cfm) and have been having some trouble. Following this Stack Overflow post, I can get all of the rows in the second table, but I cannot format them.

So here is my code.

from bs4 import BeautifulSoup
import requests
url = 'https://www.akc.org/reg/dogreg_stats.cfm'
r = requests.get(url)
data = r.text
soup = BeautifulSoup(data)
rows = soup.find_all('table')[1].find_all('tr')

for row in rows:
    cells = soup.find_all('td')
    firstRanking = cell[1].get_text()
    print(firstRanking)

All it prints out is

More on Registration   Trends:
More on Registration   Trends:
More on Registration   Trends:
More on Registration   Trends:
More on Registration   Trends:
More on Registration   Trends:
More on Registration   Trends:

Instead of the actual rankings.

Zaynaib Giwa

2 Answers

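When you create your variable "cells", you want to find all 'td' elements of that row, not of the entire "soup" object. As written, soup.find_all('td') returns every cell on the page on every iteration, so index 1 always picks the same cell, which is why the same text is printed over and over.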

It should look like this:

cells = row.find_all('td')

Also, I believe there is an error in the line after this: it should reference "cells", not "cell":

firstRanking = cells[1].get_text()

This will make the for loop look like this:

for row in rows:
  cells = row.find_all('td')
  firstRanking = cells[1].get_text()
  print(firstRanking)
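
To see the difference between searching from the whole document and searching from a single row, here is a minimal sketch that runs without a network request. The HTML string, the breed names, and the variable names whole_page and per_row are made up for illustration; the real page's markup may differ. It also skips rows with fewer than two 'td' cells, on the assumption that header rows use 'th' and would otherwise raise an IndexError.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (an assumption).
html = """
<table><tr><td>More on Registration Trends:</td><td>links</td></tr></table>
<table>
  <tr><th>Breed</th><th>Rank</th></tr>
  <tr><td>Breed A</td><td>1</td></tr>
  <tr><td>Breed B</td><td>2</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
rows = soup.find_all("table")[1].find_all("tr")

whole_page = []  # what a document-wide search yields on each iteration
per_row = []     # what a row-scoped search yields

for row in rows:
    # Searching from soup ignores the current row entirely,
    # so the first match is the same cell every time.
    whole_page.append(soup.find_all("td")[0].get_text())

    # Searching from row only sees that row's own cells.
    cells = row.find_all("td")
    if len(cells) < 2:  # header rows use <th>, so there is nothing to read
        continue
    per_row.append(cells[1].get_text())

print(whole_page)
print(per_row)
```

The first list repeats one cell from the first table once per row, while the second list holds one value per data row.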

JB333

The main mistake I made was on this line:

rows = soup.find_all('table')[1].find_all('tr')

This created a list item. To fix the problem, I changed it to:

table = soup.find_all('table')[1]
rows = table.find_all('tr')
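
As a rough illustration of the two-step form, the sketch below selects the second table first and then collects its rows. It uses a made-up HTML string rather than the live page, and the guard against a page with fewer than two tables and the name rankings are additions for the example, not part of the original fix.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (an assumption).
html = """
<table><tr><td>More on Registration Trends:</td></tr></table>
<table>
  <tr><th>Breed</th><th>Rank</th></tr>
  <tr><td>Breed A</td><td>1</td></tr>
  <tr><td>Breed B</td><td>2</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

tables = soup.find_all("table")
if len(tables) < 2:
    raise ValueError("expected at least two tables on the page")

# Step 1: pick the second table. Step 2: collect its rows.
table = tables[1]
rows = table.find_all("tr")

# One list of cell texts per row; rows without <td> cells come out empty.
rankings = [
    [cell.get_text(strip=True) for cell in row.find_all("td")]
    for row in rows
]
rankings = [cells for cells in rankings if cells]  # drop the header row

print(rankings)
```

Keeping the table in its own variable makes it easy to inspect it before iterating over the rows.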

Zaynaib Giwa