# First block: find() returns None
from bs4 import BeautifulSoup
import requests
url_link = "https://en.wikipedia.org/wiki/List_of_states_and_territories_of_the_United_States"
source = requests.get(url_link).text
soup = BeautifulSoup(source, "lxml")
my_table = soup.find("table", class_="wikitable sortable plainrowheaders jquery-tablesorter")

# Second block: find() returns the table
from bs4 import BeautifulSoup
import requests
url_link = "https://en.wikipedia.org/wiki/List_of_states_and_territories_of_the_United_States"
source = requests.get(url_link).text
soup = BeautifulSoup(source, "lxml")
my_table = soup.find("table", class_="wikitable sortable plainrowheaders")

I was wondering whether anybody knows why the first block of code gives me None, while the second block returns the table I want. Is it something to do with hyphenated class names? Thanks in advance :)

  • This might be relevant: https://stackoverflow.com/questions/11047348/is-this-possible-to-load-the-page-after-the-javascript-execute-using-python – JonSG Mar 14 '23 at 15:13
  • Neither example will produce a successful result: there is a typo in your code, so you should edit it. – HedgeHog Mar 14 '23 at 15:53

1 Answer


No table in the HTML source (use "view source" to check) has the class jquery-tablesorter, so the first find() has nothing to match and returns None. Remember that BeautifulSoup does not run client-side JavaScript the way Selenium does, so classes such as jquery-tablesorter that are added by JavaScript libraries (or by any other JavaScript code) are not visible to BeautifulSoup.
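
To illustrate the point, here is a minimal sketch that runs without a network connection. STATIC_HTML is a hypothetical stand-in for the raw HTML the server returns (it is not the real Wikipedia markup), and "html.parser" is used instead of "lxml" only so that no extra parser needs to be installed. It also shows select_one() with a CSS selector as one possible alternative, since a selector matches each class individually rather than the whole attribute string.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the server's raw HTML: the table carries only
# the classes present before any JavaScript has run.
STATIC_HTML = """
<table class="wikitable sortable plainrowheaders">
  <tr><th>Name</th></tr>
  <tr><td>Alabama</td></tr>
</table>
"""

soup = BeautifulSoup(STATIC_HTML, "html.parser")

# A multi-word class_ string is compared against the full class attribute,
# so including the JavaScript-added class finds nothing.
with_js_class = soup.find(
    "table", class_="wikitable sortable plainrowheaders jquery-tablesorter"
)

# The class string exactly as it appears in the static source does match.
exact = soup.find("table", class_="wikitable sortable plainrowheaders")

# A CSS selector requires each listed class to be present, regardless of
# order or of any additional classes on the element.
by_selector = soup.select_one("table.wikitable.sortable.plainrowheaders")

print(with_js_class)
print(exact is not None, by_selector is not None)
```

The selector form may be the more robust choice if the order of the classes in the source ever changes, but it still cannot see classes that only exist after JavaScript has run.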

JonSG
  • But when I went to inspect the wikipedia page, I copied and pasted the class of the table exactly as "
    – Neha Rajput Mar 14 '23 at 15:14
  • "Inspect" shows the live version of the page, after JavaScript has been processed. Try again using "view source". – JonSG Mar 14 '23 at 15:15
  • Oh I see! Thank you so much. Do you know if there is a way to match the HTML source code lines to the parts of the website they correspond to with view source like you can do with inspect? – Neha Rajput Mar 14 '23 at 15:25
  • Alas, no. I have always assumed that Chrome's view-source is just the exact text returned as-is from the server. – JonSG Mar 14 '23 at 15:40