Extracting data from nested HTML code with Beautiful Soup

Question

I have a list of HTML links and want to scrape some data from each page using Beautiful Soup. All pages have the same DOM structure.

I want to extract the highlighted piece of data (in this case, "senior"), but I don't know what to do next with my code:

import requests
from bs4 import BeautifulSoup

for link in links:
    response = requests.get(link)
    soup = BeautifulSoup(response.content, 'html.parser')

Does this answer your question? [How to find elements by class](https://stackoverflow.com/questions/5041008/how-to-find-elements-by-class) — bertdida, Aug 23 '20 at 13:19
Unfortunately, .findAll doesn't work. It raises an error: 'NoneType' object is not callable. If I use .findall, it returns an empty list. I've used soup.find_all("div", {"class": "css-1ji7bvd"}) — beginsql, Aug 23 '20 at 15:00
https://justjoin.it/offers/ulam-labs-frontend-developer I did some research, and I suspect the only way to extract this data is to use some kind of headless browser, such as Chromium, because each page relies heavily on JavaScript. To be certain, I'd like to hear this from someone more experienced in this matter — beginsql, Aug 23 '20 at 16:34

score 0 · Answer 1 · answered Aug 24 '20 at 08:08

The items you mention are not in the HTML source that the server returns. The page is rendered by JavaScript, so requests and Beautiful Soup never see the div tag you are looking for. I recommend using Selenium to grab the data by class name or XPath. Please refer to the sample code below.
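The claim that the element is missing from the raw HTML can be checked without a browser. The following is a minimal sketch using only the standard library; the two HTML strings are made-up stand-ins (not the real responses from the site), and `ClassFinder` and `tags_with_class` are hypothetical helper names introduced for illustration.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Collect the tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.matches.append(tag)


def tags_with_class(html, class_name):
    """Return the tag names in `html` that carry `class_name`."""
    finder = ClassFinder(class_name)
    finder.feed(html)
    return finder.matches


# Hypothetical server response for a JavaScript-rendered page: an empty
# mount point plus a script tag, with no job data in the markup.
raw_html = (
    '<html><body><div id="root"></div>'
    '<script src="/app.js"></script></body></html>'
)

# Hypothetical DOM after a browser has executed the script.
rendered_html = (
    '<html><body><div id="root">'
    '<div class="css-1ji7bvd">Senior</div></div></body></html>'
)

print(tags_with_class(raw_html, "css-1ji7bvd"))       # []
print(tags_with_class(rendered_html, "css-1ji7bvd"))  # ['div']
```

In the question's loop, `response.text` could be passed in place of `raw_html`. If the class is absent from the raw response, that would explain the empty list from `find_all`, and a browser-driven approach such as the Selenium sample that follows is the usual next step.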

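from selenium import webdriver
from selenium.webdriver.common.by import By

url = "https://justjoin.it/offers/ulam-labs-frontend-developer"
driver = webdriver.Firefox()  # launches a local Firefox instance
try:
    driver.get(url)  # the browser executes the page's JavaScript
    # find_element_by_xpath was removed in recent Selenium 4 releases;
    # find_element(By.XPATH, ...) is the supported form.
    div_element = driver.find_element(By.XPATH, "/html/body/div[1]/div[3]/div[1]/div/div[2]/div[1]/div[2]/div[4]/div[2]")
    print(div_element.text)
finally:
    driver.quit()  # close the browser and end the WebDriver session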
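An absolute XPath like the one above tends to break whenever the page layout shifts. Since the answer also mentions locating by class name, here is a small sketch of building a relative, class-based XPath instead. The helper name `xpath_for_class` is hypothetical, and the class `css-1ji7bvd` is taken from the asker's comment; generated class names like this may change between site builds, so treat it as an assumption.

```python
def xpath_for_class(tag, class_name):
    """Build a relative XPath matching `tag` elements that carry `class_name`."""
    # Padding both sides with spaces makes this a whole-token match, so
    # "css-1" does not match an element whose class is "css-12".
    return (
        f"//{tag}[contains(concat(' ', normalize-space(@class), ' '), "
        f"' {class_name} ')]"
    )


print(xpath_for_class("div", "css-1ji7bvd"))
```

The resulting string can be passed to Selenium as `driver.find_element(By.XPATH, xpath_for_class("div", "css-1ji7bvd"))`. For a single class, `driver.find_element(By.CLASS_NAME, "css-1ji7bvd")` is a shorter alternative.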
