I have a list of HTML links and want to scrape some data from each of them using Beautiful Soup. All pages have the same DOM structure:

[Image: screenshot of the page's DOM with the target element highlighted]

I want to extract the highlighted piece of data (in this case, "senior"), but I don't know what to do next with my code:

import requests
from bs4 import BeautifulSoup

# links is the list of page URLs to scrape
for link in links:
    response = requests.get(link)
    soup = BeautifulSoup(response.content, 'html.parser')
beginsql
  • Does this answer your question? [How to find elements by class](https://stackoverflow.com/questions/5041008/how-to-find-elements-by-class) – bertdida Aug 23 '20 at 13:19
  • Unfortunately .findAll doesn't work. It returns me an error: 'NoneType' object is not callable. If I use .findall it returns me an empty list. I've used soup.find_all("div", {"class": "css-1ji7bvd"}) – beginsql Aug 23 '20 at 15:00
  • Can you share the URL? – Andrej Kesely Aug 23 '20 at 16:19
  • https://justjoin.it/offers/ulam-labs-frontend-developer I did some research and I suspect the only way to extract some data is to use some kind of headless browser like chromium for example. It's because each website is heavy in javascript but to be certain I need to hear this from someone who is more experienced in this matter – beginsql Aug 23 '20 at 16:34

1 Answer

The element you are looking for is not in the HTML source that requests downloads. The page is rendered by JavaScript, so Beautiful Soup never sees that div tag. I recommend using Selenium, which loads the page in a real browser, to grab the data by class name or XPath. Please refer to the sample code below.

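from selenium import webdriver
from selenium.webdriver.common.by import By

url = "https://justjoin.it/offers/ulam-labs-frontend-developer"
driver = webdriver.Firefox()
driver.get(url)

# Locate the element by its absolute XPath (Selenium 4 locator API)
div_element = driver.find_element(By.XPATH, "/html/body/div[1]/div[3]/div[1]/div/div[2]/div[1]/div[2]/div[4]/div[2]")
print(div_element.text)
driver.quit()  # closes the browser and ends the WebDriver session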
Pooria_T