I am trying to learn data scraping using python and have been using the Requests and BeautifulSoup4 libraries. It works well for normal html websites. But when I tried to get some data out of websites where the data loads after some delay, I found that I get an empty value. An example would be
from bs4 import BeautifulSoup
from operator import itemgetter
from selenium import webdriver
url = "https://www.example.com/;1"
browser = webdriver.PhantomJS()
browser.get(url)
html = browser.page_source
soup = BeautifulSoup(html, 'lxml')
a = soup.find('span', 'buy')
print(a)
I am trying to grab the from here: (value)
I have already referred a similar topic and tried executing my code on similar lines as the solution provided here. But somehow it doesnt seem to work. I am a novice here so need help getting this work. How to scrape html table only after data loads using Python Requests?
The table (content) is probably generated by JavaScript and thus can't be "seen". I am using python3.6 / PhantomJS / Selenium as proposed by a lot of answers here.