I am working on building a job board which involves scraping job data from company sites. I am currently trying to scrape Twilio at https://www.twilio.com/company/jobs. However, I am not getting the job data its self -- that seems to be being missed by the scraper. Based on other questions this could be because the data is in JavaScript, but that is not obvious.
Here is the code I am using:
# Set the URL you want to webscrape from
url = 'https://www.twilio.com/company/jobs'
# Connect to the URL
response = requests.get(url)
if "_job-title" in response.text:
print "Found the jobs!" # FAILS
# Parse HTML and save to BeautifulSoup object
soup = BeautifulSoup(response.text, "html.parser")
# To download the whole data set, let's do a for loop through all a tags
for i in range(0,len(soup.findAll('a', class_='_job'))): # href=True))): #'a' tags are for links
one_a_tag = soup.findAll('a', class_='_job')[i]
link = one_a_tag['href']
print link # FAILS
Nothing displays when this code is run. I have tried using urllib2 as well and that has the same problem. Selenium works but it is too slow for the job. Scrapy looks like it could be promising but I am having install issues with it.