I am trying to scrape a job website (jobs.at). The code below looks up the names of the job results and saves them in a list of dictionaries. It works for the first 15 results. The problem is that after every 15th search result, the website inserts an ad between the job search results. The URL of the website is: https://www.jobs.at/j/personalverrechnung?dateFrom=all
The HTML code of the ad is the following:
<form method="POST" action="https://www.jobs.at/jobalarm" accept-charset="UTF-8" class="c-job-alarm-form j-c-card j-u-margin-bottom-xl j-u-overflow-hidden j-u-background-color-cyan-50" data-logged-in="false" novalidate data-form-name="job-alarm-form">…</form>
Can you think of any way to skip over this ad and collect all search results?
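    # `soup` is assumed to be the BeautifulSoup object parsed from the search page
    jobs = []
    for search_result in soup.find_all('div', class_="c-search-results"):
        # search inside the current results container, not the whole page
        for job_name in search_result.find_all("h2", class_="c-job-headline j-u-typo-m j-u-font-weight-bold j-u-margin-bottom-3xs"):
            try:
                job_name = job_name.a.text
            except AttributeError:
                # the headline has no <a> tag
                job_name = None
            jobs.append({'job_name': job_name})
    print(jobs)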
The URL of the job search with the given job filter: https://www.jobs.at/j/personalverrechnung?dateFrom=all
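For reference, here is a minimal sketch of the kind of approach I am hoping for. It runs against a small, made-up HTML snippet rather than the live page, so the markup in `SAMPLE_HTML`, the job titles, and the helper name `extract_job_names` are all assumptions for illustration. The idea is to remove the `c-job-alarm-form` ad from the parsed tree and to match headlines on the single class `c-job-headline` instead of the full class string. It would only help if the results after the ad are actually present in the HTML that was downloaded.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup: the real page structure is an assumption.
SAMPLE_HTML = """
<div class="c-search-results">
  <h2 class="c-job-headline j-u-typo-m j-u-font-weight-bold"><a href="/job/1">Personalverrechner/in</a></h2>
  <form class="c-job-alarm-form j-c-card" data-form-name="job-alarm-form">
    <h2>Job alarm</h2>
  </form>
  <h2 class="c-job-headline j-u-typo-m"><a href="/job/2">Lohnverrechner/in</a></h2>
  <h2 class="c-job-headline"></h2>
</div>
"""


def extract_job_names(html):
    """Return one {'job_name': ...} dict per job headline, ignoring the ad form."""
    soup = BeautifulSoup(html, "html.parser")

    # Remove the job-alarm ad so nothing inside it can be matched by accident.
    for ad in soup.find_all("form", class_="c-job-alarm-form"):
        ad.decompose()

    jobs = []
    # Passing a single class name matches any <h2> that carries that class,
    # regardless of which other utility classes are also present.
    for headline in soup.find_all("h2", class_="c-job-headline"):
        link = headline.find("a")
        # Headlines without a link are recorded as None instead of raising.
        job_name = link.get_text(strip=True) if link else None
        jobs.append({"job_name": job_name})
    return jobs


jobs = extract_job_names(SAMPLE_HTML)
print(jobs)
```

With the real page, the HTML string would come from the downloaded response instead of `SAMPLE_HTML`.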