I'm running a Python Selenium script in an AWS Lambda function.
I'm scraping this page: Link
The scraper itself works fine, but the pagination to the next page stopped working. It had worked for many months before that.
I exported a screenshot via:
png = driver.get_screenshot_as_base64()
It shows this page instead of the second page:
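(For context: I get the screenshot out of Lambda by copying the printed base64 string from the logs and decoding it locally. A minimal sketch, where screenshot.b64 is just a file holding the copied log output:)

import base64

# screenshot.b64 holds the base64 string copied from the Lambda logs
with open("screenshot.b64") as f:
    data = f.read().strip()

# decode it back into a viewable PNG
with open("screenshot.png", "wb") as f:
    f.write(base64.b64decode(data))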
I run this code (simplified version):
while url:
    driver.get(url)
    png = driver.get_screenshot_as_base64()
    print(png)
    button_next = driver.find_elements_by_class_name("PaginationArrowLink-sc-imp866-0")
    try:
        # the last matching element is the "next page" arrow;
        # on the last page there is no such link, so this raises
        url = button_next[-1].get_attribute("href")
        print("button_next_url: " + str(url))
    except (IndexError, AttributeError):
        url = ""
        print("Error in URL")
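One thing I'm not sure about: would clicking the arrow element instead of calling driver.get() on its href behave differently? A sketch of that variant (untested, same class name as above):

# Variant: click the pagination arrow in-page instead of loading its href,
# so the navigation keeps the same referrer/cookie context as a real click.
button_next = driver.find_elements_by_class_name("PaginationArrowLink-sc-imp866-0")
if button_next:
    button_next[-1].click()
    url = driver.current_url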
The interesting thing is that the printed URL is totally fine: when I open it manually in a browser, it loads page 2:
https://www.stepstone.de/5/ergebnisliste.html?what=Berufskraftfahrer&searchorigin=Resultlist_top-search&suid=1faad076-5348-48d8-9834-4e0d9a836e34&of=25&action=paging_next
But "driver.get(url)" leads to the error page on the screenshot.
Is this some sort of scrape protection from the website? Or is there another reason it stopped working from one day to the next?