I'm trying to get all the events, plus additional metadata for each event, from this webpage: https://alando-palais.de/events
My problem is that the resulting HTML doesn't contain the information I'm looking for. I guess the events are "hidden" behind a PHP script at this URL: https://alando-palais.de/wp/wp-admin/admin-ajax.php
Any idea how to wait until the page is completely loaded, or which method I have to use to get the event information?
This is my script right now:
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

if __name__ == '__main__':
    target_url = 'https://alando-palais.de/events'
    # target_url = 'https://alando-palais.de/wp/wp-admin/admin-ajax.php'

    # Fetch the page; requests does not execute JavaScript, so only the
    # HTML sent in the initial response is parsed here.
    soup = BeautifulSoup(requests.get(target_url).text, 'html.parser')
    print(soup)

    # Print every link found in that HTML.
    links = soup.find_all('a', href=True)
    for x, link in enumerate(links):
        print(x, link['href'])

    # for image in images:
    #     print(urljoin(target_url, image))
Expected output would be something like:
- Date: 08.03.2019
- Title: Penthouse Club Special: Maiwai & Friends
- img: https://alando-palais.de/wp/wp-content/uploads/2019/02/0803_MaiwaiFriends-500x281.jpg
These values are taken from this fragment of the HTML:
<div class="vc_gitem-zone vc_gitem-zone-b vc_custom_1547045488900 originalbild vc-gitem-zone-height-mode-auto vc_gitem-is-link" style="background-image: url(https://alando-palais.de/wp/wp-content/uploads/2019/02/0803_MaiwaiFriends-500x281.jpg) !important;">
<a href="https://alando-palais.de/event/penthouse-club-special-maiwai-friends" title="Penthouse Club Special: Maiwai & Friends" class="vc_gitem-link vc-zone-link"></a> <img src="https://alando-palais.de/wp/wp-content/uploads/2019/02/0803_MaiwaiFriends-500x281.jpg" class="vc_gitem-zone-img" alt=""> <div class="vc_gitem-zone-mini">
<div class="vc_gitem_row vc_row vc_gitem-row-position-top"><div class="vc_col-sm-6 vc_gitem-col vc_gitem-col-align-left"> <div class="vc_gitem-post-meta-field-Datum eventdatum vc_gitem-align-left"> 08.03.2019
</div>
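For reference, here is a minimal sketch of how I would pull the three fields out of that markup once I manage to get hold of it. It only parses a hard-coded copy of the fragment above (abridged, with closing tags added so it stands alone) and does not solve the loading problem itself. The function name `parse_events` and the assumption that every event tile is a `div` with the class `vc_gitem-zone` are mine, based on this single fragment.

```python
from bs4 import BeautifulSoup

# Abridged copy of the fragment above; closing tags added so it parses on its own.
SAMPLE_HTML = """
<div class="vc_gitem-zone vc_gitem-zone-b originalbild vc_gitem-is-link">
<a href="https://alando-palais.de/event/penthouse-club-special-maiwai-friends" title="Penthouse Club Special: Maiwai & Friends" class="vc_gitem-link vc-zone-link"></a>
<img src="https://alando-palais.de/wp/wp-content/uploads/2019/02/0803_MaiwaiFriends-500x281.jpg" class="vc_gitem-zone-img" alt="">
<div class="vc_gitem-zone-mini">
<div class="vc_gitem-post-meta-field-Datum eventdatum vc_gitem-align-left"> 08.03.2019
</div>
</div>
</div>
"""


def parse_events(html):
    """Return one dict (date, title, img) per event tile found in the HTML."""
    soup = BeautifulSoup(html, 'html.parser')
    events = []
    # Assumption: each event tile is a div whose class list contains "vc_gitem-zone".
    for zone in soup.find_all('div', class_='vc_gitem-zone'):
        link = zone.find('a', class_='vc_gitem-link')
        image = zone.find('img', class_='vc_gitem-zone-img')
        date = zone.find('div', class_='eventdatum')
        # Skip tiles that are missing any of the three pieces.
        if link is None or image is None or date is None:
            continue
        events.append({
            'date': date.get_text(strip=True),
            'title': link.get('title', ''),
            'img': image.get('src', ''),
        })
    return events


if __name__ == '__main__':
    for event in parse_events(SAMPLE_HTML):
        print(event)
```

So the parsing part is not the issue; what I'm missing is a way to obtain HTML that actually contains these tiles.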