EDIT:
I found a way to do it by clicking on the country elements; see my answer.
One remaining question would make this better:
When I execute scrollIntoView(true) on a country <li>, it ends up under another element (<div class="sportList_subtitle">Desportos</div>) and is not clickable.
Is there a JavaScript or Selenium function like "scrollIntoClickable"?
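I'm not aware of a built-in "scrollIntoClickable", but one thing worth trying is to scroll the element to the middle of the viewport instead of the top, so it is less likely to land under a header pinned to the top. A minimal sketch, assuming `driver` is a Selenium WebDriver and `element` is the country <li> (the helper name is hypothetical, and whether centering actually clears the overlapping <div> depends on the page layout):

```python
# JavaScript handed to Selenium's execute_script; arguments[0] is the element.
# scrollIntoView accepts an options object; block: 'center' aligns the element
# with the vertical middle of the viewport instead of its top edge.
SCROLL_TO_CENTER = "arguments[0].scrollIntoView({block: 'center', inline: 'nearest'});"


def scroll_to_center_and_click(driver, element):
    """Center the element in the viewport, then click it (hypothetical helper)."""
    driver.execute_script(SCROLL_TO_CENTER, element)
    element.click()
```

Usage would be something like `scroll_to_center_and_click(driver, country_li)` for each country element.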
ORIGINAL:
I'm trying to scrape info from the Betclic website with Python and BeautifulSoup + Selenium.
The URL for each game has the structure "domain"/"sports_url"/"competition_url"/"match_url".
Example: https://www.betclic.pt/futebol-s1/liga-dos-campeoes-c8/rennes-chelsea-m2695669
You can try it in your language: they translate the actual URL string, but the structure and IDs are the same.
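To make that structure concrete, here is a small sketch that splits such a URL into its parts. It assumes, based only on the example above, that every path segment ends in a type letter (`s`, `c` or `m`) followed by a numeric ID; the function name and the regex are mine, not anything Betclic documents.

```python
import re
from urllib.parse import urlparse

# Assumed segment shape: "<slug>-<letter><id>", e.g. "futebol-s1",
# "liga-dos-campeoes-c8", "rennes-chelsea-m2695669".
SEGMENT_RE = re.compile(r"^(?P<slug>.+)-(?P<kind>[scm])(?P<id>\d+)$")


def parse_betclic_url(url):
    """Return {kind: (slug, id)} for each path segment that matches the pattern."""
    parts = {}
    for segment in urlparse(url).path.strip("/").split("/"):
        match = SEGMENT_RE.match(segment)
        if match:
            parts[match["kind"]] = (match["slug"], int(match["id"]))
    return parts


example = "https://www.betclic.pt/futebol-s1/liga-dos-campeoes-c8/rennes-chelsea-m2695669"
print(parse_betclic_url(example))
```

Since only the slug is translated, keying on the letter and the ID should work for any language version of the site, if the assumption holds.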
The only thing left is grabbing all the different "competition_url" values.
So my question is: starting from the "sports_url" (https://www.betclic.pt/futebol-s1), how can I get all the "competition_url" values under it?
The problem is the "hidden" URLs under each country's name in the left panel. They only appear after you click the arrow next to the country's name, like a drop-down list. The click adds the class "is-active" to the <li> for that country and also appends a <ul> at the end of that <li>. It's this added <ul> that holds the list of URLs I'm trying to get.
Code before click:
<!---->
<li class="sportList_item has-children ng-star-inserted" routerlinkactive="active-link" id="rziat-DE">
<div class="sportList_itemWrapper prebootFreeze">
<div class="sportlist_icon flagsIconBg is-DE"></div>
<div class="sportlist_name">Alemanha</div>
</div>
<!---->
</li>
Code after click (reduced for presentation):
<li class="sportList_item has-children ng-star-inserted is-active" routerlinkactive="active-link" id="rziat-DE">
<div class="sportList_itemWrapper prebootFreeze">
<div class="sportlist_icon flagsIconBg is-DE"></div>
<div class="sportlist_name">Alemanha</div>
</div>
<!---->
<ul class="sportList_listLv2 ng-star-inserted">
<!---->
<li class="sportList_item ng-star-inserted" routerlinkactive="active-link">
<a class="sportList_itemWrapper prebootFreeze" id="competition-link-5" href="/futebol-s1/alemanha-bundesliga-c5">
<div class="sportlist_icon"></div>
<div class="sportlist_name">Alemanha - Bundesliga</div>
</a>
</li>(...)
</li>(...)
</li>(...)
</li>
</ul>
</li>
In this example, it is "/futebol-s1/alemanha-bundesliga-c5" that I'm looking for.
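For reference, once the expanded markup is available, pulling the hrefs out is the easy part. The sketch below uses only the standard library's `html.parser` so it runs without extra packages, on a trimmed copy of the "after click" snippet; it assumes every competition link has an id starting with "competition-link-", which I have only seen in this one example. With BeautifulSoup, the equivalent should be `soup.select("a[id^='competition-link-']")`.

```python
from html.parser import HTMLParser


class CompetitionLinkParser(HTMLParser):
    """Collect the href of every <a> whose id starts with "competition-link-"."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        element_id = attributes.get("id") or ""
        if tag == "a" and element_id.startswith("competition-link-"):
            href = attributes.get("href")
            if href:
                self.hrefs.append(href)


# Trimmed copy of the expanded country markup shown above.
expanded_html = """
<li class="sportList_item has-children ng-star-inserted is-active" id="rziat-DE">
  <ul class="sportList_listLv2 ng-star-inserted">
    <li class="sportList_item ng-star-inserted">
      <a class="sportList_itemWrapper prebootFreeze" id="competition-link-5"
         href="/futebol-s1/alemanha-bundesliga-c5">
        <div class="sportlist_name">Alemanha - Bundesliga</div>
      </a>
    </li>
  </ul>
</li>
"""

parser = CompetitionLinkParser()
parser.feed(expanded_html)
print(parser.hrefs)
```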
Is there a way to get all those URLs, or the "hidden" <ul> for that matter?
Maybe a way to simulate the click and parse the HTML code again?
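Something along these lines is what I have in mind. It is only a sketch: `driver` is assumed to be a Selenium WebDriver that has already loaded the "sports_url" page, the selectors are taken from the markup above, the function name is hypothetical, and there are no waits, so on the real page the links may not be rendered yet when they are read.

```python
# Locator strategy string; this is the value of selenium's By.CSS_SELECTOR,
# spelled out so the sketch has no import-time dependency on selenium.
CSS = "css selector"


def collect_competition_urls(driver):
    """Expand every collapsed country, then return the competition hrefs found."""
    countries = driver.find_elements(CSS, "li.sportList_item.has-children")
    for country in countries:
        classes = (country.get_attribute("class") or "").split()
        if "is-active" not in classes:
            # Clicking adds "is-active" and appends the <ul> with the links.
            country.click()

    links = driver.find_elements(CSS, "a[id^='competition-link-']")
    hrefs = {link.get_attribute("href") for link in links}
    return sorted(href for href in hrefs if href)
```

Note that Selenium's `get_attribute("href")` normally returns the resolved absolute URL rather than the relative path shown in the HTML.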
Thanks in advance!