I'm looking for a way to exclude image links, i.e. links that do not contain any anchor text. The code below compiles the data I want, but it also picks up unwanted URLs from thumbnail/image links on the pages:
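from bs4 import BeautifulSoup

# browser, list_urls and links_with_text are assumed to be defined earlier
for url in list_urls:
    browser.get(url)
    soup = BeautifulSoup(browser.page_source, "html.parser")
    # Collects every <a> on the page, including image-only links
    for line in soup.find_all('a'):
        href = line.get('href')
        links_with_text.append([url, href])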
The image links on the scraped pages all have the same format, and they all sit inside a div with the class "related-content":
<a href="https://XXXX/" ><picture class="crp_thumb crp_featured" title="XXXX">
<source type="image/webp" srcset="https://XXXX.jpg.webp"/>
<img width="150" height="150" src="https://XXXX.jpg" alt="XXXX"/>
</picture></a>
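One possible approach, as a minimal sketch rather than a definitive fix: keep only anchors whose visible text is non-empty, and optionally skip anything inside the "related-content" div. The function name `text_links`, its `skip_div_class` parameter, and the sample markup are all hypothetical and made up for illustration; only `find_all`, `find_parent` and `get_text` are BeautifulSoup calls.

```python
from bs4 import BeautifulSoup


def text_links(html, page_url, skip_div_class=None):
    """Return [page_url, href] pairs for anchors that carry visible text.

    skip_div_class is a hypothetical optional parameter: when given, anchors
    nested inside a <div> with that class are skipped as well.
    """
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for a in soup.find_all("a", href=True):
        # Optionally skip everything inside the thumbnail container.
        if skip_div_class and a.find_parent("div", class_=skip_div_class):
            continue
        # An anchor wrapping only <picture>/<img> has no text of its own;
        # attributes such as alt or title are not part of get_text().
        if not a.get_text(strip=True):
            continue
        pairs.append([page_url, a["href"]])
    return pairs


# Hypothetical sample markup, loosely modelled on the snippet above.
sample = """
<p><a href="https://example.com/article">Read the article</a></p>
<div class="related-content">
  <a href="https://example.com/thumb"><picture class="crp_thumb crp_featured" title="t">
    <source type="image/webp" srcset="https://example.com/t.jpg.webp"/>
    <img width="150" height="150" src="https://example.com/t.jpg" alt="t"/>
  </picture></a>
  <a href="https://example.com/related">Related post</a>
</div>
<a href="https://example.com/bare-image"><img src="https://example.com/i.jpg" alt="i"/></a>
"""

if __name__ == "__main__":
    print(text_links(sample, "https://example.com/page"))
    print(text_links(sample, "https://example.com/page", skip_div_class="related-content"))
```

In the loop from the question, this would presumably be called as `links_with_text.extend(text_links(browser.page_source, url))`. The text check alone drops image-only links wherever they appear; passing `skip_div_class="related-content"` additionally drops text links inside that div, which may or may not be wanted.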