I'm trying to get information from the URL shown in the code below.
I want to extract the bio text that begins "Hot 8 Brass Band are a Grammy-nominated New Orleans based brass band, whose sound... " and so on.
My approach: I want to extract the info without using the explicit div class name (since that tends to change). So I locate the "About Hot 8 Brass Band" div by building its text from the artist variable, and then I want to navigate from it to its following siblings, their child divs, etc.
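To make the intended traversal concrete, here is a minimal sketch using only the Python standard library. The markup in `SAMPLE` is made up for illustration and is an assumption, not the real page; the helper name `following_sibling_divs` is also hypothetical. It only shows the idea of finding a div by its heading text and then reading the div siblings that come after it.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, NOT the real page: a heading div followed by sibling divs.
SAMPLE = """
<section>
  <div>Other content</div>
  <div>About Hot 8 Brass Band</div>
  <div><div>Hot 8 Brass Band are a Grammy-nominated New Orleans based brass band.</div></div>
  <div>Read More</div>
</section>
"""


def following_sibling_divs(root, heading_text):
    """Return the div siblings that come after the div whose text equals heading_text."""
    # ElementTree keeps no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for div in root.iter("div"):
        if (div.text or "").strip() == heading_text:
            parent = parent_of.get(div)
            if parent is None:  # the heading is the root, so it has no siblings
                return []
            siblings = list(parent)
            position = siblings.index(div)
            return [s for s in siblings[position + 1:] if s.tag == "div"]
    return []


artist = "Hot 8 Brass Band"
root = ET.fromstring(SAMPLE)
siblings = following_sibling_divs(root, "About " + artist)
# Take the first following div and flatten the text of its nested children.
bio = "".join(siblings[0].itertext()).strip() if siblings else None
print(bio)
```

In XPath 1.0 terms this corresponds to a `following-sibling::div` axis step, where positional predicates count from 1.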
Code:
from selenium.common.exceptions import ElementNotVisibleException, NoSuchElementException, TimeoutException

url = "https://www.bandsintown.com/e/1024477910-hot-8-brass-band-at-the-howlin'-wolf?came_from=253&utm_medium=web&utm_source=city_page&utm_campaign=event"
driver.get(url)

# Get artist
try:
    artist = driver.find_elements_by_css_selector('a[href^="https://www.bandsintown.com/a/"] h1')
    artist = artist[0].text
    print(artist)
except (ElementNotVisibleException, NoSuchElementException, TimeoutException):
    print("artist doesn't exist")

# Get bio info: expand the bio first if a "Read More" link is present
try:
    readMoreBio = driver.find_element_by_xpath("//div[text()='Read More']").click()
    print("Read More Bio Clicked")
except (ElementNotVisibleException, NoSuchElementException, TimeoutException):
    pass

# Once "Read More" is clicked, get the full bio info
try:
    artistBioDiv = driver.find_elements_by_xpath("(//div[text()='About " + artist + "'])[0]/following-sibling/following-sibling::div")
    print("artistBioDiv is: ", artistBioDiv)
except (ElementNotVisibleException, NoSuchElementException, TimeoutException):
    print("artist bio doesn't exist")
This just seems to access an empty element, i.e. it's not finding the bio paragraph.
Here's the HTML structure: