Using BeautifulSoup to extract part of class name

Question

Because the class names change for each item, I'd like to extract the information based on part of the class name (carrier-text in the example). However, my attempt below does not work:

from bs4 import BeautifulSoup

html = """
<div class="dErF-carrier-text">
Alaska Airlines 398 </div>
"""

soup = BeautifulSoup(html, 'html.parser')
text = soup.find('div', class_="carrier-text").text
print(text)

I believe you are looking for this: https://stackoverflow.com/questions/34660417/beautiful-soup-if-class-contains-or-regex — bguest, Nov 28 '21 at 03:06

HedgeHog · Answer 1 · 2021-11-28T09:08:12.667

How to fix?

You can solve this with a CSS selector that checks whether the class attribute contains your substring:

soup.select_one('div[class*="carrier-text"]')

Please note

This works for your specific example, but take care: if other elements have a class that also contains your substring, you may need a more specific selector.
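
As a rough sketch of what "more specific" could look like: the markup below is hypothetical (the second class name, dErF-carrier-text-label, is invented for illustration) and assumes the part you care about always sits at the end of the class attribute. In that case the "ends with" attribute selector ($=) may be enough to tell the two elements apart.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: both class names contain "carrier-text",
# but only one of them ends with it.
html = """
<div class="dErF-carrier-text-label">Carrier</div>
<div class="dErF-carrier-text">
Alaska Airlines 398 </div>
"""

soup = BeautifulSoup(html, 'html.parser')

# *= matches the substring anywhere in the class attribute, so both divs match.
loose = soup.select('div[class*="carrier-text"]')

# $= matches only when the class attribute value ends with the given string.
# Note: this looks at the whole attribute value, so it would miss an element
# that carries additional classes after the one you are looking for.
strict = soup.select('div[class$="-carrier-text"]')

loose_texts = [el.get_text(strip=True) for el in loose]
strict_texts = [el.get_text(strip=True) for el in strict]

print(loose_texts)
print(strict_texts)
```

Whether $= is the right choice depends on how the real class names are generated, so check a few items on the page before relying on it.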

Options

Your question is not entirely clear: do you want to extract the text or the class?

Get the text

soup.select_one('div[class*="carrier-text"]').get_text(strip=True)

Get the class

soup.select_one('div[class*="carrier-text"]')['class']
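
One thing to be aware of: BeautifulSoup treats class as a multi-valued attribute, so ['class'] gives you a list of class names rather than a single string. A minimal sketch of pulling out the one full class name that contains the substring (the variable names here are just for illustration):

```python
from bs4 import BeautifulSoup

html = '<div class="dErF-carrier-text">Alaska Airlines 398 </div>'
soup = BeautifulSoup(html, 'html.parser')

# ['class'] returns a list, even when the element has only one class.
classes = soup.select_one('div[class*="carrier-text"]')['class']

# Pick the first class name that contains the substring.
full_name = next(c for c in classes if 'carrier-text' in c)

print(classes)
print(full_name)
```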

Example

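from bs4 import BeautifulSoup

html = """
<div class="dErF-carrier-text">
Alaska Airlines 398 </div>
"""

soup = BeautifulSoup(html, 'html.parser')

# Match the first div whose class attribute contains "carrier-text",
# then return its text with surrounding whitespace stripped
print(soup.select_one('div[class*="carrier-text"]').get_text(strip=True))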

Output

Alaska Airlines 398