I have a question about BeautifulSoup in Python 3. I have spent a couple of hours on it but have not solved it yet.
This is my soup:
print(soup.prettify())
# REMEMBER THIS SOUP IS DYNAMIC
# <html>
# <body>
# <div class="title" itemtype="http://schema.org/FoodEstablishment">
# <div class="address" itemtype="http://schema.org/PostalAddress">
# <div class="address-inset">
# <p itemprop="name">33 San Francisco</p>
# </div>
# </div>
# <div class="image">
# <img src=""/>
# <span class="subtitle">image subtitle</span>
# </div>
# <a itemprop="name">The Dormouse's story</a>
# </div>
# </body>
# </html>
I need to extract the two texts marked with itemprop="name":
The Dormouse's story
and 33 San Francisco
But I also need a way to tell which parent element each one belongs to (here, the last part of the parent's itemtype).
Expected output:
{
"FoodEstablishment": "The Dormouse's story",
"PostalAddress": "33 San Francisco"
}
Remember that the soup is always dynamic and has many children elements in it.
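One possible approach, as a minimal sketch rather than a definitive answer: find every tag with itemprop="name", walk up to its nearest ancestor that has an itemtype attribute, and use the last path segment of that itemtype URL as the key. The function name `extract_names`, the choice of the built-in `html.parser`, and the rule for deriving the key from the URL are all assumptions made for illustration, based only on the expected output above.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (the real soup is dynamic).
html = """
<html>
<body>
<div class="title" itemtype="http://schema.org/FoodEstablishment">
<div class="address" itemtype="http://schema.org/PostalAddress">
<div class="address-inset">
<p itemprop="name">33 San Francisco</p>
</div>
</div>
<div class="image">
<img src=""/>
<span class="subtitle">image subtitle</span>
</div>
<a itemprop="name">The Dormouse's story</a>
</div>
</body>
</html>
"""


def extract_names(soup):
    """Map each itemtype (last URL segment) to the text of its itemprop="name" tag.

    Hypothetical helper: not part of BeautifulSoup itself.
    """
    result = {}
    for tag in soup.find_all(attrs={"itemprop": "name"}):
        # Nearest ancestor carrying an itemtype attribute, however deeply nested.
        parent = tag.find_parent(attrs={"itemtype": True})
        if parent is None:
            continue  # no enclosing itemtype, so there is no key to file it under
        # Assumption: the key is the last path segment of the itemtype URL.
        key = parent["itemtype"].rstrip("/").rsplit("/", 1)[-1]
        result[key] = tag.get_text(strip=True)
    return result


soup = BeautifulSoup(html, "html.parser")
result = extract_names(soup)
print(result)
```

Because `find_parent` searches upward through all ancestors, this should not depend on how many wrapper elements sit between the name and its itemtype container. Note that if the same itemtype appears more than once, later names overwrite earlier ones in this sketch; collecting them into a list per key would be one way around that.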