Problem extracting text with Beautiful Soup in Python

Question

I'm trying to extract text from a forum website. It works well, but if a comment contains two lines, only the first line is extracted. See the example below:

<div class="wwCommentBody">             
   <blockquote class="postcontent restore " style="padding: 10px;">Happy birthday bro! <br>
    Have a nice day <img src="images/emoji/smile.png" border="0" alt="" title="Smile" 
    class="inlineimg"> 
     </blockquote>            
</div>

r = requests.get("https://example.com/threads/73956/page2", headers=headers, cookies=cookies)
soup = BeautifulSoup(r.content, "html.parser")
comments = soup.find_all('div',{'class':'wwCommentBody'})
for div in comments:
    text = (div.find('blockquote',{'class':'postcontent restore'}))
    first_child = next(text.children, None)
    if first_child is not None:
        print(first_child.string.strip())

Does this answer your question? [BeautifulSoup findAll() given multiple classes?](https://stackoverflow.com/questions/18725760/beautifulsoup-findall-given-multiple-classes) — Maurice Meyer, Jun 28 '21 at 13:54

score 2 · Accepted Answer · answered Jun 28 '21 at 14:25

Just extract the blockquote and print its text.

for div in comments:
    bq = div.find('blockquote',{'class':'postcontent restore'})
    print(bq.text)
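
The likely reason the original loop printed only the first line: the `<br>` tag splits the blockquote into several child nodes, and `next(text.children, None)` returns only the first of them, the string before the `<br>`. Here is a minimal, self-contained sketch of the fix using the HTML from the question. The helper name `comment_texts` is made up for illustration, and `get_text(" ", strip=True)` is swapped in for `.text` as one way to trim the surrounding whitespace and join the pieces with a single space.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question.
html = """
<div class="wwCommentBody">
   <blockquote class="postcontent restore " style="padding: 10px;">Happy birthday bro! <br>
    Have a nice day <img src="images/emoji/smile.png" border="0" alt="" title="Smile"
    class="inlineimg">
     </blockquote>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def comment_texts(soup):
    """Hypothetical helper: return the full text of every comment body."""
    texts = []
    for div in soup.find_all("div", class_="wwCommentBody"):
        # class_="postcontent" matches any tag whose class list contains it.
        bq = div.find("blockquote", class_="postcontent")
        if bq is None:
            # Skip comment bodies without a blockquote instead of crashing.
            continue
        # Collect every text node, strip each one, join with one space.
        texts.append(bq.get_text(" ", strip=True))
    return texts


print(comment_texts(soup))
```

Passing `"\n"` instead of `" "` as the separator should keep the two lines of the comment on separate lines, if that is preferred.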

Ram

Thank you so much! It was so easy to do. I have one last little question, please: if I want to get only a specific piece of text from the comment and print it, how can I do that? – Marvel Jun 28 '21 at 15:12
Like what? Could you provide an example? – Ram Jun 28 '21 at 15:40
For example, if there's a post like `( hello my name is Ram from StackOverflow )`, I want only the text from Ram to StackOverflow, so printing it would give `( Ram from StackOverflow )`. – Marvel Jun 28 '21 at 15:44
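
The follow-up is not answered in the thread. One possible reading is "keep everything from a known start word through a known end word", which plain string methods can handle once the comment text has been extracted. The sketch below assumes that reading. `extract_span` is a hypothetical helper, not part of Beautiful Soup, and the start and end markers are taken from the example in the comment.

```python
def extract_span(text, start, end):
    """Return the part of text from start through end (inclusive), or None.

    Hypothetical helper for illustration; assumes both markers are known
    in advance and that end appears after start.
    """
    i = text.find(start)
    if i == -1:
        return None  # start marker not present
    j = text.find(end, i + len(start))
    if j == -1:
        return None  # end marker not present after the start marker
    return text[i:j + len(end)]


comment = "hello my name is Ram from StackOverflow"
print(extract_span(comment, "Ram", "StackOverflow"))
```

If the wanted text follows a pattern rather than fixed words, a regular expression with `re.search` may be a better fit.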
