
I'm trying to extract text from a forum website. It works well, but if a comment contains two lines, only the first line is extracted. See the example below:

<div class="wwCommentBody">             
   <blockquote class="postcontent restore " style="padding: 10px;">Happy birthday bro! <br>
    Have a nice day <img src="images/emoji/smile.png" border="0" alt="" title="Smile" 
    class="inlineimg"> 
     </blockquote>            
</div>
import requests
from bs4 import BeautifulSoup

# headers and cookies are defined elsewhere (not shown)
r = requests.get("https://example.com/threads/73956/page2", headers=headers, cookies=cookies)
soup = BeautifulSoup(r.content, "html.parser")
comments = soup.find_all('div', {'class': 'wwCommentBody'})
for div in comments:
    text = div.find('blockquote', {'class': 'postcontent restore'})
    # Only the first child node is read, so the text after <br> is lost
    first_child = next(text.children, None)
    if first_child is not None:
        print(first_child.string.strip())
Marvel
  • Does this answer your question? [BeautifulSoup findAll() given multiple classes?](https://stackoverflow.com/questions/18725760/beautifulsoup-findall-given-multiple-classes) – Maurice Meyer Jun 28 '21 at 13:54
  • @MauriceMeyer I didn't find a solution yet – Marvel Jun 28 '21 at 14:12

1 Answer


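Just extract the blockquote and print its text. The `.text` property joins every string inside the tag, so the text after the `<br>` is included too.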

for div in comments:
    bq = div.find('blockquote',{'class':'postcontent restore'})
    print(bq.text)
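
As a minimal, self-contained sketch of the same idea, the snippet below parses the HTML from the question directly instead of fetching the page. The helper name `comment_texts` is hypothetical, and using `get_text("\n", strip=True)` rather than `.text` is an optional variation that trims the surrounding whitespace and keeps the two lines separated by a newline.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (no network request needed)
html = """
<div class="wwCommentBody">
   <blockquote class="postcontent restore " style="padding: 10px;">Happy birthday bro! <br>
    Have a nice day <img src="images/emoji/smile.png" border="0" alt="" title="Smile"
    class="inlineimg">
   </blockquote>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def comment_texts(soup):
    """Return the full text of every comment, one string per comment."""
    results = []
    for div in soup.find_all("div", class_="wwCommentBody"):
        bq = div.find("blockquote", class_="postcontent")
        if bq is None:
            # Skip comments that have no blockquote body
            continue
        # Strip each text fragment and join the non-empty ones with newlines
        results.append(bq.get_text("\n", strip=True))
    return results


for text in comment_texts(soup):
    print(text)
```

Matching on `class_="postcontent"` alone is a design choice here: it avoids depending on the exact order or spacing of the class attribute.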
Ram
  • Thank you so much! It was so easy to do. I have one last little question, please: if I want to get only a specific part of the text from the comment and print it, how can I do that? – Marvel Jun 28 '21 at 15:12
  • Like what? Could you provide an example? – Ram Jun 28 '21 at 15:40
  • For example, if a post contains `( hello my name is Ram from StackOverflow )`, I want only the part from Ram to StackOverflow, so printing it would give `( Ram from StackOverflow )` – Marvel Jun 28 '21 at 15:44
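
One possible way to approach the follow-up, sketched under the assumption that the wanted part always starts at a known keyword (here `Ram`) and runs to the end of the comment. The helper name `text_from_keyword` is hypothetical and uses plain string methods only; a different rule for choosing the part would need different logic.

```python
def text_from_keyword(text, keyword):
    """Return the part of text starting at the first occurrence of keyword.

    Returns None when the keyword does not appear in the text.
    """
    index = text.find(keyword)
    if index == -1:
        return None
    return text[index:]


comment = "hello my name is Ram from StackOverflow"
print(text_from_keyword(comment, "Ram"))
```

The same function could be applied to each string returned by `bq.text` inside the loop from the answer.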