
I fetch a web page with urllib, and when I print the contents I see the full content at its real length. But after I parse the HTML with bs4, at least 5 div blocks are missing from the parsed output. When I parse the same HTML with the older BeautifulSoup package (version 3), the full content is there and the divs are included. I don't know where the mistake is; all I can see is that bs4 drops some of the divs I need. How can I solve this? Here is my sample:

# This one does not remove any necessary parts. This is okay.

import urllib
from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup(urllib.urlopen("http://example").read())


# But this one removes some necessary parts. This is not okay.

import urllib
from bs4 import BeautifulSoup
soup = BeautifulSoup(urllib.urlopen("http://example").read())
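
One way to narrow this down, assuming the difference comes from which underlying parser bs4 uses (an assumption, not something confirmed for this page), is to parse the same markup with each parser bs4 supports and compare how many divs each one keeps. The helper name `count_divs_per_parser` and the sample markup are hypothetical, and the sketch is written for Python 3 with bs4 installed; `lxml` and `html5lib` are optional and are skipped when missing.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def count_divs_per_parser(markup, parsers=("html.parser", "lxml", "html5lib")):
    """Return {parser name: number of <div> tags found in markup}.

    Parsers that are not installed are skipped rather than raising.
    """
    counts = {}
    for name in parsers:
        try:
            # Naming the parser explicitly avoids relying on bs4's default choice.
            soup = BeautifulSoup(markup, name)
        except FeatureNotFound:
            # This parser library is not available in the current environment.
            continue
        counts[name] = len(soup.find_all("div"))
    return counts


# Hypothetical malformed sample; in practice, pass the page source fetched earlier.
sample = "<div><p>one<div>two</div></p></div><div>three"
print(count_divs_per_parser(sample))
```

If the counts differ between parsers, passing the name of the parser that keeps the divs as the second argument to `BeautifulSoup` would be the next thing to try.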

Thank you.

user2682790

0 Answers