To extract all visible numbers from an HTML document, first parse it with BeautifulSoup and pull out its text nodes, then extract the numbers from that text:
from bs4 import BeautifulSoup
from urllib.request import urlopen
import re
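# Let's use the Stack Overflow homepage as an example
r = urlopen('https://stackoverflow.com')
# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(r, 'html.parser')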
# We don't want the content of script-related elements,
# so remove them first.
for script in soup(['script', 'noscript']):
    script.extract()
# And now extract the numbers using regular expressions from
# all text nodes we can find in the (remaining) document.
numbers = [n for t in soup(text=True) for n in re.findall(r'\d+', t)]
numbers will then contain every run of digits found in the remaining text of the document, as a list of strings. If you want to restrict the search to only certain elements, you can change the soup(text=True) part.
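As a rough sketch of that last point, the snippet below restricts the search to a single element. The sample markup, the `stats` id, and the helper name `numbers_in` are made up for illustration, and an inline string stands in for a downloaded page. It uses `find_all(string=True)`, which is the newer spelling of `text=True` in Beautiful Soup 4.4.0 and later.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup, used here instead of a live page
html = """
<html><body>
  <div id="stats"><span>42 questions</span> <span>7 answers</span></div>
  <div id="footer">Copyright 2015</div>
</body></html>
"""

soup = BeautifulSoup(html, 'html.parser')


def numbers_in(element):
    """Return every run of digits in the text nodes below `element`."""
    return [n
            for t in element.find_all(string=True)
            for n in re.findall(r'\d+', t)]


# Search only inside the (hypothetical) element with id="stats"
stats = soup.find('div', id='stats')
stats_numbers = numbers_in(stats)
print(stats_numbers)  # ['42', '7']
```

The results are strings; wrap each one in `int()` if you need actual numbers.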