I recently started learning Python. Now I want to extract the numbers from a web page and sum them up.
Here is my code:
# read data -> extract numbers -> compute sum
import urllib.request, urllib.parse
from bs4 import BeautifulSoup
html = urllib.request.urlopen('http://py4e-data.dr-chuck.net/comments_42.html')
file = BeautifulSoup(html, 'html.parser')
tags = file('span')
calcs = 0
for tag in tags:
    tag.decode()
    calcs += int(tag.string)
print(calcs)
In the line calcs += int(tag.string) I wasn't sure what to do, and somewhere on the internet I found .string, which helped me get the numbers out of the tags. However, I'm not really sure why this works or what .string does, and I couldn't find any documentation about it myself. If I change .string to .int, I get None.
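To make the question concrete, here is a minimal sketch of what I see, using a small made-up HTML snippet instead of the real page (sample_html and its contents are invented for illustration, and the real page may be structured differently). As far as I can tell, .string returns the text inside a tag that contains only text, while an unknown attribute such as .int seems to be treated as a search for a child tag named int, which would explain the None.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for the real page (an assumption for illustration).
sample_html = """
<table>
  <tr><td>Alice</td><td><span class="comments">97</span></td></tr>
  <tr><td>Bob</td><td><span class="comments">3</span></td></tr>
</table>
"""

soup = BeautifulSoup(sample_html, 'html.parser')

# Calling the soup object like a function is shorthand for find_all().
spans = soup('span')
first = spans[0]

# .string is the text node inside the tag; int() accepts it like a normal str.
print(repr(first.string))

# .int is not a conversion: it looks for a child tag called <int> and finds none.
print(first.int)

# Summing the numbers from every span.
total = sum(int(span.string) for span in spans)
print(total)
```

With this snippet the sum comes out as 100, and first.int is None because the span has no child tag named int.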
I hope someone can explain what .string is for.
Thank you in advance.