I am currently trying to scrape my internet provider's data usage. I looked for an API, but they don't have one, so I am resorting to scraping the HTML, which looks like this:
</tr><tr class="top-border"><td>17 Monday</td><td class='text-right'><span class='mb'>2,991.69 MB</span><span class='gb'>2.92 GB</span></td></td><td class='text-right'><span class='mb'>1,232.04 MB</span><span class='gb'>1.20 GB</span></td></td><td class='text-right'><span class='mb'>4,223.73 MB</span><span class='gb'>4.12 GB</span></td> <td>
<div class="progress"><div class="bar bar-success" style="width: 51%;"></div></div> </td>
</tr><tr><td>18 Tuesday</td><td class='text-right'><span class='mb'>3,589.42 MB</span><span class='gb'>3.51 GB</span></td></td><td class='text-right'><span class='mb'>1,199.58 MB</span><span class='gb'>1.17 GB</span></td></td><td class='text-right'><span class='mb'>4,789.00 MB</span><span class='gb'>4.68 GB</span></td> <td>
<div class="progress"><div class="bar bar-success" style="width: 57%;"></div></div> </td>
etc.
I tried to use Python's re.search, but I can only get part of the information out of it, e.g.:
import re

# raw_info holds the page's HTML as a string
search = re.search("class='gb'>(.*) GB</span>", raw_info)
for i in range(0, 100):
    try:
        print(search.group(i))
    except IndexError:  # no more groups to print
        break
output:
class='gb'>6.88 GB</span></td></td><td class='text-right'><span class='mb'>
1,295.90 MB</span><span class='gb'>1.27 GB</span></td></td><td class='
text-right'><span class='mb'>8,340.12 MB</span><span class='gb'>8.14 G
B</span>
6.88 GB</span></td></td><td class='text-right'><span class='mb'>1,295.90&nb
sp;MB</span><span class='gb'>1.27 GB</span></td></td><td class='text-right'
><span class='mb'>8,340.12 MB</span><span class='gb'>8.14
I learned that I can't use groups like that to print out all of the numbers.
TL;DR: I need to extract every GB value and print them one table row per line, like this:
2.92,1.20,4.12
3.51,1.17,4.68
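
For reference, here is a minimal sketch of one way to get that output, not necessarily the best one. It assumes the page really has the shape shown above: one `<tr>` per day and the number directly after `class='gb'>`. The `.*` in my attempt is greedy, so it runs to the last ` GB</span>` on the page; the sketch captures only digits, dots and commas instead, and uses re.findall to collect every match. The names `raw_info` (here a shortened, hand-made sample), `GB_PATTERN` and `gb_rows` are made up for illustration, and the pattern allows for `&nbsp;` or whitespace before `GB` since the raw page seems to use both.

```python
import re

# Shortened sample modelled on the rows above (assumed structure, not the real page).
raw_info = (
    "<tr class=\"top-border\"><td>17 Monday</td>"
    "<td class='text-right'><span class='mb'>2,991.69 MB</span><span class='gb'>2.92 GB</span></td>"
    "<td class='text-right'><span class='mb'>1,232.04 MB</span><span class='gb'>1.20 GB</span></td>"
    "<td class='text-right'><span class='mb'>4,223.73 MB</span><span class='gb'>4.12 GB</span></td></tr>"
    "<tr><td>18 Tuesday</td>"
    "<td class='text-right'><span class='mb'>3,589.42 MB</span><span class='gb'>3.51 GB</span></td>"
    "<td class='text-right'><span class='mb'>1,199.58 MB</span><span class='gb'>1.17 GB</span></td>"
    "<td class='text-right'><span class='mb'>4,789.00 MB</span><span class='gb'>4.68 GB</span></td></tr>"
)

# Capture only the number; [\d.,]+ cannot run past the closing tag the way .* does.
# (?:\s|&nbsp;)* accepts a plain space, a non-breaking space, or the &nbsp; entity.
GB_PATTERN = re.compile(r"class='gb'>([\d.,]+)(?:\s|&nbsp;)*GB</span>")


def gb_rows(html):
    """Return one list of GB strings per table row that contains any."""
    rows = []
    # Split at each opening <tr ...> tag so values stay grouped by day.
    for chunk in re.split(r"<tr\b", html):
        values = GB_PATTERN.findall(chunk)
        if values:  # skip chunks without GB spans (e.g. text before the first row)
            rows.append(values)
    return rows


for row in gb_rows(raw_info):
    print(",".join(row))
```

On the sample above this prints 2.92,1.20,4.12 and then 3.51,1.17,4.68. If the markup turns out to be less regular than the snippet suggests, an HTML parser would probably be more robust than a regex.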