I am very new to programming, so this may be a silly question. I wanted to learn to scrape web pages, so I learned BeautifulSoup. It worked for a few sites, but I got stuck on the following page:
from bs4 import BeautifulSoup
import requests
r = requests.get("http://www.dlb.today/result/en")
data = r.text
soup = BeautifulSoup(data, "lxml")
data = soup.find_all("tbody", {"id": "pageData1"})
data2 = soup.find_all("ul", {"class": "res_allnumber"})
print(data)
print(data2)
# No point going further if I can't get the raw data, I think.
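A quick way to narrow this down, as a minimal sketch: count how often the markers from the selectors appear in the raw response text. If a marker is missing, or the table body is there but empty, that would suggest the rows are filled in by JavaScript after the page loads, which requests does not execute. The helper name `count_markers` and the sample HTML are made up for illustration; in practice `r.text` would be passed in instead.

```python
def count_markers(html, markers):
    """Count how many times each marker string occurs in the raw HTML text."""
    return dict((marker, html.count(marker)) for marker in markers)

# Made-up stand-in for r.text: the table body exists but has no rows yet.
sample_html = '<table><tbody id="pageData1"></tbody></table>'

report = count_markers(sample_html, ["pageData1", "res_allnumber"])
print(report)
```

A count of zero for a marker means no selector based on it can match, whichever parser is used.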
This worked fine on a similar site I scraped:
r2 = requests.get("http://www.nlb.lk/results-more.php?id=1")
data2 = r2.text
soup2 = BeautifulSoup(data2, "lxml")
news2 = soup2.find_all("a", {"class": "lottery-numbers"})
# print(news2)  # raw HTML, for checking
for draw_number in news2:
    print(draw_number.contents[0])
I couldn't scrape the table I wanted, so I tried lxml instead. Still no luck:
# lxml attempt
import requests
import lxml.html as LH
r = requests.get("http://www.dlb.today/result/en")
content = r.text
# print(content)
root = LH.fromstring(content)
for tag1 in root.xpath('//tbody[@class="pageData1"]//li'):
    print(tag1.text_content())
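One detail worth checking: the BeautifulSoup attempt matches `pageData1` as an `id`, while the XPath above matches it as a `class`. The sketch below uses the standard library's ElementTree on made-up, well-formed markup (not the real page) to show that the two predicates select different things. The helper name `li_texts` and the sample values are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Made-up markup for illustration; the real page may be structured differently.
SAMPLE = (
    '<table><tbody id="pageData1"><tr><td>'
    '<ul class="res_allnumber"><li>07</li><li>21</li></ul>'
    '</td></tr></tbody></table>'
)

def li_texts(markup, attribute):
    """Collect <li> text under every <tbody> whose given attribute is 'pageData1'."""
    root = ET.fromstring(markup)
    texts = []
    for tbody in root.findall(".//tbody[@%s='pageData1']" % attribute):
        for li in tbody.iter("li"):
            texts.append(li.text)
    return texts

by_id = li_texts(SAMPLE, "id")        # matches the tbody in SAMPLE
by_class = li_texts(SAMPLE, "class")  # no tbody in SAMPLE has this class
print(by_id)
print(by_class)
```

Even with a matching predicate, the loop finds nothing if the downloaded HTML does not contain the rows in the first place.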
I don't know where my error is or what to do next. If anyone can point me in the right direction, I would appreciate it!