I am trying to use BeautifulSoup to scrape the first and second <b> tags (-130 and +110) inside a single HTML div.

However, I cannot figure out how to scrape the second tag; I can only get the first. Thank you.

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

day = "09"
month = "10"
year = "2017"
my_url = 'https://www.sportsbookreview.com/betting-odds/mlb-baseball/?date=' + year + month + day

# Opening up the connection and grabbing the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

# html parser
page_soup = soup(page_html, "html.parser")

allBovadaOdds = page_soup.find_all("div", {"rel": "999996"})

firstOdds = allBovadaOdds[1].b.string
print(firstOdds)
moghya

2 Answers

What you want can be written fairly simply, I think.

>>> import bs4
>>> import requests
>>> page = requests.get('https://www.sportsbookreview.com/betting-odds/mlb-baseball/?date=20171009').text
>>> soup = bs4.BeautifulSoup(page, 'lxml')
>>> soup.select('#eventLine-3330496-43 b')
[<b>-130</b>, <b>+110</b>]
>>> for item in soup.select('#eventLine-3330496-43 b'):
...     item.text
...     
'-130'
'+110'

However, I notice two potential problems:

  • The labelling of the elements (i.e., the ids of the divs, etc.) might vary from one load of the web page to the next.
  • There are actually two columns with this pair of values. It might be safer to identify the required items by booking agent and number, for instance.
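
As a rough sketch of the second point, the odds can be looked up by the div's rel attribute (the value 999996 comes from the question's code) instead of by a generated id. The sample markup, the second book's rel value, and the helper name odds_for_book are all made up for illustration; the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modelled on the page described in the question.
# Only the rel value 999996, the id, and the -130/+110 pair come from the thread.
SAMPLE_HTML = """
<div class="book" rel="999996" id="eventLine-3330496-43">
  <div><b>-130</b></div>
  <div><b>+110</b></div>
</div>
<div class="book" rel="238" id="eventLine-3330496-238">
  <div><b>-125</b></div>
  <div><b>+105</b></div>
</div>
"""


def odds_for_book(html, book_rel):
    """Return one list of odds strings per div whose rel attribute equals book_rel."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for div in soup.find_all("div", attrs={"rel": book_rel}):
        # find_all("b") returns every <b> under the div, not just the first one
        results.append([b.get_text(strip=True) for b in div.find_all("b")])
    return results


print(odds_for_book(SAMPLE_HTML, "999996"))  # [['-130', '+110']]
```

Grouping the values per div keeps each pair together, so a page with two columns of the same book yields two separate lists rather than one flat run of numbers.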
Bill Bell

You can use page_soup.select() to filter the <b> tags, then loop with for i in range() to pick out every second tag. Note that the range should start at index 1 and use a step of 2.

# html parser
page_soup = soup(page_html, "html.parser")
allBovadaOdds = page_soup.select('div[rel="999996"] b')
print(allBovadaOdds)
for i in range(1,len(allBovadaOdds),2):
    SecondOdds = allBovadaOdds[i].string
    print(SecondOdds)
Kyle Hong