Web scraping with Python newbie

Question

I'm just learning Python and web scraping. I'm trying to scrape sectional times from attheraces. I can get the data into a spreadsheet, but it all ends up in a single vertical column, and I want a horizontal table (as displayed on the website). So far I have this:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = "http://www.attheraces.com/ajax/getContent.aspx?ctype=sectionalsracecardresult&raceid=1062194&page=/racecard/Windsor/8-October-2018/1325&dtype=times"

uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")

containers = page_soup.findAll("div", {"class": "card-body__td card-body__td--centred card-cell__time card-cell__time--8-sectionals"})

filename = "sectionals.csv"
f = open(filename, "w")

headers = "sectional\n"

f.write(headers)

for container in containers:
    sectional = container.div.div.span.text

    print(sectional)

    f.write(sectional + "," + "\n")

f.close()

score 0 · Answer 1 · answered Oct 11 '18 at 05:01

If you go directly to the cells, you'll have to make assumptions about the rows. Start with the rows:

containers = page_soup.findAll("div", {"class":"card-cell card-cell--primary card-cell--primary--no-only"})

# Open a file handle here and use it to create a csv writer (I like to use DictWriter).

for container in containers:
    row = []

    for cell in container.findAll("div", {"class":"card-body__td card-body__td--centred card-cell__time card-cell__time--8-sectionals"}):
        sectional = cell.div.div.span.text
        row.append(sectional)

    # Write a row to your csv writer here.
    print(row)

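Look into using Python's csv module to avoid common issues such as quoting and escaping. Also, the with statement is a great way to make sure your resource management is correct: open the file with it (with open('...', 'w', newline='') as f:) and create the csv writer from that file handle inside the block, so the file is closed for you even if an error occurs.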
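As a rough sketch of how those two pieces could fit together: assume the loop above appends each `row` to a list instead of printing it. The function name `write_sectionals`, the list name `rows`, and the sample times are all made up for illustration and are not from the site. I use `csv.writer` here rather than `DictWriter` because no header names are known.

```python
import csv


def write_sectionals(rows, filename="sectionals.csv"):
    """Write one CSV line per runner, one column per sectional time."""
    # newline="" lets the csv module control line endings itself.
    with open(filename, "w", newline="") as f:
        writer = csv.writer(f)
        # Each inner list becomes one horizontal row in the spreadsheet.
        writer.writerows(rows)


# Placeholder values standing in for the scraped text (not real data).
example_rows = [
    ["13.10", "12.45", "12.02"],
    ["13.25", "12.50", "12.11"],
]
write_sectionals(example_rows)
```

In the scraping loop you would do `rows.append(row)` at the point where the row is printed, then call `write_sectionals(rows)` once after the loop finishes.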
