I am new to Python and need some guidance on extracting values from specific cells of an HTML table.

The URL that I am working on can be found here

I want to get only the first 5 values in the Month and Settlement columns and then display them as:

"MAR 14:426'6"

The problems I am facing are:

  1. How do I get the loop to start from the 3rd "TR" in the table?
  2. How do I get only the values for td[0] and td[6]?
  3. How do I restrict the loop to retrieve values for only 5 rows?

This is the code that I am working on:

tableData = soup1.find("table", id="DailySettlementTable")
for rows in tableData.findAll('tr'):
    month = rows.find('td')
    print month

Thank you and appreciate any form of guidance!

Joel
Lawren
  • It may be cleaner and easier to extract *all* data into a list of lists, and *then* get the fields you want. It's not as performant, of course, but you probably don't need to worry about that just yet, especially since you're new to python – loopbackbee Dec 18 '13 at 17:19
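
A minimal sketch of the approach suggested in this comment, assuming BeautifulSoup 4 (`bs4`) is installed. The inline HTML is a made-up stand-in for the real page: only the `MAR 14` / `426'6` pair comes from the question, the other values are placeholders, and the helper names `table_to_rows` and `first_settlements` are hypothetical.

```python
from bs4 import BeautifulSoup

# Placeholder data: only the first month/settlement pair is from the question.
MONTHS = ["MAR 14", "MAY 14", "JUL 14", "SEP 14", "DEC 14", "MAR 15"]
SETTLES = ["426'6", "x2", "x3", "x4", "x5", "x6"]

# Each data row has seven cells: Month, five filler cells, Settlement.
data_rows = "".join(
    "<tr><td>{}</td>{}<td>{}</td></tr>".format(m, "<td>-</td>" * 5, s)
    for m, s in zip(MONTHS, SETTLES)
)
SAMPLE_HTML = (
    '<table id="DailySettlementTable">'
    "<tr><th>Daily Settlements</th></tr>"
    "<tr><th>Month</th><th>Settle</th></tr>"
    + data_rows
    + "</table>"
)


def table_to_rows(html, table_id):
    """Return every row of the table as a list of cell strings."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    return [
        [cell.get_text(strip=True) for cell in tr.find_all("td")]
        for tr in table.find_all("tr")
    ]


def first_settlements(rows, count=5, month_col=0, settle_col=6):
    """Format the first `count` data rows as 'month:settlement' strings."""
    # Header rows hold no <td> cells in this sample, so they come back as
    # empty lists and are dropped by the length check.
    data = [r for r in rows if len(r) > settle_col]
    return ["{}:{}".format(r[month_col], r[settle_col]) for r in data[:count]]


rows = table_to_rows(SAMPLE_HTML, "DailySettlementTable")
for line in first_settlements(rows):
    print(line)
```

Filtering on row length rather than a fixed offset is a design choice of this sketch: it does not depend on the table having exactly two header rows, though it does assume header rows use `th` rather than `td`.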

1 Answer

You probably want to use slicing.

Here's a modified snippet for your code:

table = soup.find('table', id='DailySettlementTable')

# The slice notation below, [2:7], takes the third row (index 2) through
# the seventh row (index 6). The stop index 7 is exclusive, so we get 5 rows.
for rows in table.find_all('tr')[2:7]:
    cells = rows.find_all('td')
    month = cells[0]
    settle = cells[6]

    # Parenthesized print works in both Python 2 and Python 3.
    print(month.string + ':' + settle.string)
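
To check what the slice selects without fetching the page, here is a small illustration on a plain list. The row labels are placeholders rather than data from the site; the same indexing applies to the list that `find_all` returns.

```python
# Placeholder labels standing in for the <tr> elements of the table:
# two header rows followed by eight data rows.
rows = ["header 1", "header 2"] + ["data {}".format(i) for i in range(1, 9)]

# Start at index 2 (the third row) and stop before index 7,
# which yields exactly five rows: indices 2, 3, 4, 5 and 6.
selected = rows[2:7]
print(selected)

# Equivalent spelling that makes the "skip 2, take 5" intent explicit.
skip, take = 2, 5
assert rows[skip:skip + take] == selected
```

If the table has fewer than seven rows, the slice simply returns whatever rows exist instead of raising an error.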
ChrisGPT was on strike