
Hi, we are running this code and it is driving me crazy.

  • we capture a data table in table: this works
  • then we grab every th and its text in sizes: this works
  • then we want to grab all the underlying rows (tr) and loop over the columns in each row: this does not work! The color_rows object is always empty, yet the same XPath does work when tested in the browser. Why?

My question is: how can I grab the tbody/tr rows?

Expected flow

  1. Loop over the TRs

  2. For each TR, one by one, get the 1st TD

  3. Get the data of all TDs that contain an input with class form-control

     table = response.xpath('//div[@class="content"]//table[contains(@class,"table")]')
     sizes = table.xpath('./thead//th/text()').getall()[1:]  # works!
     color_rows = table.xpath('./tbody/tr')  # does not work! object empty
     for color_row in color_rows:
         # relative paths (./) so each query stays inside the current row
         color = color_row.xpath('./td[1]/b/text()').get().strip()
         print(color)
         stocks = color_row.xpath('./td/div[input[@class="form-control"]]/div//text()').getall()
         for size, stock in zip(sizes, stocks):
             print(size, stock)
    

Our HTML data looks like this:

<table class="table">
    <thead>
        <tr>
            <th id="ctl00_cphCEShop_colColore" class="text-left" colspan="2">Colore</th>
            <th>S</th>
            <th>M</th>
            <th>L</th>
        </tr>
    </thead>
    
    <tbody>
        <tr>
            <td id="x">
                <b>White</b>
                <input type="hidden" name="data" value="3230/201">
            </td>
            <td id="avail">
                Avail:
            </td>
            <td id="1">
                <div>
                    <input name="cell" type="text" class="form-control">
                    <div class="text-center">179</div>
                </div>
            </td>
            <td id="2">
                <div>
                    <input name="cell" type="text" class="form-control">
                    <div class="text-center">360</div>
                </div>
            </td>
etc etc
snh_nl

1 Answer


Apparently tbody tags are often omitted in the HTML source but added by the browser.

In this case the raw HTML had no (real) tbody tag, so the XPath matched nothing.

Hence the trouble with an XPath copied from the browser: it assumes a tbody tag that the server never sent.

Why do browsers insert tbody element into table elements?
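
To make this concrete, here is a minimal sketch of the effect and one possible workaround. It uses Python's standard-library xml.etree.ElementTree as a stand-in for Scrapy selectors so that it runs without a live response, and RAW_HTML is a hypothetical, trimmed-down version of the table as the server might send it (no tbody, inputs self-closed so it parses as XML). The idea is to try the tbody path first and fall back to rows sitting directly under the table. In Scrapy the same idea could be written as a single XPath union, table.xpath('./tbody/tr | ./tr').

```python
import xml.etree.ElementTree as ET

# Hypothetical raw markup as the server might send it: note there is no <tbody>.
RAW_HTML = """
<table class="table">
  <thead>
    <tr><th colspan="2">Colore</th><th>S</th><th>M</th></tr>
  </thead>
  <tr>
    <td><b>White</b></td>
    <td>Avail:</td>
    <td><div><input class="form-control" /><div class="text-center">179</div></div></td>
    <td><div><input class="form-control" /><div class="text-center">360</div></div></td>
  </tr>
</table>
"""

table = ET.fromstring(RAW_HTML)

# Header cells, skipping the first ("Colore") column as in the question.
sizes = [th.text for th in table.findall("./thead/tr/th")][1:]

# The path copied from the browser finds nothing, because <tbody> is absent.
rows_via_tbody = table.findall("./tbody/tr")

# Fall back to rows that are direct children of <table>.
color_rows = rows_via_tbody or table.findall("./tr")

stock_by_color = {}
for row in color_rows:
    # Paths are relative to the current row.
    color = row.find("./td[1]/b").text.strip()
    # Keep only the inner <div> of wrappers that also hold an <input>.
    stocks = [div.text.strip() for div in row.findall("./td/div[input]/div")]
    stock_by_color[color] = dict(zip(sizes, stocks))

print(stock_by_color)
```

A quick way to confirm the cause in a real spider is to look at the raw response text (or "view source" in the browser) rather than the element inspector, since the inspector shows the DOM after the browser has inserted tbody.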

snh_nl