1

I know this is really easy for some of you out there. But I have been going deep on the internet and I can not find an answer. I need to get the company name that is inside the

tbody tr td a eBay-tradera.com

and

td class="bS aR" 970,80

/td /tr /tbody

<tbody id="matrix1_group0">
<tr class="oR" onmouseover="onMouseOver(this, false)" onmouseout="onMouseOut(this, false)" onclick="onClick(this, false)">
<td class="bS">&nbsp;</td>
<td>
<a href="aProgramInfoApplyRead.action?programId=175&amp;affiliateId=2014848" title="http://www.tradera.com/" target="_blank">

   eBay-Tradera.com

</a>
</td>
<td class="aR">
<a href="/pan/aProgramInfoApplyRead.action?programId=175&amp;affiliateId=2014848" title="/pan/aProgramInfoApplyRead.action?programId=175&amp;affiliateId=2014848">175</a></td>
<td class="bS aR">0</td><td class="bS aR">0</td><td class="bS aR">187</td>
<td class="aR">0,00%</td><td class="bS aR">124</td>
<td class="aR">0,00%</td>
<td class="bS aR">26</td>
<td class="aR">20,97%</td>
<td class="bS aR">32</td>
<td class="aR">60,80</td>
<td class="aR">25,81%</td>
<td class="bS aR">5&nbsp;102,00</td>
<td class="bS aR">0,00</td>
<td class="aR">0,00</td>
<td class="bS aR">

   970,80

</td>
</tr>
</tbody>

This is my code, where I only try to get the a tag to start of with but I cant get that to work either

Set TDelements = document.getElementById("matrix1_group0").document.getElementsbytagname("a").innerHTML
r = 0
C = 0

For Each TDelement In TDelements
Blad1.Range("A1").Offset(r, C).Value = TDelement.innerText
r = r + 1
Next

Thanks on beforehand I know that this might be to simple. But I hope that other people might have the same issue and this will be helpful for them as well. The reason for the "r = r + 1" is because there are many more companies on this list. I just wanted to make it as easy as I could. Thanks again!

Dijkgraaf
  • 11,049
  • 17
  • 42
  • 54
  • 1
    I imagine you're not giving us the link because it's a password-protected section, but could you please snapshot the table how it looks like on the web? – Matteo NNZ Nov 24 '14 at 22:12
  • The `document` in `document.getElementById("matrix1_group0")...` needs its parent like `ie.document.getElementById("matrix1_group0")...`. –  Nov 24 '14 at 23:23

2 Answers2

2

You will need to specify the element location in the table. Ebay seems to be obfuscating the class-names so we cannot rely on those being consistent. Nor would I usually rely on the elements by their table index being consistent but I don't see any way around this.

I am assuming that this is the HTML document you are searching

<tbody id="matrix1_group0">
  <tr class="oR" onmouseover="onMouseOver(this, false)" onmouseout="onMouseOut(this, false)" onclick="onClick(this, false)">
    <td class="bS">&nbsp;</td>
    <td>
      <a href="aProgramInfoApplyRead.action?programId=175&amp;affiliateId=2014848" title="http://www.tradera.com/" target="_blank">
        eBay-Tradera.com  <!-- <=== You want this? -->
      </a>
    </td>
    <!-- ... -->
  </tr>
  <!-- ... -->
</tbody>

We can ignore the rest of the document as the table element has an ID. In short, we assume that

.getElementById("matrix1_group0").getElementsByTagName("TR")

will return a collection of html row objects sorted by their appearance.

Set matrix = document.getElementById("matrix1_group0")
Set firstRow = matrix.getElementsByTagName("TR")(1)
Set firstRowSecondCell = firstRow.getElementsByTagName("TD")(2)
traderaName = firstRowSecondCell.innerText

Of course you could inline this all as

document.getElementById("matrix1_group0").getElementsByTagName("TR")(1).getElementsByTagName("TD")(2).innerText

but that would make debugging harder. Also if the web-page is ever presented to you in a different format then this won't work. Ebay is deliberately making it hard for you to scrape data off of it for security.

cheezsteak
  • 2,731
  • 4
  • 26
  • 41
  • 2
    Well, after reading Cheezsteak's answer, I just remove mine and suggest you to follow his approach. It's much more flexible and well explained that what I did before, better focus the attention on the best answer rather than having two similar ones. +1 – Matteo NNZ Nov 24 '14 at 22:32
  • I get the "Run time error 424, object required" Is there something that I have missed to tell you? – Niklas Andersson Nov 25 '14 at 21:11
  • @NiklasAndersson First confirm that the document hasn't changed it's structure. Otherwise see this [answer](http://stackoverflow.com/questions/25749333/internet-explorer-fill-input-box-with-text-mask-using-vba/25753120#25753120) about waiting for the elements to load before attempting to access them. – cheezsteak Nov 25 '14 at 21:48
  • I am doing the Next Do While ie.busy: DoEvents: Loop While ie.ReadyState <> 4 DoEvents Wend I have picked up data at the page. Just not the correct data. It still get stuck at "Set matrix =" – Niklas Andersson Nov 30 '14 at 14:45
1

With only the HTML you have shown you can use CSS selectors to obtain these:

a[href*='aProgramInfoApplyRead.action?programId']

Which says a tag with attribute href that contains the string 'aProgramInfoApplyRead.action?programId'. This matches two elements but the first is the one you want.

CSS Selector:

CSS selector


VBA:

You can use .querySelector method of .document to retrieve the first match

Debug.Print ie.document.querySelector("a[href*='aProgramInfoApplyRead.action?programId']").innerText
QHarr
  • 83,427
  • 12
  • 54
  • 101