
I have this code in HTML:

<table cellspacing = "0" cellpadding = "0" width = "100%" border="0">
<td class="TOlinha2"><span id="Co">140200586125</span>

I already have a VBA function that opens a web site, logs in and navigates to the right page. Now I'm trying to read the td tags inside a table on that page. The value I want here is 140200586125, but there are many td tags like this one, so I intend to use a For loop to read them and put their values in a worksheet.

I have tried both:

.document.getElementByClass()

and:

.document.getElementyById()

but neither worked.

Appreciate the help. I'm from Brazil, so sorry about any English mistakes.

Bond
Alexandre Gentil
  • Can you provide the URL so we can test our solutions? – Bond Aug 20 '15 at 20:06
  • Sorry, this is a private web site from my work and I'm at university now. The only thing I can do is post a bigger part of the code tomorrow, if it helps. – Alexandre Gentil Aug 20 '15 at 20:15
  • If you tried `.document.getElementById` (I'm presuming the extra y in getElementy<- is a typo), did you also try `.document.getElementById("Co").InnerText`? It's tough to help when we can't see your actual code and the page you are trying to scrape. – Tim Aug 20 '15 at 20:17
  • I haven't tried that, but I will. If I don't get an answer before tomorrow I will post all the code. I'm sorry, but right now I don't have it. Thank you for your help. – Alexandre Gentil Aug 20 '15 at 20:21

2 Answers


Since you mentioned you need to retrieve multiple <td> tags, it would make more sense to retrieve the entire collection rather than using getElementById() to get them one at a time.

Based on your HTML above, this would match all <span> nodes that are direct children of a <td> with class='TOlinha2':

Dim node, nodeList
Set nodeList = ie.document.querySelectorAll("td.TOlinha2 > span")

For Each node In nodeList
    MsgBox node.innerText     ' This should return the text within the <span>
Next
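
To put the values in a worksheet instead of message boxes, the same selector can feed a loop that writes to cells. This is only a minimal sketch under some assumptions: `ie` is your existing, fully loaded InternetExplorer object, the procedure name `WriteSpansToSheet` is made up for illustration, and column A of the active sheet is an arbitrary target.

```vba
' Hypothetical helper: ie is assumed to be your existing, fully loaded
' InternetExplorer object. Output goes to column A of the active sheet.
Sub WriteSpansToSheet(ie As Object)
    Dim nodeList As Object
    Dim i As Long

    Set nodeList = ie.document.querySelectorAll("td.TOlinha2 > span")

    ' Store as text so long digit strings are kept exactly as they appear
    ' on the page instead of being converted to numbers
    ActiveSheet.Columns(1).NumberFormat = "@"

    ' Index-based loop: item 0 goes to row 1, item 1 to row 2, and so on
    For i = 0 To nodeList.Length - 1
        ActiveSheet.Cells(i + 1, 1).Value = nodeList.Item(i).innerText
    Next i
End Sub
```

You would call it as `WriteSpansToSheet ie` once the page has finished loading. If no element matches, `nodeList.Length` is 0 and the loop body never runs.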
Bond

There is not enough HTML to determine whether TOlinha2 is a consistent class name for all the tds within the table of interest, or whether it is used only in this table. If both are true, then you can indeed use .querySelectorAll.

You could use the CSS selector:

ie.document.querySelectorAll(".TOlinha2")

Here "." is the CSS class selector, so this matches every element with the class name TOlinha2.

You cannot iterate over the returned NodeList with a For Each loop: Excel will crash and you will lose any unsaved data. See my question "Excel crashes when attempting to inspect DispStaticNodeList".

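You have to loop over the length of the nodeList, e.g.:

Dim nodeList As Object
Dim i As Long
Set nodeList = ie.document.querySelectorAll(".TOlinha2")
' The list is zero-based, so the last index is Length - 1
For i = 0 To nodeList.Length - 1
    Debug.Print nodeList(i).innerText
Next i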

Sometimes you need a different syntax to access each item:

Debug.Print nodeList.Item(i).innerText 

You can narrow this CSS selector down further with more qualifying elements. For example, you can require that the element be inside a tbody (i.e. within a table), inside a tr (table row), and have the class name TOlinha2:

ie.document.querySelectorAll("tbody tr .TOlinha2")
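
One way to check whether the narrower selector changes anything on your page is to compare how many elements each selector matches. A rough sketch, assuming `ie` is the loaded InternetExplorer object; the procedure name `CompareSelectorCounts` is only illustrative:

```vba
' Hypothetical check: ie is assumed to be the loaded InternetExplorer object.
Sub CompareSelectorCounts(ie As Object)
    Dim broad As Object
    Dim narrow As Object

    Set broad = ie.document.querySelectorAll(".TOlinha2")
    Set narrow = ie.document.querySelectorAll("tbody tr .TOlinha2")

    ' Length is the number of matched elements; 0 means nothing matched
    Debug.Print "All .TOlinha2 elements: " & broad.Length
    Debug.Print "Inside tbody tr only: " & narrow.Length
End Sub
```

If both counts are equal, the class is probably only used inside table rows and the shorter selector is enough. If the first count is larger, the class also appears elsewhere on the page and the narrower selector is the safer choice.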
QHarr