1

I want to create a list on Excel of all the labels of input boxes on a webpage- so I imagine the code would be something like:

Sub IEInteract()

Dim i As Long
Dim URL As String
Dim IE As Object
Dim objCollection As Object

Set IE = CreateObject("InternetExplorer.Application")

IE.Visible = True

URL = "mywebsite.com"

IE.Navigate URL

Do While IE.ReadyState = 4: DoEvents: Loop
Do Until IE.ReadyState = 4: DoEvents: Loop

objCollection = IE.Document.getElementsByTagName("input")

For Each el In objCollection
label = el.label 'or something like that????'
Debug.Print label

Next el

End Sub

Where am I going wrong? Thanks

BTW My VBA is OK, but my HTML is non-existent.

AKA
  • 137
  • 1
  • 7
  • You said *so I imagine the code would be something like*: No, you can't imagine like so. Pick a webpage more or less identical to the one you are working with then describe what to do. This is too damn broad. – SIM May 06 '18 at 11:39
  • @SIM Thanks for lettting me know. I'm practising to be able to generalise, but here's a sample https://www.mrexcel.com/forum/register.php . Where would the differences lie? – AKA May 06 '18 at 11:45

1 Answers1

1
  1. For learning purposes maybe choose a website that has more obvious inputboxes, rather than dropdowns.
  2. Many inputboxes won't be pre-populated so maybe consider reading other properties of the retrieved elements. Or even writing to them and then retrieving those values.
  3. Selecting by tag name can bring back a host of items that you might not have expected.

Bearing all of the above in mind. Try running the following, which generates a collection of <input> tag elements.

Code:

Option Explicit

Public Sub PrintTagInfo()
    'Tools > references > Microsoft XML and HTML Object library
    Dim http As New XMLHTTP60                    '<== this will be specific to your excel version
    Dim html As New HTMLDocument

    With http
        .Open "GET", "https://www.mrexcel.com/forum/register.php", False
        .send
        html.body.innerHTML = .responseText
    End With

    Dim inputBoxes As MSHTML.IHTMLElementCollection, iBox As MSHTML.IHTMLElement, i As Long

    Set inputBoxes = html.getElementsByTagName("input") '<== the collection of input tags on the page
    '<== These are input boxes i.e. you are putting info into them so perhaps populate and then try to read what is in the entry box?
    For Each iBox In inputBoxes
        Debug.Print "Result #" & i + 1
        Debug.Print vbNewLine
        Debug.Print "ID: " & iBox.ID '<== select a sample of properties to print out as some maybe empty
        Debug.Print "ClassName: " & iBox.className,
        Debug.Print "Title: " & iBox.Title
        Debug.Print String$(20, Chr$(61))
        Debug.Print vbNewLine
        i = i + 1
    Next iBox
End Sub

Sample output:

Sample results

From the above, it looks like class name might be in some ways more informative if you are looking to target boxes to input information into.

An initial inspection of the page source, selecting an inputbox and right-click > inspect... will help you refine your choices.

I noticed that a lot of the boxes of interest had the Input tag and then type = "text"

Username box

This means you can target elements matching this pattern using CSS selectors. In this case using the selector input[type=""text""].

Adjusting the former code to factor this in gives a smaller set of more targeted results. Note, using .querySelectorAll, to apply the CSS selector, returns a NodeList object which requires a different method of iterating over. A For Each Loop will cause Excel to crash as described here.

Code:

Option Explicit
Public Sub PrintTagInfo()
    'Tools > references > Microsoft XML and HTML Object library
    Dim http As New XMLHTTP60                    '<== this will be specific to your excel version
    Dim html As New HTMLDocument

    With http
        .Open "GET", "https://www.mrexcel.com/forum/register.php", False
        .send
        html.body.innerHTML = .responseText
    End With

    Dim inputBoxes As Object, i As Long

    Set inputBoxes = html.querySelectorAll("input[type=""text""]") '<== the collection of text input boxes on page. Returned as a NodeList
    '<== These are input boxes i.e. you are putting info into them so perhaps populate and then try to read what is in the entry box?
    For i = 0 To inputBoxes.Length - 1
        Debug.Print "Result #" & i + 1
        Debug.Print vbNewLine
        Debug.Print "ID: " & inputBoxes.Item(i).ID '<== select a sample of properties to print out as some maybe empty
        Debug.Print "ClassName: " & inputBoxes.Item(i).className,
        Debug.Print "Title: " & inputBoxes.Item(i).Title
        Debug.Print String$(20, Chr$(61))
        Debug.Print vbNewLine
    Next i
End Sub

Sample results:

Note: I have edited the spacing to fit more into the image.

Refined results

References added via VBE > Tools > References

References

Last two are those of interest. The bottom one will be version specific and you will need to re-write XMLHTTP60 which is for XML 6.0 to target your version of Excel if not using Excel 2016.

QHarr
  • 83,427
  • 12
  • 54
  • 101
  • Very helpful! So "input" is a 'tag name'? Also, is there any difference between your XML and my: set IE = CreateObject("InternetExplorer.Application") etc? – AKA May 07 '18 at 08:04
  • XML is much faster and your are opening a browser. Input can mean different things but if you are targeting then yes, it is a type of tag. https://www.w3schools.com/tags/tag_input.asp – QHarr May 07 '18 at 08:07
  • Thank you again. I've been able to replicate that to print the innertext from all labels. Is there a 'label.for' property? I can't get it to work. – AKA May 07 '18 at 08:38
  • You can go here to find out the main info https://www.w3schools.com/tags/default.asp .Other than that, right click the webpage and inspect. Or use developer tools like F12 in chrome. Wealth of information in there. Allows you to inspect the xpath, css, tag......... – QHarr May 07 '18 at 08:41
  • Just found it- it's (htmlFor) rather than 'for' by itself. It was a VBA issue, not an HTML one. Really appreciate the help. – AKA May 07 '18 at 08:43
  • "XML is much faster and your are opening a browser" > meaning XML not opening browser; IE you are. Sorry for any confusion. – QHarr May 07 '18 at 08:45