
I want to read the values of some elements from an HTML page. The HTML is shown below; I can already display it on the screen.

Code screenshot

Dim xmlhttp As Object
Dim url As String
Dim toTranslate As String
Dim htmldoc As HTMLDocument


toTranslate = TextBox1.Value
url = "http://dict.youdao.com/search?q=" & toTranslate & "&keyfrom=dict.index"
Set xmlhttp = CreateObject("MSXML2.XMLHTTP") 'Create the XMLHTTP object

xmlhttp.Open "GET", url, False 'Send the request with GET

xmlhttp.send
'Wait for the response
Do While xmlhttp.readyState <> 4
    DoEvents
Loop
Dim explore As New InternetExplorer


'Set htmldoc =explore.document

MsgBox xmlhttp.responseText

But I want to get the text of each "li" tag inside the elements whose class is "trans-container" (that is, each of the Chinese words).

The content I want to get

I only know the method "getElementsByClassName()", but I don't know how to use it. Thanks for your help!

Zach Young

2 Answers


You need to create an HTMLDocument object from the response and use it for the parsing. As annotated in the code, early binding is required for the method getElementsByClassName to be available. Try something like the following:

Dim url As String
Dim toTranslate As String

toTranslate = TextBox1.Value
' Note: use https:// rather than http://
url = "https://dict.youdao.com/search?q=" & toTranslate & "&keyfrom=dict.index"

' Creating and sending the request:
Dim xmlhttp As Object
Set xmlhttp = CreateObject("MSXML2.XMLHTTP") 'Create the XMLHTTP object

xmlhttp.Open "GET", url, False 'Send the request with GET
xmlhttp.setRequestHeader "User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"
xmlhttp.send ""

' Getting the response
' This needs to be early bound so the method getElementsByClassName is available!
' The required reference is "Microsoft HTML Object Library"
Dim objHTML as HTMLDocument
Set objHTML = New HTMLDocument

objHTML.body.innerHTML = xmlhttp.responseText

' Parsing the response
Dim objTransContainers as Object, objTransContainer as Object
Dim objLis as Object, objLi as Object
Dim retText as String

Set objTransContainers = objHTML.getElementsByClassName("trans-container")

For Each objTransContainer in objTransContainers 
    Set objLis = objTransContainer.getElementsByTagName("li")
    For Each objLi in objLis
        retText = retText & objLi.innerText & vbNewLine
    Next objLi
Next objTransContainer 

MsgBox retText

Alternatively, you can select the elements by tag name only and check the class name in a loop. The advantage is that this method also works with a late-bound HTMLDocument:

' Getting the response:
Dim objHTML as Object
Set objHTML = CreateObject("htmlFile")

' Note: this objHTML.write will not work with early binding! 
' In that case you have to use the .body.innerHTML 
' like in the code sample above.
With objHTML
    .Open
    .write xmlhttp.responseText
    .Close
End With

' Parsing the response
Dim objDivs as Object, objDiv as Object
Dim objLis as Object, objLi as Object
Dim retText as String

Set objDivs = objHTML.getElementsByTagName("div")

For Each objDiv in objDivs
    If objDiv.className = "trans-container" Then
        Set objLis = objDiv.getElementsByTagName("li")
        For Each objLi in objLis
            retText = retText & objLi.innerText & vbNewLine
        Next objLi
    End If
Next objDiv

MsgBox retText
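
One caveat about the class check in the loop above: `className` returns the whole class attribute, so an exact comparison with "trans-container" would miss a div that carries additional class names. Whether that happens on this page is an assumption, not something confirmed here. A minimal sketch of a more tolerant check, using a hypothetical helper `HasClass` that is not part of the original answer:

```vba
' Hypothetical helper (not part of the original answer).
' Returns True when the space-separated class list contains the given class name.
Function HasClass(ByVal classList As String, ByVal token As String) As Boolean
    ' Pad both strings with spaces so that only whole class names match,
    ' e.g. "trans-container" does not match "trans-container-wrapper".
    HasClass = InStr(1, " " & classList & " ", " " & token & " ", vbBinaryCompare) > 0
End Function
```

In the loop, the condition would then read `If HasClass(objDiv.className, "trans-container") Then`. The sketch assumes class names are separated by spaces, which is the usual case.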
GWD
  • Oh thanks, it is a great solution! But I get an error on my computer: error 70 (permission denied) at `Set objHTML = CreateObject("htmlfile")`. How can I solve it? I already added the reference to the MS HTML Object Library, but it still doesn't work. – Zach Young Dec 05 '20 at 15:02
  • This might happen because your website uses http:// and not https://. Try changing it to https:// if the website supports it. See this [question](https://stackoverflow.com/questions/22938194/xmlhttp-request-is-raising-an-access-denied-error). – GWD Dec 05 '20 at 15:20
  • Thanks ! I've tried those answer on this website, but still the same problem. – Zach Young Dec 05 '20 at 15:38
  • This problem is strange; it still doesn't work. It is probably not a variable problem: I checked, and objHTML is the only variable with that name. Does it work on your device? – Zach Young Dec 05 '20 at 16:17
  • It works on my device yes... I don't know what the problem is... The variable can't be it... what if you try to use early binding instead? Have 'Microsoft HTML Object Library' added to your references and replace `Dim objHTML as Object` with `Dim objHTML as HTMLDocument` and `Set objHTML = CreateObject("htmlfile")` with `Set objHTML = New HTMLDocument`? (I'm getting a weird compiler error about the .write method (only when using early binding), that this function is not supported by VBA, but maybe it works for you?) – GWD Dec 05 '20 at 16:20
  • The old problem has disappeared, but a new one has appeared: `Function or interface marked as restricted, or the function uses an Automation type not supported in Visual Basic.` at `.Write xmlhttp.responseText`. – Zach Young Dec 05 '20 at 16:37
  • I edited my answer now, can you try it with `objHTML.body.innerHTML = xmlhttp.responseText` instead of the `.Open` `.Write` `.Close`? – GWD Dec 05 '20 at 16:47
  • Now I get a permission denied error at `Set objHTML = New HTMLDocument`. This problem is very strange and I don't know what causes it; even if I put the declaration on the first line, it still occurs. – Zach Young Dec 05 '20 at 16:56
  • Interesting, unfortunately I can not reproduce your problem :/ Maybe it's your best bet to just parse the response string using the method T.M. describes in his answer without using the getElementsByClassName... – GWD Dec 05 '20 at 17:00
  • No matter what, I should thank you for your help, I will try it. Really appreciated. – Zach Young Dec 05 '20 at 17:05
  • @ZachYoung it might even be worth a new question... The issue with the permission denied error at `Set objHTML = New HTMLDocument` when using the first of the options I posted in my answer... because I tested it now, it works perfectly on my machine and I couldn't find anything regarding your problem on the internet ... – GWD Dec 05 '20 at 17:41
  • @GWD Helpful answer juxtaposing two ways +:) ; fyi - you might be interested in a third approach using `FilterXML()` function. – T.M. Dec 05 '20 at 18:19
  • @T.M. Yes I saw your answer, I like it :) I usually try to avoid WorksheetFunctions to keep some cross-app portability, but often they are very useful and it seems like in this case OP has to go with your solution. Do you have any idea why he could get a permission denied error when creating the HTMLDocument object? – GWD Dec 05 '20 at 18:29
  • @GWD Appreciate your feedback. - As to your question what could be the reason for PermDenied, this post might be of some interest [IE automation permission denied](https://stackoverflow.com/questions/10283496/vba-internet-explorer-automation-permission-denied) – T.M. Dec 05 '20 at 19:10
  • Thank you all. First, in our region, it was midnight, so I have to apologize for the late reply. Just before, I used my classmates' machine to test the code. The code is absolutely correct, it can run. The problem is my machine. Thank you all for your help, they are indeed feasible. – Zach Young Dec 06 '20 at 05:07

Array alternative using FilterXML()

Starting from the received response string, the following demonstrates a way to get the list items via FilterXML(), which is available in Excel 2013 and later.

Function getListItems(ByVal sResponse As String, Optional IsZeroBased As Boolean = False)
'Purpose: assign list items in div trans-container class to 1-dim array
    'XPath search expression
    Dim xp As String
    xp = "//div[@class='trans-container']/ul/li"
    With WorksheetFunction
        'assign list items to 1-based 2-dim array
        Dim listItems: listItems = .FilterXML(sResponse, xp)
        
        'optional Test display in VB Editor's immediate window
        Debug.Print Join(.Transpose(listItems), ", ")
    
        'return <li> items as 1-dim array (optionally 0-based)
        getListItems = .Transpose(listItems)
        If IsZeroBased Then ReDim Preserve getListItems(0 To UBound(getListItems) - 1)
    End With

End Function
T.M.
  • Thanks, but it says that there is an invalid ReDim in `If IsZeroBased Then ReDim Preserve getListItems(0 To UBound(getitems) - 1)`. How to use it or correct it? – Zach Young Dec 05 '20 at 17:09
  • just a typo - `getListItems` (instead of incorrect `getItems`): should be `If IsZeroBased Then ReDim Preserve getListItems(0 To UBound(getListItems) - 1)` @ZachYoung – T.M. Dec 05 '20 at 17:26