2

Using VBA 7.1 on a MS Access for Office 365 MSO version 16.0.x 64 bit I have a reference to Microsoft HTML Object Library (mshtml.dll) 11.0.x set

I have the following code

    Dim myHTMLDoc As New HTMLDocument
    Dim myEnvironTag As HTMLUnknownElement
    
    myHTMLDoc.body.innerHTML = "some text <environ>EnvironmentURL</environ> some other text"
    
    For Each myEnvironTag In myHTMLDoc.getElementsByTagName("environ")
        MsgBox myEnvironTag.innerHTML
        MsgBox myEnvironTag.innerText
    Next myEnvironTag

From the string <environ>EnvironmentURL</environ>

I am trying to return the string EnvironmentURL

In the code sample above both MessageBoxes are returning zero length strings.

Any idea how to return the string inside custom tags like this?

Thanks in advance

merkman
  • 89
  • 1
  • 12

2 Answers2

1

This workaround works but... not sure why it doesn't work without the replace function changing it into an anchor tag?

    Dim myHTMLDoc As New HTMLDocument
    Dim myEnvironTag As HTMLAnchorElement
    
    myHTMLDoc.body.innerHTML = Replace(Replace("some text <environ>EnvironmentURL</environ> some other text", "<environ>", "<a>"), "</environ>", "</a>")
    
    For Each myEnvironTag In myHTMLDoc.getElementsByTagName("a")
        MsgBox myEnvironTag.innerHTML
        MsgBox myEnvironTag.innerText
    Next myEnvironTag

merkman
  • 89
  • 1
  • 12
  • 1
    `myHTMLDoc.body.children.length` returns 2 - one for the opening tag and one for the closing tag. The TypeName for each is HTMLUnknownElement. That should inherit from HTMLElement, and it probably does, except that it's not recognizing the closing tag so there's nothing "inner" about it. Change to `For Each myEnvironTag In myHTMLDoc.getElementsByTagName("/environ")` and you get the same result. Sorry I don't have an actual solution. – Dick Kusleika Jul 21 '20 at 16:29
0

I was able to run the code snippet from your question and got the desired results ("EnvironmentURL"), so I'm not sure what is the cause here.

Maybe this will depend on your version of Excel, but since HTMLDocument was designed to handle HTML code and "environ" is not a supported element in HTML, maybe an XML parser would be better suited in this task. (More info here)

With that in mind, here's an example of code to extract environ tags and print the text value between the tags :

Dim MyXml As String
'In the XML string, we need to make sure that there is a valid xml starting and closing tag surrounding our string.
MyXml = "<xml>some text <environ>EnvironmentURL</environ> some other text</xml>"

'Make sure to include the Microsoft XML Librairy to your VBA Project
Dim objXML As MSXML2.DOMDocument60 'or DOMDocument for older versions of Excel
Set objXML = New MSXML2.DOMDocument60

If Not objXML.LoadXML(MyXml) Then
    Err.Raise objXML.parseError.ErrorCode, , objXML.parseError.reason
End If
 
Dim elem As IXMLDOMElement
For Each elem In objXML.getElementsByTagName("environ")
    Debug.Print elem.text
Next
DecimalTurn
  • 3,243
  • 3
  • 16
  • 36