In VBScript, I have this XML document:

<m>
    <x>300001
        <y>10000294
            <z>2120300003</z>
            <z>2120300004</z>
        </y>
        <y>10000295
            <z>2120300003</z>
            <z>2120300006</z>
        </y>
        <y>10000296
            <z>2120300001</z>
        </y>
        <y>10000302
            <z>2120300002</z>
        </y>
    </x>
    <x>300048
        <y>10000294
            <z>2120300002</z>
        </y>
    </x>
</m>

and I'd like to get the value of each <y> element itself, without anything under it, into an array so I can compare it against another data set later on in a loop. I've done a simple test that outputs the value, and the nodeValue property is always null.

My current code is:

dim matrixArray, y, outputText
set matrixArray = oXml.documentElement.getElementsByTagName("y")

outputText = ""   

for each y in matrixArray 
    outputText = outputText & ", " & y.nodeValue
next

In the above, y.nodeValue is null. If I use y.text, it outputs the text of the node plus all the text in the <y> element's subtree. y.nodeTypedValue outputs the same as y.text, i.e. 1000029421203000032120300004, 1000029521203000032120300006, 100002962120300001, 100003022120300002, 100002942120300002

Note that I cannot change the structure of the XML document and must comply with it.

How do I just get the text of each of the <y> elements without getting any of the <z> element text under them? i.e. 10000294, 10000295, 10000296, 10000302, 10000294

Krylion
  • Seems there is no solution to this except to restructure the XML so that there is no text and child nodes under the elements. The duplicate answers indicate that the text should become an attribute if there are child nodes under it as well. Unfortunately, this isn't possible with the XML structure I posed in the question as it comes from another source that I cannot control. If I find a way, I'll answer it here in the comments. – Krylion Jun 22 '21 at 02:05
  • The first linked post has a way: get `y.text` (which contains the `z` text, `txt = y.Text`) and replace each `z.text` in it (`txt = Replace(txt, z.text, "")`). The only change is that, as you have multiple `z`'s, the `Replace` should be done within a `for each z in y.SelectNodes("z")` loop. – Flakes Jun 22 '21 at 04:21
  • Thanks for spotting that, but it wouldn't work in certain cases. Firstly, it'll be slow with large data sets or deeper subtrees. But the bigger fault is that part of the concatenated text may happen to match a Z value. E.g. if Y is 1212 and you have one Z which is 1212, you'll end up with an empty string; or if Y is 1050023 and you have two Z, 207101 and 23207, you'll end up with 10500101 instead of 1050023. – Krylion Jun 23 '21 at 06:46
  • @Krylion Isn't the text under `<y>` classed as another child `Node` called `#text`? – user692942 Jun 23 '21 at 14:15
  • @user692942 is right, `y.childNodes(0).text` does give the required value. – Flakes Jun 24 '21 at 07:47
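
A minimal sketch of the `#text` child node idea from the last two comments, assuming MSXML 6 (`MSXML2.DOMDocument.6.0`) is installed and the script runs under Windows Script Host. The XML is a shortened copy of the document in the question, loaded inline; the names `child`, `ownText` and `NODE_TEXT` are made up for illustration. Instead of relying on the text always being at `childNodes(0)`, it collects only the text nodes that are direct children of each `<y>`, so the `<z>` text is never included.

```vbscript
Option Explicit

Const NODE_TEXT = 3 ' DOM nodeType value for a #text node

Dim oXml, y, child, ownText, outputText

' Assumption: MSXML 6 is available; the XML is a shortened sample from the question.
Set oXml = CreateObject("MSXML2.DOMDocument.6.0")
oXml.async = False
oXml.loadXML "<m><x>300001" & _
             "<y>10000294<z>2120300003</z><z>2120300004</z></y>" & _
             "<y>10000295<z>2120300003</z><z>2120300006</z></y>" & _
             "</x></m>"

outputText = ""
For Each y In oXml.documentElement.getElementsByTagName("y")
    ' Concatenate only the text nodes directly under <y>, skipping <z> elements.
    ownText = ""
    For Each child In y.childNodes
        If child.nodeType = NODE_TEXT Then ownText = ownText & child.nodeValue
    Next

    ' Strip line breaks, tabs and surrounding spaces left over from indentation.
    ownText = Trim(Replace(Replace(Replace(ownText, vbCr, ""), vbLf, ""), vbTab, ""))

    If outputText <> "" Then outputText = outputText & ", "
    outputText = outputText & ownText
Next

WScript.Echo outputText
```

Because this never touches the `<z>` text, it should avoid the false matches described above for the `Replace` approach. If the text is guaranteed to come first, `y.childNodes(0).text` as suggested in the comment is the shorter form.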

0 Answers