I'm having a bit of trouble parsing a string from an XML document. Here is an example of a tag that I have:

<author>   {"picture":"http:\/\/images.ak.instagram.com\/profiles\/profile_315469453_75sq_1381947801.jpg","name":"Natural Places And Views","username":"alhamadikh","link":"http:\/\/instagram.com\/alhamadikh"}
</author>

I've been able to parse all of the individual tags correctly from the XML that I have (things like the following: title/image/link) except for the "picture", "name", and "username" inside of the author tag. What do I need to code to extract the information inside of the author tag individually?

Here is a sample of a line of code that works when I need to extract a title:

FindComponent("MainText"&i).text = x.selectsinglenode("title").text

Here is what I tried to extract the picture inside of author (unsuccessfully):

FindComponent("AvatarImage"&i).text = x.selectsinglenode("author/picture").text

Any help or advice would be appreciated. I've been googling for hours and can't seem to find the right answer. I'm using VBScript to extract the information.

Thank you!

Aldina
  • That's [JSON](http://en.wikipedia.org/wiki/JSON) inside an XML node. Seems not all too easy to deal with it (full-fledged) in vbscript - [see here](http://stackoverflow.com/questions/12153925/decode-encode-json-with-vbscript) – KekuSemau Feb 26 '14 at 19:16
  • Aha! Thank you for explaining that. I will try and see if the developer can further divide the JSON part of the "author" XMLnode into their own nodes. – Aldina Feb 26 '14 at 19:45

2 Answers

The .text of the author node is just that: text. You can't use XML to parse it further. You can use a RegExp to cut/split quoted key-value pairs:

Option Explicit

Dim sAT : sAT =   "    {""picture"":""http:\/\/images.ak.instagram.com\/profiles\/profile_315469453_75sq_1381947801.jpg"",""name"":""Natural Places And Views"",""username"":""alhamadikh"",""link"":""http:\/\/instagram.com\/alhamadikh""}"
Dim reX : Set reX = New RegExp
reX.Global  = True
reX.Pattern = """([^""]+)"":""([^""]+)"""
Dim oMTS : Set oMTS = reX.Execute(sAT)
Dim oMT
For Each oMT In oMTS
    WScript.Echo oMT.SubMatches(0)
    WScript.Echo " ", oMT.SubMatches(1)
Next

output:

cscript 22050687.vbs
picture
  http:\/\/images.ak.instagram.com\/profiles\/profile_315469453_75sq_1381947801.jpg
name
  Natural Places And Views
username
  alhamadikh
link
  http:\/\/instagram.com\/alhamadikh

For further processing it would be a good idea to put the key-value pairs into a dictionary.
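
A minimal sketch of that idea, assuming a flat JSON object whose values are all plain strings without escaped quotes (as in the sample). The name dicAuthor is only illustrative, and the Replace call is an addition that turns the JSON-escaped "\/" back into "/" so the URLs are usable:

```vbscript
Option Explicit

' Sample input copied from the question; in the real script this would
' come from x.SelectSingleNode("author").Text
Dim sAT : sAT = "{""picture"":""http:\/\/images.ak.instagram.com\/profiles\/profile_315469453_75sq_1381947801.jpg"",""name"":""Natural Places And Views"",""username"":""alhamadikh"",""link"":""http:\/\/instagram.com\/alhamadikh""}"

Dim reX : Set reX = New RegExp
reX.Global  = True
reX.Pattern = """([^""]+)"":""([^""]+)"""

' dicAuthor (illustrative name) maps each JSON key to its string value
Dim dicAuthor : Set dicAuthor = CreateObject("Scripting.Dictionary")
Dim oMT
For Each oMT In reX.Execute(sAT)
    ' Assigning to a missing key adds it; "\/" is un-escaped to "/"
    dicAuthor(oMT.SubMatches(0)) = Replace(oMT.SubMatches(1), "\/", "/")
Next

WScript.Echo dicAuthor("picture")
WScript.Echo dicAuthor("name")
```

Once the dictionary is filled, each field can be looked up by name, e.g. dicAuthor("username"). Nested objects, numbers, or values containing escaped quotes would need a real JSON parser.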

Ekkehard.Horner

As @KekuSemau pointed out, the text inside the <author> tag is JSON, not XML, so you can't use XPath for selecting the image URL. You don't necessarily need a full-blown JSON parser if you just want to extract one URL from a JSON object, though. A regular expression might suffice for that as long as the requirements don't get any more complex.

Set re = New RegExp
re.Pattern = """picture"":""(.*?)"""

For Each m In re.Execute(x.SelectSingleNode("author").text)
  WScript.Echo m.Submatches(0)
Next
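
If name and username are needed as well, the same pattern can be wrapped in a small function. This is only a sketch: JsonStringValue is a hypothetical helper name, it assumes the key contains no regex metacharacters and the value contains no escaped quotes, and it also turns the JSON-escaped "\/" back into "/":

```vbscript
Option Explicit

' Hypothetical helper: returns the string value stored under sKey in a
' flat JSON object, or "" when the key is not found.
Function JsonStringValue(sJson, sKey)
    Dim re : Set re = New RegExp
    ' Assumes sKey contains no regex metacharacters
    re.Pattern = """" & sKey & """\s*:\s*""(.*?)"""
    JsonStringValue = ""
    Dim oMTS : Set oMTS = re.Execute(sJson)
    If oMTS.Count > 0 Then
        ' Un-escape "\/" so URLs come back in their usual form
        JsonStringValue = Replace(oMTS(0).SubMatches(0), "\/", "/")
    End If
End Function

' Sample input copied from the question
Dim sAuthor : sAuthor = "{""picture"":""http:\/\/images.ak.instagram.com\/profiles\/profile_315469453_75sq_1381947801.jpg"",""name"":""Natural Places And Views"",""username"":""alhamadikh"",""link"":""http:\/\/instagram.com\/alhamadikh""}"

WScript.Echo JsonStringValue(sAuthor, "picture")
WScript.Echo JsonStringValue(sAuthor, "username")
```

In the script from the question, the call would presumably look like FindComponent("AvatarImage"&i).text = JsonStringValue(x.SelectSingleNode("author").text, "picture").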
Ansgar Wiechers