
I have an HTML file that contains many "var"s in a section delimited by "<!--" and "-->":

<!--
var g_stickyTableHeadersScrollVersion=1;... ;var g_priceListInfo={...,"arrProducts":[{"name":"...","type":"...","arrVariants":[{"name":"...","priceGroup":"..."},{"name":"...","priceGroup":"..."},...,{"name":"...","defaultSlabSize":[...,...],"priceGroup":"..."}],{"name":"...","price":...,"isSlabPricing":1}]}...}
--> 

I'm at a loss as to how to get the values of the arrProducts array of the g_priceListInfo variable.

After many (really a lot of) different attempts, I thought I could use the querySelector method of HTMLDocument as follows:

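    Dim url As String
    url = "C:\Users\...\myHTMLFile.html"

    ' Early binding: needs a reference to Microsoft Scripting Runtime
    Dim oFSO As FileSystemObject
    Dim oFS As Object, sText As String

    Set oFSO = New FileSystemObject ' or CreateObject("Scripting.FileSystemObject")
    Set oFS = oFSO.OpenTextFile(url)
    ' ReadAll reads the whole file in one call, so no loop is needed
    If Not oFS.AtEndOfStream Then sText = oFS.ReadAll()
    oFS.Close

    ' Declared early bound (MSHTML) but instantiated late bound
    Dim doc As HTMLDocument
    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = sText

    Dim ele As Object
    ' This is the call whose argument I can't get right
    Set ele = doc.querySelector("g_priceListInfo.arrProducts.name")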

But, provided that this is the right path, I couldn't find the correct syntax to make it work.

Thanks in advance for any help.

EDIT: adding snapshots of the relevant HTML page code view (three images, not reproduced here).

EDIT 19/08/2022: I finally made it by means of brute-force string manipulation. Then I found the no-ScriptControl & no-GitHub JSON parser solution at https://stackoverflow.com/questions/6627652/parsing-json-in-excel-vba, which gave me the same results as my brute-force method.

I'd point anyone with the same need to that solution.
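
For readers with the same need, the string-manipulation idea mentioned in the edit above can be sketched roughly as follows. This is only an illustrative sketch, not the code that was actually used: the function name `ExtractArrProducts` is hypothetical, and it assumes that the text contains the key `"arrProducts":` followed by a JSON array, and that string literals use double quotes with backslash escapes. It walks the text from the first `[` after the key and counts brackets until the matching `]`.

```vba
Option Explicit

' Hypothetical helper (illustrative only): returns the text of the array that
' follows "arrProducts": in sText, including the enclosing [ and ],
' or an empty string if the key or a balanced array cannot be found.
Public Function ExtractArrProducts(ByVal sText As String) As String
    Const MARKER As String = """arrProducts"":"
    Dim startPos As Long, i As Long, depth As Long
    Dim ch As String, inString As Boolean, escaped As Boolean

    startPos = InStr(1, sText, MARKER, vbBinaryCompare)
    If startPos = 0 Then Exit Function

    ' Assumption: the first "[" after the key opens the array
    startPos = InStr(startPos + Len(MARKER), sText, "[")
    If startPos = 0 Then Exit Function

    For i = startPos To Len(sText)
        ch = Mid$(sText, i, 1)
        If inString Then
            ' Inside a string literal: brackets are ignored, \" does not end it
            If escaped Then
                escaped = False
            ElseIf ch = "\" Then
                escaped = True
            ElseIf ch = """" Then
                inString = False
            End If
        Else
            Select Case ch
                Case """"
                    inString = True
                Case "["
                    depth = depth + 1
                Case "]"
                    depth = depth - 1
                    If depth = 0 Then
                        ExtractArrProducts = Mid$(sText, startPos, i - startPos + 1)
                        Exit Function
                    End If
            End Select
        End If
    Next i
End Function
```

Calling `ExtractArrProducts(sText)` on the file contents read above should then yield a string that can be handed to whichever JSON parser is chosen, such as one of those discussed at the linked question.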

user3598756
  • Use a simple regex pattern on sText and deserialize potentially with a JSON parser or use string funcs depending on what you want to do with the collection. – QHarr Aug 17 '22 at 04:41
  • @QHarr, thanks for your answer, I was hoping you would come by, actually. I did some text processing but it didn't work (maybe the string is too long). And I don't practice Regex. Isn't there any built-in function like GetElementBySomething? – user3598756 Aug 17 '22 at 06:28
  • The above is likely within a script tag which means we would need to see the relevant html housing this content. Depending on the attributes you might be able to write a css selector based on that or we might need to see more html. You have declared as early bound MSHTML.HTMLDocument but instantiated as late bound HTMLFile object (presumably to permit the write of content). If you are not getting complaints about unsupported querySelector then the early bound is the actual object you are working with and css selectors are an option. Then need later string manipulation on node content. – QHarr Aug 17 '22 at 07:49
  • @QHarr, it's all within "" tags. The relevant HTML pattern within those tags has been shown (can't show more for privacy rules). The early bound/late bound is an error, and I'm not getting any complaint about unsupported querySelector, while I'm getting those for the arguments. I updated the post with an image of the HTML page code view – user3598756 Aug 18 '22 at 07:35
  • It is in a script tag. I can see the closing after the commented (html) bit of interest. You might try something like ` – QHarr Aug 18 '22 at 15:56
  • @QHarr, you're right, there is that closing tag, indeed. There is not its opening one, though: weird! Thanks for the hints. Meanwhile I had gone along the brute-force string processing way. Should I not succeed, I'd try the Regex you suggested, after finding out what "take group 1" means... – user3598756 Aug 18 '22 at 17:41
  • https://stackoverflow.com/questions/22542834/how-to-use-regular-expressions-regex-in-microsoft-excel-both-in-cell-and-loops You might have needed to line wrap in dev tools. The opening ` – QHarr Aug 18 '22 at 19:30
  • 1
    @QHarr, thanks again for the hints/links. Meanwhile I ended my brute force string manipulation method successfully. So I checked out the link you gave me and made a quick test that faild (the regex pattern needs refinement, but I wouldn't know how). But then I made a quick search for a JSON parser and found the no-ScriptControl & no-GitHub JSon parse solution at https://stackoverflow.com/questions/6627652/parsing-json-in-excel-vba, which gave me the same results of my brute force method – user3598756 Aug 19 '22 at 08:27
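
The regex route suggested in the comments, and what "take group 1" means, can be illustrated with a minimal sketch. The pattern actually suggested is truncated above, so the pattern below is an assumption rather than the commenter's; the function name `GetPriceListInfoJson` is hypothetical as well. It assumes that the assignment is written as `var g_priceListInfo={...};` and that the sequence `}` followed by `;` does not occur earlier inside the object. "Group 1" is the first parenthesised part of the pattern, which the late-bound `VBScript.RegExp` object exposes as `SubMatches(0)`.

```vba
Option Explicit

' Hypothetical helper (illustrative only): returns the object literal assigned
' to g_priceListInfo, or an empty string if the assumed pattern does not match.
Public Function GetPriceListInfoJson(ByVal sText As String) As String
    Dim re As Object, matches As Object

    Set re = CreateObject("VBScript.RegExp")
    re.Global = False
    re.IgnoreCase = False
    ' Assumed pattern: capture lazily from the opening { up to the first };
    ' [\s\S] is used because . does not match line breaks in VBScript.RegExp
    re.Pattern = "var\s+g_priceListInfo\s*=\s*(\{[\s\S]*?\})\s*;"

    Set matches = re.Execute(sText)
    If matches.Count > 0 Then
        ' "Group 1" of the pattern is SubMatches(0) of the first match
        GetPriceListInfoJson = matches(0).SubMatches(0)
    End If
End Function
```

If the real file does not end the assignment with `;`, the pattern would need adjusting. A bracket-counting string scan is more tolerant of that case, which is one reason a regex alone may need refinement here.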

0 Answers