I'm trying to read rows from a massive (1.2 GB) JSON file. I want to read attributes without having to define a class first.
I had to redefine my initial question, as I also want to be able to access a nested value by using a single string path like so: facilities.totalSize.value
Because of the change in requirements, I think the best approach is the one suggested by tgolisch, but my code throws an "Object reference not set to an instance of an object." error when I try to access the value.
Dim req As HttpWebRequest = CType(WebRequest.Create("https://www.example.com/sample.json"), HttpWebRequest)
Using resp = req.GetResponse()
    Using stream = resp.GetResponseStream()
        Using reader = New JsonTextReader(New StreamReader(stream))
            While reader.Read()
                If reader.TokenType = JsonToken.StartObject Then
                    Dim jt As JToken = CType(reader.Value, JToken)
                    'here I also tried creating a jt variable like so: Dim jt As JToken = JToken.Load(reader) but getting the same error
                    Log("jt totalSize:" + jt.SelectToken("facilities.totalSize.value").Value(Of Object).ToString())
                End If
                'how can I retrieve the value for "facilities.totalSize.value" attribute here?
            End While
        End Using
    End Using
End Using
sample.json
[
    {
        "countryid":1,
        "price":2997,
        "facilities":{
            "totalSize":{
                "value":80
            }
        }
    },
    {
        "countryid":1,
        "price":250,
        "facilities":{
            "totalSize":{
                "value":30
            }
        }
    }
]
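For context on the error in the code above: in Json.NET, `reader.Value` is Nothing while the reader sits on a `StartObject` token, so the `CType` on that line yields Nothing and the following `SelectToken` call fails. `SelectToken` also returns Nothing when the path does not exist relative to the loaded token, which is one way the same error could appear with `JToken.Load`. Below is a minimal sketch of loading each array item instead, assuming Newtonsoft.Json is referenced. The inline string is a stand-in for the downloaded sample.json, and the module name is made up for illustration.

```vb
Imports System.IO
Imports Newtonsoft.Json
Imports Newtonsoft.Json.Linq

Module StreamArrayItemsSketch
    Sub Main()
        ' Assumption: inline stand-in for sample.json, whose root is an array of objects.
        Dim json As String =
            "[{""countryid"":1,""price"":2997,""facilities"":{""totalSize"":{""value"":80}}}," &
            "{""countryid"":1,""price"":250,""facilities"":{""totalSize"":{""value"":30}}}]"

        Using reader As New JsonTextReader(New StringReader(json))
            While reader.Read()
                ' reader.Value is Nothing on StartObject, so load the whole object here.
                If reader.TokenType = JsonToken.StartObject Then
                    ' Load consumes the object; the reader ends up on its EndObject token.
                    Dim item As JObject = JObject.Load(reader)

                    ' SelectToken returns Nothing for a missing path, hence the check.
                    Dim size As JToken = item.SelectToken("facilities.totalSize.value")
                    If size IsNot Nothing Then
                        Console.WriteLine("totalSize: " & size.ToString())
                    End If
                End If
            End While
        End Using
    End Sub
End Module
```

Only one item is held in memory at a time, which is the point of this pattern for a large file. It relies on every `StartObject` the loop sees being an array item, which holds only when the root is an array.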
UPDATE 1
I'd like the code to be flexible and also work on this format, in listings.json:
{
    "generatedAt":"2022-05-02 02:03:25",
    "listings":[
        {
            "countryid":1,
            "publishdate":"2022-04-02 02:03:25",
            "location":{
                "neighborhood":"Finthen",
                "city":"Mainz",
                "country":"Germany"
            },
            "facilities":{
                "bedrooms":{
                    "value":2
                },
                "totalSize":{
                    "value":"100"
                }
            }
        },
        {
            "countryid":2,
            "publishdate":"2022-02-02 02:03:24",
            "location":{
                "neighborhood":"Ubers",
                "city":"NYC",
                "country":"USA"
            },
            "facilities":{
                "bedrooms":{
                    "value":3
                },
                "totalSize":{
                    "value":"150"
                }
            }
        }
    ],
    "count":1077
}
I have many differently formatted JSON files because I work with different data partners, some provide feeds that contain tokens called "products", others called "listings", others have no wrapper objects at all. My question now includes samples of both.
I hope that makes it clear where I'm coming from.
I got your first code example to work on sample.json, but as you said, that code does not work on listings.json.
So I have to go with example 2 and use the regex. I tried it on listings.json:
Using stream = resp.GetResponseStream()
    Dim regex As Regex = New Regex("^\[[0-9]+\]\.facilities\.totalSize\.value$", RegexOptions.CultureInvariant Or RegexOptions.Compiled Or RegexOptions.Singleline)
    For Each value As Decimal In JsonExtensions.DeserializeItemsByPath(Of Decimal)(stream, regex)
        Log("totalSize", String.Format("totalSize: {0}", value.ToString))
    Next
End Using
But it does not return any values, why?
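One way to investigate this is to print the paths Json.NET reports while reading and test them against the pattern. The sketch below assumes the `DeserializeItemsByPath` helper matches its regex against `JsonReader.Path`; the helper's source is not shown here, so that is an assumption. The inline string is a cut-down stand-in for listings.json.

```vb
Imports System.IO
Imports System.Text.RegularExpressions
Imports Newtonsoft.Json

Module ShowPathsSketch
    Sub Main()
        ' Assumption: cut-down stand-in for listings.json (root is an object, not an array).
        Dim json As String =
            "{""listings"":[{""facilities"":{""totalSize"":{""value"":""100""}}}],""count"":1077}"

        ' Same pattern as in the question: it expects paths that start with an array index.
        Dim rootArrayPattern As New Regex("^\[[0-9]+\]\.facilities\.totalSize\.value$",
                                          RegexOptions.CultureInvariant)

        Using reader As New JsonTextReader(New StringReader(json))
            While reader.Read()
                ' Only report primitive values; container tokens are skipped.
                If reader.TokenType = JsonToken.String OrElse reader.TokenType = JsonToken.Integer Then
                    Console.WriteLine(reader.Path & " matches: " &
                                      rootArrayPattern.IsMatch(reader.Path).ToString())
                End If
            End While
        End Using
    End Sub
End Module
```

With a wrapper object around the array, the reported paths begin with the wrapper's property name (here `listings`) rather than with `[0]`, so a pattern anchored on `^\[` would not be expected to match them.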
UPDATE 2
Weird, https://codebeautify.org/jsonviewer showed my earlier listings.json as valid.
I updated listings.json with fewer fields so it's less cumbersome to read, and I validated the format before pasting it here.
I incorrectly assumed that code for selecting a single field would be enough for me to extend it myself to select multiple fields.
Anyway, each listing has dozens of fields, but I kept it to a few in my sample JSON for legibility. What I want to extract from each listing are fields of different types, i.e.:
countryid, of type integer
country, of type string
publishdate, of type date
And there might be more types in the future, like decimal, boolean etc. I'm not sure what your comment about "bedrooms" means as my regex example was about "totalSize".
Here's what I have now as code, where each For Each block returns a value (except for kindLabel, because I think the string value conflicts with the Decimal value selector), so my regex works. But I don't want to For Each through each occurrence of totalSize; I want to For Each through each listing, get the values totalSize, bedrooms, kindLabel etc., and THEN go to the next item.
So I need to process each individual listing, get its attributes/tokens, and then move to the next listing.
Dim regex As Regex = New Regex("^listings\[[0-9]+\]\.facilities\.totalSize\.value$",
    RegexOptions.CultureInvariant Or RegexOptions.Compiled Or RegexOptions.Singleline)
For Each value As Decimal In JsonExtensions.DeserializeItemsByPath(Of Decimal)(stream, regex)
    ReportError("totalSize", String.Format("totalSize: {0}", value.ToString))
Next
regex = New Regex("^listings\[[0-9]+\]\.facilities\.bedrooms\.value$", RegexOptions.CultureInvariant Or RegexOptions.Compiled Or RegexOptions.Singleline)
For Each value As Decimal In JsonExtensions.DeserializeItemsByPath(Of Decimal)(stream, regex)
    ReportError("bedrooms", String.Format("bedrooms: {0}", value.ToString))
Next
regex = New Regex("^listings\[[0-9]+\]\.kindLabel$", RegexOptions.CultureInvariant Or RegexOptions.Compiled Or RegexOptions.Singleline)
For Each value As Decimal In JsonExtensions.DeserializeItemsByPath(Of Decimal)(stream, regex)
    ReportError("kindLabel", String.Format("kindLabel: {0}", value.ToString))
Next
I hope the use case and sample data are up to par now :) Thanks for your effort once again!
UPDATE 3
Thanks again @dbc. I think I can summarize the thread so far, and what I've learned, in this last question: "how can I read multiple values from a single listing object before moving to the next?"
Using stream = resp.GetResponseStream()
    'HERE I WANT TO READ VALUE of `facilities.totalSize.value` from first listing object, so "100"
    Dim regex As Regex = New Regex("^listings\[[0-9]+\]\.facilities\.totalSize\.value$",
        RegexOptions.CultureInvariant Or RegexOptions.Compiled Or RegexOptions.Singleline)
    'HERE I WANT TO READ VALUE of `facilities.bedrooms.value` from first listing object, so "2"
    regex = New Regex("^listings\[[0-9]+\]\.facilities\.bedrooms\.value$", RegexOptions.CultureInvariant Or RegexOptions.Compiled Or RegexOptions.Singleline)
    'HERE I WANT TO READ VALUE of `publishdate` from first listing object, so "2022-04-02 02:03:25"
    regex = New Regex("^listings\[[0-9]+\]\.publishdate$", RegexOptions.CultureInvariant Or RegexOptions.Compiled Or RegexOptions.Singleline)
    'now move to next listing
End Using
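To make the intended flow concrete, here is a rough sketch of one possible shape for it, not a tested solution: make a single pass over the stream, match the path of each listing object (not each leaf value), load just that object, and read several typed values from it before moving on. It assumes Newtonsoft.Json. `ReadValue` and the module name are hypothetical helpers written for this illustration, and the inline string stands in for the response stream.

```vb
Imports System.Globalization
Imports System.IO
Imports System.Text.RegularExpressions
Imports Newtonsoft.Json
Imports Newtonsoft.Json.Linq

Module PerListingSketch
    ' Hypothetical helper: value at a path inside one item, or the fallback when it is missing.
    Function ReadValue(Of T)(item As JToken, path As String, fallback As T) As T
        Dim token As JToken = item.SelectToken(path)
        If token Is Nothing OrElse token.Type = JTokenType.Null Then Return fallback
        Return token.ToObject(Of T)()
    End Function

    Sub Main()
        ' Assumption: cut-down stand-in for listings.json.
        Dim json As String =
            "{""generatedAt"":""2022-05-02 02:03:25"",""listings"":[" &
            "{""countryid"":1,""publishdate"":""2022-04-02 02:03:25""," &
            """facilities"":{""bedrooms"":{""value"":2},""totalSize"":{""value"":""100""}}}," &
            "{""countryid"":2,""publishdate"":""2022-02-02 02:03:24""," &
            """facilities"":{""bedrooms"":{""value"":3},""totalSize"":{""value"":""150""}}}" &
            "],""count"":1077}"

        ' Path of each listing object; for a root-level array this would be "^\[[0-9]+\]$".
        Dim itemPath As New Regex("^listings\[[0-9]+\]$", RegexOptions.CultureInvariant)

        Using reader As New JsonTextReader(New StringReader(json))
            ' Keep date-like strings as strings so they can be parsed explicitly below.
            reader.DateParseHandling = DateParseHandling.None

            While reader.Read()
                If reader.TokenType = JsonToken.StartObject AndAlso itemPath.IsMatch(reader.Path) Then
                    ' Only this one listing is materialized in memory.
                    Dim listing As JObject = JObject.Load(reader)

                    Dim totalSize As Decimal = ReadValue(Of Decimal)(listing, "facilities.totalSize.value", 0D)
                    Dim bedrooms As Integer = ReadValue(Of Integer)(listing, "facilities.bedrooms.value", 0)
                    Dim published As String = ReadValue(Of String)(listing, "publishdate", Nothing)

                    Console.WriteLine(String.Format("totalSize: {0}, bedrooms: {1}", totalSize, bedrooms))
                    If published IsNot Nothing Then
                        Dim publishDate As Date = Date.ParseExact(published, "yyyy-MM-dd HH:mm:ss",
                                                                  CultureInfo.InvariantCulture)
                        Console.WriteLine("publishdate: " & publishDate.ToString("s"))
                    End If
                    ' The next Read() call moves on towards the following listing.
                End If
            End While
        End Using
    End Sub
End Module
```

A single pass matters because an HTTP response stream is forward-only, so running one loop per field over the same stream cannot rewind it. The item-path pattern is the only part that would need to change per data partner (for example `products` instead of `listings`).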