I'm making a foray into the world of JSON parsing and NewtonSoft and I'm confused, to say the least.

Take the below PowerShell script:

$json = @"
{
    "Array1": [
        "I am string 1 from array1",
        "I am string 2 from array1"
    ],   

    "Array2": [
        {
           "Array2Object1Str1": "Object in list, string 1",
           "Array2Object1Str2": "Object in list, string 2"
        }
    ]

}
"@

#The newtonSoft way
$nsObj = [Newtonsoft.Json.JsonConvert]::DeserializeObject($json, [Newtonsoft.Json.Linq.JObject])

$nsObj.GetType().fullname #Type = Newtonsoft.Json.Linq.JObject

$nsObj[0] #Returns nothing. Why?

$nsObj.Array1 #Again nothing. Maybe because it contains no key:value pairs?
$nsObj.Array2 #This does return, maybe because has object with kv pairs

$nsObj.Array2[0].Array2Object1Str1 #Returns nothing. Why? but...
$nsObj.Array2[0].Array2Object1Str1.ToString() #Cool. I get the string this way.

$nsObj.Array2[0] #1st object has a Path property of "Array2[0].Array2Object1Str1" Great!

foreach( $o in $nsObj.Array2[0].GetEnumerator() ){
    "Path is: $($o.Path)"
    "Parent is: $($o.Parent)"
} #??? Why can't I see the Path property like when just output $nsObj.Array2[0] ???
#How can I find out what the root parent (Array2) is for a property? Is property even the right word?

I'd like to be able to find the name of the root parent for any given position. So above, I'd like to know that the item I'm looking at (Array2Object1Str1) belongs to the Array2 root parent.

I think I'm not understanding some fundamentals here. Is it possible to determine the root parent? Also, any help in understanding my comments in the script would be great. Namely why I can't return things like path or parent, but can see it when I debug in VSCode.

ScubaManDan
  • I realize this is not answering the question, but is there a reason why you're not using the default ConvertFrom-Json cmdlet [link](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/convertfrom-json?view=powershell-6)? I was able to parse your json successfully using `$Obj = ConvertFrom-Json $Json` (I had to delete a trailing comma at the end of your json string first). – Nathan W Sep 27 '19 at 20:50
  • You can't directly index a `JObject` by **integer** index. See: [How to get first key from JObject?](https://stackoverflow.com/a/31415452/3744182). – dbc Sep 27 '19 at 21:27
  • @NathanWong, because there are some things that can't be done with the cmdlet, so I want to better understand NewtonSoft. What I'm trying to achieve is nothing like the above. I would like to get the JSON into a class-defined object, rather than just the default PS object. Not only that, but with nested classes, e.g. I'll have a class for the items in the arrays, one type for Array1 and another for Array2 objects, all contained within the main root class. – ScubaManDan Sep 30 '19 at 08:15
  • @dbc: Unlike in C#, in PowerShell the integer-based indexing actually _should_ work, because PowerShell exposes members implemented via interfaces as directly callable type members; however, it currently doesn't, presumably due to [this bug](https://github.com/PowerShell/PowerShell/issues/10654) (as of PowerShell Core 7.0.0-preview.4). – mklement0 Sep 30 '19 at 19:55
  • @mklement0 - the directly-implemented [`JObject.Item Property (Object)`](https://www.newtonsoft.com/json/help/html/P_Newtonsoft_Json_Linq_JObject_Item.htm), which throws, wouldn't take preference over the explicitly implemented [`JToken IList.this[int index]`](https://github.com/JamesNK/Newtonsoft.Json/blob/master/Src/Newtonsoft.Json/Linq/JContainer.cs#L936) on the base class `JContainer`? `JObject` implements a huge bunch of `Item` accessors as well as `GetEnumerator()` methods - directly, explicitly, and inherited from the base - making it hard to see which one actually gets called. – dbc Sep 30 '19 at 20:19
  • @mklement0 - OK I read the bug - it actually says what I said in the above comment. I'm just less certain it's a bug as opposed to a design issue, which is a good reason to report such a bug. Good job there. – dbc Sep 30 '19 at 20:23
  • Thanks, @dbc; as for throwing: PowerShell quietly eats the exception when you call with `[...]`, but not with `.Item(...)`, which is also problematic: see https://github.com/PowerShell/PowerShell/issues/10655 – mklement0 Sep 30 '19 at 20:25

2 Answers

dbc's answer contains helpful background information, and makes it clear that calling the NewtonSoft Json.NET library from PowerShell is cumbersome.

Given PowerShell's built-in support for JSON parsing - via the ConvertFrom-Json and ConvertTo-Json cmdlets - there is usually no reason to resort to third-party libraries (directly[1]), except in the following cases:

  • When performance is paramount.
  • When the limitations of PowerShell's JSON parsing must be overcome (lack of support for empty key names and keys that differ in letter case only).
  • When you need to work with the Json.NET types and their methods rather than with the method-less "property-bag" [pscustomobject] instances ConvertFrom-Json constructs.

While working with NewtonSoft's Json.NET directly in PowerShell is awkward, it is manageable, if you observe a few rules:

  • Lack of visible output doesn't necessarily mean that there isn't any output at all:

    • Due to a bug in PowerShell (as of v7.0.0-preview.4), [JValue] instances and [JProperty] instances containing them produce no visible output by default; access their (strongly typed) .Value property instead (e.g., $nsObj.Array1[0].Value or $nsProp.Value.Value (sic))

    • To output the string representation of a [JObject] / [JArray] / [JProperty] / [JValue] instance, do not rely on output as-is (e.g., $nsObj); use explicit stringification with .ToString() instead (e.g., $nsObj.ToString()). While string interpolation (e.g., "$nsObj") does generally work, it doesn't with [JValue] instances, due to the above-mentioned bug.

    • [JObject] and [JArray] objects by default show a list of their elements' instance properties (implied Format-List applied to the enumeration of the objects); you can use the Format-* cmdlets to shape output; e.g., $nsObj | Format-Table Path, Type.

      • Due to another bug (which may have the same root cause), as of PowerShell Core 7.0.0-preview.4, default output for [JObject] instances is actually broken in cases where the input JSON contains an array (prints error format-default : Target type System.Collections.IEnumerator is not a value type or a non-abstract class. (Parameter 'targetType')).
  • To numerically index into a [JObject] instance, i.e. to access properties by index rather than by name, use the following idiom: @($nsObj)[<n>], where <n> is the numerical index of interest.

    • $nsObj[<n>] actually should work, because, unlike C#, PowerShell exposes members implemented via interfaces as directly callable type members, so the numeric indexer that JObject implements via the IList<JToken> interface should be accessible, but isn't, presumably due to [this bug](https://github.com/PowerShell/PowerShell/issues/10654) (as of PowerShell Core 7.0.0-preview.4).

    • The workaround based on @(...), PowerShell's array-subexpression operator, forces enumeration of a [JObject] instance to yield an array of its [JProperty] members, which can then be accessed by index; note that this approach is simple, but not efficient, because enumeration and construction of an aux. array occurs; however, given that a single JSON object (as opposed to an array) typically doesn't have large numbers of properties, this is unlikely to matter in practice.
      A reflection-based solution that accesses the IList<JToken> interface's numeric indexer is possible, but may even be slower.

    • Note that additional .Value-based access may again be needed to print the result (or to extract the strongly typed property value).

  • Generally, do not use the .GetEnumerator() method; [JObject] and [JArray] instances are directly enumerable.

    • Keep in mind that PowerShell may automatically enumerate such instances in contexts where you don't expect it, notably in the pipeline: when you send a [JObject] to the pipeline, it is its constituent [JProperty]s that are sent instead, individually.
  • Use something like @($nsObj.Array1).Value to extract the values of an array of primitive JSON values (strings, numbers, ...) - i.e., [JValue] instances - as an array.

The following demonstrates these techniques in context:

$json = @"
{
    "Array1": [
        "I am string 1 from array1",
        "I am string 2 from array1"
    ],

    "Array2": [
        {
           "Array2Object1Str1": "Object in list, string 1",
           "Array2Object1Str2": "Object in list, string 2"
        }
    ]

}
"@

# Deserialize the JSON text into a hierarchy of nested objects.
# Note: You can omit the target type to let Newtonsoft.Json infer a suitable one.
$nsObj = [Newtonsoft.Json.JsonConvert]::DeserializeObject($json)
# Alternatively, you could more simply use:
#   $nsObj = [Newtonsoft.Json.Linq.JObject]::Parse($json)

# Access the 1st property *as a whole* by *index* (index 0).
@($nsObj)[0].ToString()

# Ditto, with (the typically used) access by property *name*.
$nsObj.Array1.ToString()

# Access a property *value* by name.
$nsObj.Array1[0].Value

# Get an *array* of the *values* in .Array1.
# Note: This assumes that the array elements are JSON primitives ([JValue] instances).
@($nsObj.Array1).Value

# Access a property value of the object contained in .Array2's first element by name:
$nsObj.Array2[0].Array2Object1Str1.Value


# Enumerate the properties of the object contained in .Array2's first element
# Do NOT use .GetEnumerator() here - enumerate the object *itself*
foreach($o in $nsObj.Array2[0]){
  "Path is: $($o.Path)"
  "Parent is: $($o.Parent.ToString())"
}
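
As for finding the "root parent" of a given item: a minimal sketch, assuming the Newtonsoft.Json assembly is already loaded in the session, is to walk up from the token with Json.NET's AncestorsAndSelf() method and take the outermost [JProperty]. Get-RootPropertyName is a hypothetical helper name (not part of Json.NET), and the sample JSON is a trimmed-down version of the one above:

```powershell
# Hypothetical helper: returns the name of the top-level property that
# contains the given token, or nothing if the token is the root itself.
function Get-RootPropertyName {
    param(
        [Parameter(Mandatory)]
        [Newtonsoft.Json.Linq.JToken] $Token
    )

    # AncestorsAndSelf() yields the token first, then each parent up to the
    # root, so the LAST JProperty seen is the top-level one.
    $top = $Token.AncestorsAndSelf() |
        Where-Object { $_ -is [Newtonsoft.Json.Linq.JProperty] } |
        Select-Object -Last 1

    # Compare against $null explicitly; a JProperty is itself enumerable.
    if ($null -ne $top) { $top.Name }
}

# Trimmed-down sample (assumption: same shape as the JSON above).
$sample = [Newtonsoft.Json.Linq.JObject]::Parse(
    '{ "Array2": [ { "Array2Object1Str1": "Object in list, string 1" } ] }'
)

# SelectToken() takes the same path syntax that the .Path property reports.
$leaf = $sample.SelectToken('Array2[0].Array2Object1Str1')

Get-RootPropertyName $leaf   # -> Array2
```

For the root object itself the function returns nothing, because no [JProperty] sits above it.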

[1] PowerShell Core - but not Windows PowerShell - currently (v7) actually uses NewtonSoft's Json.NET behind the scenes.

mklement0

You have a few separate questions here:

  1. $nsObj[0] #Returns nothing. Why?

    This is because nsObj corresponds to a JSON object, and, as explained in this answer to How to get first key from JObject?, JObject does not directly support accessing properties by integer index (rather than property name).

    JObject does, however, implement IList<JToken> explicitly so if you could upcast nsObj to such a list you could access properties by index -- but apparently it's not straightforward in PowerShell to call an explicitly implemented method. As explained in the answers to How can I call explicitly implemented interface method from PowerShell? it's necessary to do this via reflection.

    First, define the following function:

    Function ChildAt([Newtonsoft.Json.Linq.JContainer]$arg1, [int]$arg2) 
    {
        $property = [System.Collections.Generic.IList[Newtonsoft.Json.Linq.JToken]].GetProperty("Item")
        $item =  $property.GetValue($arg1, @([System.Object]$arg2))
    
        return $item
    }
    

    And then you can do:

    $firstItem = ChildAt $nsObj 0
    

    Try it online here.

  2. #??? Why can't I see the Path property like when just output $nsObj.Array2[0] ???

    The problem here is that JObject.GetEnumerator() does not return what you think it does. Your code assumes it returns the JToken children of the object, when in fact it is declared as

    public IEnumerator<KeyValuePair<string, JToken>> GetEnumerator()
    

    Since KeyValuePair<string, JToken> doesn't have the properties Path or Parent your output method fails.

    JObject does implement interfaces like IList<JToken> and IEnumerable<JToken>, but it does so explicitly, and as mentioned above calling the relevant GetEnumerator() methods would require reflection.

    Instead, use the base class method JContainer.Children(). This method works for both JArray and JObject and returns the immediate children in document order:

    foreach( $o in $nsObj.Array2[0].Children() ){
        "Path is: $($o.Path)"
        "Parent is: $($o.Parent)"
    }
    

    Try it online here.

  3. $nsObj.Array1 #Again nothing. Maybe because it contains no key:value pairs?

    Actually this does return the value of Array1, if I do

    $nsObj.Array1.ToString() 
    

    the JSON corresponding to the value of Array1 is displayed. The real issue seems to be that PowerShell doesn't know how to automatically print a JArray with JValue contents -- or even a simple, standalone JValue. If I do:

    $jvalue = New-Object Newtonsoft.Json.Linq.JValue 'my jvalue value'
    '$jvalue'             #Nothing output
    $jvalue
    '$jvalue.ToString()'  #my jvalue value
    $jvalue.ToString()
    

    Then the output is:

    $jvalue
    $jvalue.ToString()
    my jvalue value
    

    Try it online here and, relatedly, here.

    Thus the lesson is: when printing a JToken hierarchy in PowerShell, always use ToString().

    As to why printing a JObject produces some output while printing a JArray does not, I can only speculate. JToken implements the interface IDynamicMetaObjectProvider, which is also implemented by PSObject; possibly something in the details of how this is implemented for JObject, but not for JValue or JArray, is compatible with PowerShell's printing code.

dbc
  • @dbc: Great background info; re printing (as of PowerShell Core 7.0.0-preview.4 / Windows PowerShell v5.1): `JObject` and `JArray` instances do produce visible output, but `JValue` instances don't, due to [this bug](https://github.com/PowerShell/PowerShell/issues/10652); in PowerShell _Core_ `JObject` instances containing arrays even cause an exception on printing to the display; see [this bug](https://github.com/PowerShell/PowerShell/issues/10650). It looks like both symptoms have the same root cause. – mklement0 Sep 30 '19 at 15:07
  • Thanks again dbc. I found this very helpful in improving my understanding. I really appreciate it. I wish I could mark two answers. This would have been enough, but mklement0's answer really expanded on a few things I could incorporate into my module. I'll be referring to the page many times to come I'm sure. – ScubaManDan Oct 01 '19 at 16:23