I feel like this is something simple and I'm just not getting it, and I'm not sure if my explanation is great. I have the JSON file below, and I want to get each app (App1, App2, App3) under the "New" object.

In the script line below I'm essentially trying to replace "App2" with some variable. I guess I'm trying to get App2 as an object without knowing its name. And I realize that the foreach loop doesn't do anything right now.

Write-Host $object.Value.App2.reply_urls

JSON:

{
  "New": {
    "App1": {
      "reply_urls": [
        "https://testapp1url1"
      ]
    },
    "App2": {
      "reply_urls": [
        "https://testapp2url1",
        "https://testapp2url2"
      ]
    },
    "App3": {
      "reply_urls": [
        "https://testapp3url1",
        "https://testapp3url2",
        "https://testapp3url3"
      ]
    }
  },
  "Remove": {
      "object_id": [
        ""
      ]
  }
}

Script:

$inputFile = Get-Content -Path $inputFilePath -Raw | ConvertFrom-Json
foreach ($object in $inputFile.PsObject.Properties)
{
    switch ($object.Name)
    {
        New
        {
            foreach ($app in $object.Value)
            {
               Write-Host $object.Value.App2.reply_urls
               # essentially want to replace this line with something like
               # Write-Host $app.reply_urls
            }
        }

        Remove
        {
        }
    }
}

Output:

https://testapp2url1 https://testapp2url2
swtto
  • You want to output the names of the Apps (App1, App2, App3) or the reply_urls of each App? It's not clear – Santiago Squarzon Apr 02 '22 at 18:17
  • I thought it'd be easier to explain using the reply_urls above, but I guess I just really need to get the app name and can work with that from there – swtto Apr 02 '22 at 18:22

2 Answers

You can access the object's PSObject.Properties to get the property Names and property Values, which you can use to iterate over.

For example:

foreach($obj in $json.New.PSObject.Properties) {
    $out = [ordered]@{ App = $obj.Name }
    foreach($url in $obj.Value.PSObject.Properties) {
        $out[$url.Name] = $url.Value
    }
    [pscustomobject] $out
}

Produces the following output:

App  reply_urls
---  ----------
App1 {https://testapp1url1}
App2 {https://testapp2url1, https://testapp2url2}
App3 {https://testapp3url1, https://testapp3url2, https://testapp3url3}

If you just want to output the URLs, you can skip the construction of the PSCustomObject:

foreach($obj in $json.New.PSObject.Properties) {
    foreach($url in $obj.Value.PSObject.Properties) {
        $url.Value
    }
}
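
Applied to the loop from the question, the same idea might look like the sketch below. The inline JSON is a stand-in for the file in the question (an assumption, so the snippet runs on its own), and `$appNames` is just an illustrative variable. Each `$app` is a property under `New`, so `$app.Name` is the app name and `$app.Value.reply_urls` holds its URLs.

```powershell
# Inline copy of the question's JSON, used instead of Get-Content so the
# snippet is self-contained.
$json = @'
{
  "New": {
    "App1": { "reply_urls": [ "https://testapp1url1" ] },
    "App2": { "reply_urls": [ "https://testapp2url1", "https://testapp2url2" ] },
    "App3": { "reply_urls": [ "https://testapp3url1", "https://testapp3url2", "https://testapp3url3" ] }
  },
  "Remove": { "object_id": [ "" ] }
}
'@ | ConvertFrom-Json

# Each $app is a property object: .Name is the key, .Value the nested object.
$appNames = foreach ($app in $json.New.PSObject.Properties) {
    Write-Host "$($app.Name): $($app.Value.reply_urls -join ', ')"
    $app.Name   # collected into $appNames
}
```

`Write-Host` only prints to the console, so `$appNames` ends up holding just the names, which can then be used for further lookups.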
Santiago Squarzon
  • Thanks, is there a way to do this without directly referencing the key name `New` in the 1st line? As in `($json.New.PSObject.Properties)` – swtto Apr 02 '22 at 18:41
  • Yeah, it would be another layer of `PSObject.Properties`, assuming the Property Name you want to target has the name `New`: `$json.PSObject.Properties.Item('New').Value` but again, here you know the name is `New`. – Santiago Squarzon Apr 02 '22 at 18:53
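
A minimal sketch of that "another layer of `PSObject.Properties`" idea, without naming `New` at all: walk the top-level properties, then keep only those children that are themselves objects. The shortened inline JSON and the names `$section`, `$child`, `$rows`, `Section`, `App` and `ReplyUrls` are illustrative assumptions, not part of the answer above.

```powershell
# Shortened stand-in for the question's JSON (assumption for a runnable snippet).
$json = '{"New":{"App1":{"reply_urls":["https://testapp1url1"]},"App2":{"reply_urls":["https://testapp2url1","https://testapp2url2"]}},"Remove":{"object_id":[""]}}' |
    ConvertFrom-Json

$rows = foreach ($section in $json.PSObject.Properties) {        # New, Remove
    foreach ($child in $section.Value.PSObject.Properties) {     # App1, App2, object_id
        # Only nested objects count as apps; arrays such as object_id are skipped.
        if ($child.Value -is [System.Management.Automation.PSCustomObject]) {
            [pscustomobject]@{
                Section   = $section.Name
                App       = $child.Name
                ReplyUrls = $child.Value.reply_urls
            }
        }
    }
}

$rows
```

This assumes exactly two levels (section, then app); for arbitrary nesting a recursive or stack-based walk is the better fit.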

Complementing Santiago Squarzon's helpful answer, I've looked for a generalized approach to get JSON properties without knowing the names of their parents in advance.

Pure PowerShell solution

I wrote a little helper function Expand-JsonProperties that "flattens" the JSON. This allows us to use simple non-recursive Where-Object queries for finding properties, regardless of how deeply nested they are.

Function Expand-JsonProperties {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory, ValueFromPipeline)] [PSCustomObject] $Json,
        [Parameter()] [string] $Path,
        [Parameter()] [string] $Separator = '/'
    )
    
    process {
        $Json.PSObject.Properties.ForEach{

            $propertyPath = if( $Path ) { "$Path$Separator$($_.Name)" } else { $_.Name }

            if( $_.Value -is [PSCustomObject] ) {
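                # Forward the separator so a custom one also applies to nested levels.
                Expand-JsonProperties -Json $_.Value -Path $propertyPath -Separator $Separator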
            }
            else {
                [PSCustomObject]@{
                    Path   = $propertyPath
                    Value  = $_.Value
                }
            }
        }
    }
}

Given your JSON sample we can now write:

$inputFile = Get-Content -Path $inputFilePath -Raw | ConvertFrom-Json

$inputFile | Expand-JsonProperties | Where-Object Path -like '*/reply_urls'

Output:

Path                Value
----                -----
New/App1/reply_urls {https://testapp1url1}
New/App2/reply_urls {https://testapp2url1, https://testapp2url2}
New/App3/reply_urls {https://testapp3url1, https://testapp3url2, https://testapp3url3}

Optimized solution using inline C#

Out of curiosity I've tried out a few different algorithms, including ones that don't require recursion.
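
As a rough illustration of what a non-recursive variant can look like in pure PowerShell, here is a stack-based sketch. It is not the code that was measured; the function name `Expand-JsonPropertiesIterative` and the shortened sample JSON are assumptions made for this example.

```powershell
function Expand-JsonPropertiesIterative {
    param(
        [Parameter(Mandatory)] [psobject] $Json,
        [string] $Separator = '/'
    )

    # Each stack entry pairs an object that still has to be visited with its path prefix.
    $stack = [System.Collections.Generic.Stack[object]]::new()
    $stack.Push(@{ Path = ''; Node = $Json })

    while ($stack.Count -gt 0) {
        $item = $stack.Pop()
        foreach ($prop in $item.Node.PSObject.Properties) {
            $path = if ($item.Path) { "$($item.Path)$Separator$($prop.Name)" } else { $prop.Name }

            if ($prop.Value -is [System.Management.Automation.PSCustomObject]) {
                # Nested object: defer it instead of recursing.
                $stack.Push(@{ Path = $path; Node = $prop.Value })
            }
            else {
                # Leaf (string, number, array, ...): emit it.
                [pscustomobject]@{ Path = $path; Value = $prop.Value }
            }
        }
    }
}

# Shortened stand-in for the question's JSON.
$sample = '{"New":{"App1":{"reply_urls":["https://testapp1url1"]},"App2":{"reply_urls":["https://testapp2url1","https://testapp2url2"]}},"Remove":{"object_id":[""]}}' |
    ConvertFrom-Json

$found = Expand-JsonPropertiesIterative -Json $sample | Where-Object Path -like '*/reply_urls'
$found
```

Because a stack is last-in, first-out, the properties come out in a different order than with the recursive function; sort by `Path` if the order matters.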

One of the fastest algorithms is written in inline C# but can be called through an easy-to-use PowerShell wrapper cmdlet (see below). The C# code works basically the same way as the pure PowerShell function but turned out to be more than 9 times faster!

This requires at least PowerShell 7.x.

# Define inline C# class that does most of the work.

Add-Type -TypeDefinition @'
using System;
using System.Collections.Generic;
using System.Management.Automation;

public class ExpandPSObjectOptions {
    public bool IncludeObjects = false;
    public bool IncludeLeafs = true;
    public string Separator = "/";
}

public class ExpandPSObjectRecursive {

    public static IEnumerable< KeyValuePair< string, object > > Expand( 
            PSObject inputObject, string parentPath, ExpandPSObjectOptions options ) {

        foreach( var property in inputObject.Properties ) {
            
            var propertyPath = parentPath + options.Separator + property.Name;

            if( property.Value is PSObject ) {
                if( options.IncludeObjects ) {
                    yield return new KeyValuePair< string, object >( propertyPath, property.Value );
                }                

                // Recursion
                foreach( var prop in Expand( (PSObject) property.Value, propertyPath, options ) ) {
                    yield return prop;
                }

                continue;
            }

            if( options.IncludeLeafs ) {
                yield return new KeyValuePair< string, object >( propertyPath, property.Value );
            }
        }
    } 
}
'@

# A PowerShell cmdlet that wraps the C# class.

Function Expand-PSObjectRecursive {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory, ValueFromPipeline)] [PSObject] $InputObject,
        [Parameter()] [string] $Separator = '/',
        [Parameter()] [switch] $IncludeObjects,
        [Parameter()] [switch] $ExcludeLeafs
    )

    process {
        $options = [ExpandPSObjectOptions]::new()
        $options.IncludeObjects = $IncludeObjects.ToBool()
        $options.IncludeLeafs   = -not $ExcludeLeafs.ToBool()
        $options.Separator      = $Separator
        
        [ExpandPSObjectRecursive]::Expand( $InputObject, '', $options )
    }
}

The C# code is wrapped by a normal PowerShell cmdlet, so you can basically use it in the same way as the pure PowerShell function, with minor syntactic differences:

$inputFile | Expand-PSObjectRecursive | Where-Object Key -like '*/reply_urls'

I've added some other useful parameters that allow you to define the kind of elements that the cmdlet should output:

$inputFile | 
    Expand-PSObjectRecursive -IncludeObjects -ExcludeLeafs | 
    Where-Object Key -like '*/App*'

Parameter -IncludeObjects also includes PSObject properties from the input, while -ExcludeLeafs excludes the leaf (non-PSObject) properties, resulting in this output:

Key       Value
---       -----
/New/App1 @{reply_urls=System.Object[]}
/New/App2 @{reply_urls=System.Object[]}
/New/App3 @{reply_urls=System.Object[]}

While the table format output in itself is not too useful, you could use the output objects for further processing, e.g.:

$apps = $inputFile | 
        Expand-PSObjectRecursive -IncludeObjects -ExcludeLeafs | 
        Where-Object Key -like '*/App*'

$apps.Value.reply_urls

Prints:

https://testapp1url1
https://testapp2url1
https://testapp2url2
https://testapp3url1
https://testapp3url2
https://testapp3url3

Implementation notes:

The C# code uses the yield return statement to return properties one by one, similar to what we are used to from PowerShell pipelines. In inline C# code we can't use the pipeline directly (and it wouldn't be advisable for performance reasons), but yield return allows us to smoothly interface with PowerShell code that can be part of a pipeline.

PowerShell automatically inserts the return values of the C# function into the success output stream, one by one. The difference from returning an array from C# is that we may exit early at any point without having to process the whole input object first (e.g. using Select -First n). This would actually cancel the C# function, something that would otherwise only be possible using multithreading (with all its complexities)! But all of this is just single-threaded code.
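
The early-exit behaviour can be observed with a small, separate experiment. The class `LazySequenceDemo` and its `Produced` counter are hypothetical names made up for this sketch; they are unrelated to the classes above.

```powershell
Add-Type -TypeDefinition @'
using System.Collections.Generic;

public static class LazySequenceDemo {
    // Counts how many items the iterator has actually produced.
    public static int Produced = 0;

    public static IEnumerable<int> Numbers( int limit ) {
        for( int i = 1; i <= limit; i++ ) {
            Produced++;
            yield return i;
        }
    }
}
'@

[LazySequenceDemo]::Produced = 0

# Select-Object -First stops the pipeline once it has received three items.
$firstThree = [LazySequenceDemo]::Numbers( 1000000 ) | Select-Object -First 3

$firstThree                    # 1, 2, 3
[LazySequenceDemo]::Produced   # stays far below 1000000 because enumeration stopped early
```

Had `Numbers` built and returned a complete array instead, all one million items would have been produced before `Select-Object` saw the first one.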

zett42
  • That's nice! You could do it with a [`Collections.Queue` too](https://gist.github.com/santysq/f96c5ed76f0000b101f92b27164a83ec#file-unroll-ps1-L4-L25) :) – Santiago Squarzon Apr 02 '22 at 20:45
  • @SantiagoSquarzon Thanks, I know you are a fan of `Collections.Queue`. ;-) Do you think it would be faster? I might refactor to `Queue` as an exercise. – zett42 Apr 02 '22 at 20:47
  • I believe it is faster, not sure if it's faster than a recursive method though (the last example from that gist); argument binding on methods is much faster than parameter binding on functions. The problem with the `Queue` is that properties will unroll unordered. You should post it as an exercise, it would for sure be adding to your awesome answer :) – Santiago Squarzon Apr 02 '22 at 20:50
  • @SantiagoSquarzon I've implemented both a queue-based as well as a stack-based alternative, see [this gist](https://gist.github.com/zett42/a63c329bb7be71cded331436dc483edd). I might try the class unroller another time. – zett42 Apr 02 '22 at 22:27
  • @SantiagoSquarzon I've slightly simplified the stack-based solution, but the recursive solution looks so much cleaner. Good reason to try and implement the class unroller too. – zett42 Apr 02 '22 at 22:51
  • @SantiagoSquarzon I've added recursive function and recursive class method to the gist. Also added code to measure performance. Recursive class method and stack are equally fast, while recursive function is 2.7 times slower! Didn't expect such a big difference. Queue is slightly slower with factor 1.15. – zett42 Apr 03 '22 at 11:05
  • @SantiagoSquarzon More variants, using generics. Generic queue clearly wins, while recursive function is 3.6 times slower! See detailed measurements at the bottom of the Gist. – zett42 Apr 03 '22 at 12:14
  • Well researched, seems like you have embarked on a journey hehe – Santiago Squarzon Apr 03 '22 at 17:19
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/243565/discussion-between-zett42-and-santiago-squarzon). – zett42 Apr 03 '22 at 20:28
  • Is there a way to reverse this and export json the way it looked before? If I try to change one property for instance. – Jason Foglia Apr 04 '23 at 15:56
  • @Jason You may use my function [`Set-TreeValue`](https://stackoverflow.com/a/69978942/7571258). E.g. this should be round-trip: `$result = @{}; $json | Expand-JsonProperties | ForEach-Object { Set-TreeValue -Hashtable $result -Path $_.Path -Value $_.Value -PathSeparator '/' }`. Please ask a new question if you need more assistance. – zett42 Apr 04 '23 at 21:12