Complementing Santiago Squarzon's helpful answer, I've looked for a generalized approach to get JSON properties without knowing the names of their parents in advance.
Pure PowerShell solution
I wrote a little helper function Expand-JsonProperties
that "flattens" the JSON. This allows us to use simple non-recursive Where-Object
queries for finding properties, regardless how deeply nested they are.
Function Expand-JsonProperties {
[CmdletBinding()]
param (
[Parameter(Mandatory, ValueFromPipeline)] [PSCustomObject] $Json,
[Parameter()] [string] $Path,
[Parameter()] [string] $Separator = '/'
)
process {
$Json.PSObject.Properties.ForEach{
$propertyPath = if( $Path ) { "$Path$Separator$($_.Name)" } else { $_.Name }
if( $_.Value -is [PSCustomObject] ) {
Expand-JsonProperties $_.Value $propertyPath
}
else {
[PSCustomObject]@{
Path = $propertyPath
Value = $_.Value
}
}
}
}
}
Given your XML sample we can now write:
$inputFile = Get-Content -Path $inputFilePath -Raw | ConvertFrom-Json
$inputFile | Expand-JsonProperties | Where-Object Path -like '*/reply_urls'
Output:
Path Value
---- -----
New/App1/reply_urls {https://testapp1url1}
New/App2/reply_urls {https://testapp2url1, https://testapp2url2}
New/App3/reply_urls {https://testapp3url1, https://testapp3url2, https://testapp3url3}
Optimized solution using inline C#
Out of curiosity I've tried out a few different algorithms, including ones that don't require recursion.
One of the fastest algorithms is written in inline C# but can be called through an easy to use PowerShell wrapper cmdlet (see below). The C# code basically works the same as the pure PowerShell function but turned out to be more than 9 times faster!
This requires at least PowerShell 7.x.
# Define inline C# class that does most of the work.
Add-Type -TypeDefinition @'
using System;
using System.Collections.Generic;
using System.Management.Automation;
public class ExpandPSObjectOptions {
public bool IncludeObjects = false;
public bool IncludeLeafs = true;
public string Separator = "/";
}
public class ExpandPSObjectRecursive {
public static IEnumerable< KeyValuePair< string, object > > Expand(
PSObject inputObject, string parentPath, ExpandPSObjectOptions options ) {
foreach( var property in inputObject.Properties ) {
var propertyPath = parentPath + options.Separator + property.Name;
if( property.Value is PSObject ) {
if( options.IncludeObjects ) {
yield return new KeyValuePair< string, object >( propertyPath, property.Value );
}
// Recursion
foreach( var prop in Expand( (PSObject) property.Value, propertyPath, options ) ) {
yield return prop;
}
continue;
}
if( options.IncludeLeafs ) {
yield return new KeyValuePair< string, object >( propertyPath, property.Value );
}
}
}
}
'@
# A PowerShell cmdlet that wraps the C# class.
Function Expand-PSObjectRecursive {
[CmdletBinding()]
param (
[Parameter(Mandatory, ValueFromPipeline)] [PSObject] $InputObject,
[Parameter()] [string] $Separator = '/',
[Parameter()] [switch] $IncludeObjects,
[Parameter()] [switch] $ExcludeLeafs
)
process {
$options = [ExpandPSObjectOptions]::new()
$options.IncludeObjects = $IncludeObjects.ToBool()
$options.IncludeLeafs = -not $ExcludeLeafs.ToBool()
$options.Separator = $Separator
[ExpandPSObjectRecursive]::Expand( $InputObject, '', $options )
}
}
The C# code is wrapped by a normal PowerShell cmdlet, so you can basically use it in the same way as the pure PowerShell function, with minor syntactic differences:
$inputFile | Expand-PSObjectRecursive | Where-Object Key -like '*/reply_urls'
I've added some other useful parameters that allows you to define the kind of elements that the cmdlet should output:
$inputFile |
Expand-PSObjectRecursive -IncludeObjects -ExcludeLeafs |
Where-Object Key -like '*/App*'
Parameter -IncludeObjects
also includes PSObject
properties from the input, while -ExcludeLeafs
excludes the value-type properties, resulting in this output:
Key Value
--- -----
/New/App1 @{reply_urls=System.Object[]}
/New/App2 @{reply_urls=System.Object[]}
/New/App3 @{reply_urls=System.Object[]}
While the table format output in itself is not too useful, you could use the output objects for further processing, e. g.:
$apps = $inputFile |
Expand-PSObjectRecursive -IncludeObjects -ExcludeLeafs |
Where-Object Key -like '*/App*'
$apps.Value.reply_urls
Prints:
https://testapp1url1
https://testapp2url1
https://testapp2url2
https://testapp3url1
https://testapp3url2
https://testapp3url3
Implementation notes:
The C# code uses the yield return
statement to return properties one-by-one, similar to what we are used from PowerShell pipelines. In inline C# code we can't use the pipeline directly (and it wouldn't be advisable for performance reasons), but yield return
allows us to smoothly interface with PowerShell code that can be part of a pipeline.
PowerShell automatically inserts the return values of the C# function into the success output stream, one by one. The difference to returning an array from C# is that we may exit early at any point without having to process the whole input object first (e. g. using Select -First n
). This would actually cancel the C# function, something that would otherwise only be possible using multithreading (with all its complexities)! But all of this is just single-threaded code.