Absolute sort keys vs. relative sort direction
The problem we're bumping into is that each [ScriptBlock]
passed to Sort-Object
only receives a single input value at a time and returns a key value on which to sort, but with each index having a variable number of sub-indices we can't predict to how many levels we need to compare when there's values we haven't yet seen. What we need is a way to define how two values should be sorted relative to each other.
Fortunately, .NET, upon which (Windows) PowerShell is based, defines the [IComparer[]]
interface that is used by various methods to determine the sort order of values. Its lone method, Compare()
, is passed two values and returns whether they are equal or, otherwise, which one comes before the other. All we have to do is provide our own (simple) implementation of [IComparer[]]
and then we can pass it to a built-in sorting method.
[IComparer[]]
implementation, usage, and test output
[HierarchicalIndexComparer]
: a .NET [IComparer[]]
for index strings
# Source: https://stackoverflow.com/a/71237565/150605
class HierarchicalIndexComparer : System.Collections.Generic.Comparer[String]
{
<#
Implements Comparer[String].Compare()
See https://learn.microsoft.com/dotnet/api/system.collections.generic.comparer-1.compare
When $x -lt $y, returns negative integer
When $x -eq $y, returns 0
When $x -gt $y, returns positive integer
#>
[Int32] Compare([String] $x, [String] $y)
{
# Split each index into components converted to integers with validation
[Int32[]] $xValues = [HierarchicalIndexComparer]::GetComponentValues($x)
[Int32[]] $yValues = [HierarchicalIndexComparer]::GetComponentValues($y)
[Int32] $componentsToCompare = [Math]::Min($xValues.Length, $yValues.Length)
for ($i = 0; $i -lt $componentsToCompare; $i++)
{
[Int32] $componentCompareResult = $xValues[$i].CompareTo($yValues[$i])
# Sort $x and $y by the current component values if they are not equal
if ($componentCompareResult -ne 0)
{
return $componentCompareResult
}
# Otherwise, continue with the next component
}
# The first $componentsToCompare elements of $x and $y are equal
# Sort $x and $y by their component count
return $xValues.Length.CompareTo($yValues.Length)
}
hidden static [Int32[]] GetComponentValues([String] $index)
{
return [Int32[]] (
$index.Split('.').ForEach(
{
if ($_.Length -lt 1)
{
throw "Index string ""$index"" contains an empty sub-index."
}
[Int32] $value = -1
# Leading zeroes will be removed by parsing and not considered when comparing components
if (-not [Int32]::TryParse($_, [System.Globalization.NumberStyles]::None, [System.Globalization.CultureInfo]::InvariantCulture, [Ref] $value))
{
throw "Sub-index ""$_"" of string ""$index"" contains non-digit characters."
}
return $value
}
)
)
}
}
You can see I used a PowerShell class to provide an implementation of the [IComparer[]]
interface for sorting index strings; it derives from the recommended [Comparer[]]
class. See the last section of this answer for an explanation of the logic used by [HierarchicalIndexComparer]
.
Once defined the class
can then be referenced as...
[HierarchicalIndexComparer]
...and instantiated with...
New-Object -TypeName 'HierarchicalIndexComparer'
...or...
[HierarchicalIndexComparer]::new()
Sorting test index strings with a [HierarchicalIndexComparer]
Now that we have our own [IComparer[]]
implementation we can pass an instance of it to a sorting method so it's able to sort index strings. LINQ is a .NET technology that allows you to perform operations on sequences in much the same way as the Group-Object
, Select-Object
, and Where-Object
cmdlets do with PowerShell pipelines. LINQ provides two methods, OrderBy()
and OrderByDescending()
, to perform primary sorting of sequences. After defining some test index strings we just need to pass them and our comparer to one of those methods to get sorted output...
[String[]] $initial = 1, 10, 100 |
ForEach-Object -Process { "$_", "$_.0", "$_.0.0", "$_.0.0.0", "$_.0.1", "$_.1", "$_.1.0", "$_.1.1" }
# Shuffle the elements into a "random" order that is the same between runs
[String[]] $shuffled = Get-Random -Count $initial.Length -InputObject $initial -SetSeed 12345
[String[]] $sorted = [System.Linq.Enumerable]::OrderBy(
$shuffled, # The values to be ordered
[Func[String, String]] { # Selects the key by which to order each value
param($value)
return $value # Return the value as its own key
},
(New-Object -TypeName 'HierarchicalIndexComparer') # The comparer to perform the ordering
) | ForEach-Object -Process {
# Just for demonstration purposes to show that further pipeline elements can be used after sorting
return $_
}
for ($index = 0; $index -lt $initial.Length; $index++)
{
[PSCustomObject] @{
'#' = $index
'$initial' = $initial[$index]
'$shuffled' = $shuffled[$index]
'$sorted' = $sorted[$index]
}
}
The advantage of using [Enumerable]::OrderBy()
with a custom [IComparer[]]
is that it will provide proper sorting with one pass through your index strings, plus you can pipe the output into subsequent cmdlets.
Briefly, the three parameters being passed to OrderBy()
are...
- The sequence of index strings to be sorted
- The "key" on which each index is sorted
- We want each index string to be sorted on the string itself (since
[HierarchicalIndexComparer]
determines the ordering of index [String]
instances), so we just return
the same value that was passed to us; this is similar to ... | Sort-Object -Property { $_ }
- See Can LINQ be used in PowerShell? for more information on what's going on here
- An instance of our custom index comparer
You can also use the Array::Sort()
static method, although that sorts the array passed to it in-place and doesn't return anything for a pipeline to process. Here I'll create a copy of the array first so it can be sorted separately...
# Create a copy of the $shuffled array named $sorted
$sorted = New-Object -TypeName 'System.String[]' -ArgumentList $shuffled.Length
[Array]::Copy($shuffled, $sorted, $shuffled.Length)
[Array]::Sort(
$sorted, # The array to be sorted
(New-Object -TypeName 'SO71232189.HierarchicalIndexComparer') # The comparer with which to sort it
)
Sorted result of test index strings using [HierarchicalIndexComparer]
The above script produces three arrays showing the different transformations of the index collection...
# |
$initial |
$shuffled |
$sorted |
0 |
"1" |
"100.1" |
"1" |
1 |
"1.0" |
"10.0.0.0" |
"1.0" |
2 |
"1.0.0" |
"100.0.1" |
"1.0.0" |
3 |
"1.0.0.0" |
"1.0.0.0" |
"1.0.0.0" |
4 |
"1.0.1" |
"100.0" |
"1.0.1" |
5 |
"1.1" |
"10.0.0" |
"1.1" |
6 |
"1.1.0" |
"1.1" |
"1.1.0" |
7 |
"1.1.1" |
"100.1.1" |
"1.1.1" |
8 |
"10" |
"1.0" |
"10" |
9 |
"10.0" |
"100.0.0" |
"10.0" |
10 |
"10.0.0" |
"1" |
"10.0.0" |
11 |
"10.0.0.0" |
"100.1.0" |
"10.0.0.0" |
12 |
"10.0.1" |
"10.1.0" |
"10.0.1" |
13 |
"10.1" |
"10.0" |
"10.1" |
14 |
"10.1.0" |
"10.1.1" |
"10.1.0" |
15 |
"10.1.1" |
"100.0.0.0" |
"10.1.1" |
16 |
"100" |
"10.1" |
"100" |
17 |
"100.0" |
"1.1.1" |
"100.0" |
18 |
"100.0.0" |
"1.1.0" |
"100.0.0" |
19 |
"100.0.0.0" |
"10" |
"100.0.0.0" |
20 |
"100.0.1" |
"1.0.1" |
"100.0.1" |
21 |
"100.1" |
"10.0.1" |
"100.1" |
22 |
"100.1.0" |
"1.0.0" |
"100.1.0" |
23 |
"100.1.1" |
"100" |
"100.1.1" |
Additional notes
[HierarchicalIndexComparer]
ordering logic
In short, the Compare()
method...
[Int32] Compare([String] $x, [String] $y)
{
# ...
}
...is splitting each index ($x
and $y
) into [Int32]
components, comparing as many components as they have in common, sorting based on any components that are unequal, and if all common components are equal then sorting the shorter index first.
So, for example, when comparing "1.2.3"
to "1.2"
it breaks them up into "1", "2", "3"
and "1", "2"
, respectively, and then performs the following comparisons:
Component |
$x |
$y |
Comparison result |
0 |
1 |
1 |
Equal |
1 |
2 |
2 |
Equal |
Since all 2 components of $y
equal the first 2 components of $x
, it now falls to the index with less components sorting first; $x
has 3 components whereas $y
has 2, therefore $x
sorts after $y
, which is correct: "1.2.3" > "1.2"
.
When comparing "1.2.3"
to "1.20.4"
it breaks them up into "1", "2", "3"
and "1", "20", "4"
, respectively, and then performs the following comparisons:
Component |
$x |
$y |
Comparison result |
0 |
1 |
1 |
Equal |
1 |
2 |
20 |
$x component is less than $y component |
2 |
3 |
4 |
Not evaluated |
Since when comparing the first 3 components of $x
and $y
we find that component 1 of each is unequal, we return the result of that comparison; $x[1] < $y[1]
therefore $x < $y
, which is correct: "1.2.3 < 1.20.4"
.