The problem is how the open-ended [pscustomobject] type behaves with respect to equality comparison and when used as hashtable keys:
[pscustomobject] is a .NET reference type (that doesn't define custom equality comparisons), so comparing two instances with -eq tests for reference equality, which means that only values that reference the very same instance are considered equal.[1]
Using [pscustomobject] instances as the keys of a hashtable is similarly unhelpful, because, as iRon points out, calling .GetHashCode() on a [pscustomobject] instance always yields the same value, irrespective of the instance's set of properties and values.[2] Arguably, this is a bug, as discussed in GitHub issue #15806.
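The following is a minimal sketch of what this means in practice for hashtable keys; the variable names are made up for illustration. Given the behavior described above, a lookup should only succeed with the very same instance that was used as the key, even though both objects report the same hash code:

```powershell
# Two objects with identical properties and values, but distinct instances.
$key1 = [pscustomobject] @{ foo = 1 }
$key2 = [pscustomobject] @{ foo = 1 }

$ht = @{}
$ht.Add($key1, 'first')

# Lookup via the very same instance that was added as the key.
$foundSameInstance = $ht.ContainsKey($key1)

# Lookup via an equal-looking, but distinct instance.
$foundOtherInstance = $ht.ContainsKey($key2)

# Both instances report the same hash code, so the hash code
# does nothing to tell them apart (or to establish value equality).
$sameHashCode = $key1.GetHashCode() -eq $key2.GetHashCode()

$foundSameInstance, $foundOtherInstance, $sameHashCode
```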
Solutions:
If you're willing to use (PSv5+) custom classes in lieu of [pscustomobject] instances, Santiago Squarzon's helpful answer offers a solution that relies on a custom class implementing the System.IEquatable<T> interface in order to support a custom, class-specific equality test - but note that since such a custom class compares specific, hard-coded properties, it isn't a general replacement for the open-ended [pscustomobject] type, whose instances can have arbitrary property sets.
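To illustrate the idea only (this is not the code from the linked answer), here is a rough sketch of such a class. The class name Person is hypothetical, the three properties are borrowed from the sample data further below, and IEquatable[object] is used on the assumption that a PowerShell class cannot name itself as the type argument in its own declaration:

```powershell
# Hypothetical class; compares three specific, hard-coded properties.
class Person : System.IEquatable[object] {
    [string] $prop1
    [string] $prop2
    [string] $prop3

    # Value equality: same type and case-sensitively equal property values.
    [bool] Equals([object] $other) {
        if ($other -isnot [Person]) { return $false }
        return $this.prop1 -ceq $other.prop1 -and
               $this.prop2 -ceq $other.prop2 -and
               $this.prop3 -ceq $other.prop3
    }

    # Objects that compare as equal must report the same hash code.
    [int] GetHashCode() {
        return "$($this.prop1)|$($this.prop2)|$($this.prop3)".GetHashCode()
    }
}

$p1 = [Person] @{ prop1 = 'bob'; prop2 = 'dude'; prop3 = 'awesome' }
$p2 = [Person] @{ prop1 = 'bob'; prop2 = 'dude'; prop3 = 'awesome' }
$p3 = [Person] @{ prop1 = 'ted'; prop2 = 'dude'; prop3 = 'middling' }

$equalByValue = $p1 -eq $p2   # distinct instances, same values
$notEqual     = $p1 -eq $p3   # different values
```

With such a class, -eq and hashtable lookups should compare by value, but only for the properties the class hard-codes.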
iRon's helpful answer provides a generic solution via a custom class that wraps a hashtable and uses the XML-serialized form of its [pscustomobject] entries as the entry keys (using the serialization format PowerShell uses for its remoting and background-job infrastructure), relying on the fact that distinct strings with the same content report the same hash code, via .GetHashCode(). This is probably the best overall solution, because it performs reasonably well while providing a generic comparison that is reasonably robust: it works reliably for value-type property values (as are typical in [pscustomobject] instances) and tests the properties of reference-type values for value equality, but the necessary limit on serialization depth means that it is at least possible for deeply nested objects with differing property values below the serialization depth to be considered the same - see this answer for more information on PowerShell's serialization and its limitations.
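The principle behind this approach can be sketched as follows; this is not iRon's class, just a demonstration (with made-up variable names) that the serialized forms of two distinct instances with the same content should be identical strings, and therefore report the same hash code:

```powershell
$a = [pscustomobject] @{ prop1 = 'bob'; prop2 = 'dude' }
$b = [pscustomobject] @{ prop1 = 'bob'; prop2 = 'dude' }   # same content, distinct instance
$c = [pscustomobject] @{ prop1 = 'ted'; prop2 = 'dude' }   # different content

# Serialize each object to the XML format used by PowerShell's remoting.
$xmlA = [System.Management.Automation.PSSerializer]::Serialize($a)
$xmlB = [System.Management.Automation.PSSerializer]::Serialize($b)
$xmlC = [System.Management.Automation.PSSerializer]::Serialize($c)

# Strings are compared by content, unlike the objects they were created from.
$sameContent      = $xmlA -ceq $xmlB
$sameHashCode     = $xmlA.GetHashCode() -eq $xmlB.GetHashCode()
$differentContent = $xmlA -cne $xmlC

# A second overload accepts an explicit serialization depth, which matters
# for nested objects; the value 3 is an arbitrary choice for illustration.
$xmlDeeper = [System.Management.Automation.PSSerializer]::Serialize($a, 3)
```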
Below is an ad-hoc solution based on iRon's answer that doesn't require defining custom classes, but it doesn't perform well.
# Available in PSv5+, to allow referencing the [System.Management.Automation.PSSerializer] type
# as just [PSSerializer]; in v4-, use the full type name.
using namespace System.Management.Automation

# Define a *list* rather than an array, because it is
# efficiently extensible
$list = [System.Collections.ArrayList] (
  [pscustomobject] @{prop1="bob"; prop2="dude"; prop3="awesome"},
  [pscustomobject] @{prop1="alice";prop2="dudette";prop3="awesome"}
)

# Conditionally add two objects to the list:
# One of them is a duplicate and will be ignored.
[pscustomobject]@{prop1="bob";prop2="dude";prop3="awesome"},
[pscustomobject]@{prop1="ted";prop2="dude";prop3="middling"} | ForEach-Object {
  if ($list.ForEach({ [PSSerializer]::Serialize($_) }) -cnotcontains [PSSerializer]::Serialize($_)) {
    $null = $list.Add($_)
  }
}
Note the use of the .ForEach() array method so as to (relatively) efficiently serialize each element of list $list, though note that, for every input object, it invariably involves creating a temporary collection of the same size, containing the element-specific serializations.
There are ways of optimizing the performance of this code, but if that is needed you may as well use iRon's solution.
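For what it's worth, one such optimization might look like the sketch below. It rests on the assumption that every addition to the list goes through this code; the helper variable $seen is hypothetical and caches the serialized form of each list element, so that each object is serialized only once:

```powershell
$list = [System.Collections.ArrayList] (
  [pscustomobject] @{prop1="bob"; prop2="dude"; prop3="awesome"},
  [pscustomobject] @{prop1="alice";prop2="dudette";prop3="awesome"}
)

# Hypothetical helper: the serialized forms of all elements currently in $list.
# A HashSet[string] compares its entries by content, case-sensitively.
$seen = [System.Collections.Generic.HashSet[string]]::new()
foreach ($obj in $list) {
  $null = $seen.Add([System.Management.Automation.PSSerializer]::Serialize($obj))
}

# Conditionally add two objects to the list; the duplicate is ignored.
[pscustomobject]@{prop1="bob";prop2="dude";prop3="awesome"},
[pscustomobject]@{prop1="ted";prop2="dude";prop3="middling"} | ForEach-Object {
  # .Add() returns $true only if the string wasn't in the set yet.
  if ($seen.Add([System.Management.Automation.PSSerializer]::Serialize($_))) {
    $null = $list.Add($_)
  }
}
```

The trade-off is that $seen and $list must be kept in sync manually, which is the kind of bookkeeping a dedicated class would encapsulate.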
[1] For instance, [pscustomobject] @{ foo=1 } -eq [pscustomobject] @{ foo=1 } yields $false, because two distinct instances are being compared; that they happen to have the same set of properties and values is irrelevant.
[2] For instance, the following prints the same value twice, despite providing two obviously different objects as input:
[pscustomobject] @{ foo=1 }, [pscustomobject] @{ bar=2 } | % GetHashCode
[3] For instance, ([pscustomobject]@{prop1="bob";prop2="dude";prop3="awesome"}).psbase.ToString() returns verbatim @{prop1=bob; prop2=dude; prop3=awesome}