
I have a function that either initializes itself and returns an ordered dictionary with initial values or, if a collection is provided as an argument, manipulates that collection. That collection is stored under a key in a parent collection.

In my actual code I am seeing odd behavior: a key in the initial collection is initially $null or has a specified value, but when I try to revise that value I do NOT get an error, and I also do not get a changed value. However, when I try to create a minimal working example to post here, it works correctly, both in the console and in the IDE.

So, given this code:

function Write-Data {
    param (
        [System.Collections.Specialized.OrderedDictionary]$Collection
    )

    foreach ($key in $Collection.Keys) {
        try {
            $type = $Collection.$key.GetType().FullName
        } catch {
            $type = 'NULL'
        }
        Write-Host "$key $type $($Collection.$key)"
    }
    Write-Host
}

function Manage-Data {
    param (
        [System.Collections.Specialized.OrderedDictionary]$Collection
    )

    if (-not $Collection) {
        [System.Collections.Specialized.OrderedDictionary]$initialCollection = [Ordered]@{
            initialNull = $null
            initialString = 'initial string'
        }
        return $initialCollection
    } else {
        $Collection.initialNull = 'No longer null'
        $Collection.initialString = 'New String'
    }
}

CLS
$parentContainer = New-Object System.Collections.Specialized.OrderedDictionary
$parentContainer.Add('data', (Manage-Data))

Write-Data $parentContainer.data
Manage-Data -Collection $parentContainer.data
Write-Data $parentContainer.data

Is there any obvious scenario where either of the lines revising values would not throw an error, but would also not change the value? For example, if there are actually more functions doing other things with that initialized collection object before the attempt to revise data? Or perhaps more generally, since I am depending on the default byReference behavior of complex objects, is there some situation where this behavior breaks down and I am effectively modifying a new complex object when I think I am modifying the original? Or is the fact that I am having problems with a simple data type within the complex type potentially the issue?

For what it is worth, the idea here is basically to be able to use Dependency Injection, but with functions rather than classes, and also to mimic to some extent the concept of a constructor and a method in a class, but again in a function. That has generally been working well, if a little messy, which has reinforced in my mind that I need to move to classes eventually. But this particular issue has me worried that I will see the same problem in classes, and that unless I can understand it now I will have issues. Since I can't seem to recreate the issue in a simplified example, I seem to be unable to figure anything out.

It occurs to me that one thing I haven't tried is to actually get the memory address of the collection I think I am modifying, or even of the individual key, so I can verify I actually am changing the same data that I initialized. But HOW to get the memory address of a variable escapes me, and is maybe not possible in PowerShell or .NET?

Gordon

1 Answer


"the default byReference behavior of complex objects" concerns the properties of the object, not the object itself.

The difference between (if):

[System.Collections.Specialized.OrderedDictionary]$initialCollection = [Ordered]@{
    initialNull = $null
    initialString = 'initial string'
}

and (else)

$Collection.initialNull = 'No longer null'
$Collection.initialString = 'New String'

is that the latter (else) statements do change the values seen by the parent, because $Collection refers to the same object as $parentContainer.data. The former (if) statement instead creates a new object and assigns it to $initialCollection in the scope of the Manage-Data function, where it isn't visible to the parent. (Even if you assigned it to $Collection, that would only point the function-local variable at a new object; the caller's reference would be unchanged.)
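To check this without a memory address, one option is `[object]::ReferenceEquals`, which reports whether two expressions refer to the same object. The following is a minimal sketch; `Update-Entry` and `$parent` are hypothetical names used only for illustration, not part of the question's code:

```powershell
function Update-Entry {
    param (
        [System.Collections.Specialized.OrderedDictionary]$Collection
    )

    # Mutating through the parameter changes the object the caller also holds.
    $Collection.initialString = 'New String'

    # Reassigning the parameter variable only rebinds the function-local variable;
    # the caller's reference still points at the original object.
    $Collection = [ordered]@{ initialString = 'local only' }
}

$parent = [ordered]@{ data = [ordered]@{ initialString = 'initial string' } }
$before = $parent.data

Update-Entry -Collection $parent.data

# Same object before and after the call?
$sameObject = [object]::ReferenceEquals($before, $parent.data)
$sameObject                  # True
$parent.data.initialString   # New String
```

If the same check returns False somewhere in the production code, that is the point where a different object has been swapped in.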

You might return $initialCollection, but then how do you handle the different returns (either an $initialCollection or an enumerable null) in your parent function? Therefore I would just return the collection in both conditions and reassign the object (the properties are still by reference, and only the $parentContainer.data reference will change/reset):

$parentContainer.data = Manage-Data -Collection $parentContainer.data
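A minimal sketch of a function shaped that way, assuming the same keys as in the question. `Update-Data` is a hypothetical name, and the explicit `$null -eq` test is a substitution for `-not $Collection` chosen for clarity:

```powershell
function Update-Data {
    param (
        [System.Collections.Specialized.OrderedDictionary]$Collection
    )

    if ($null -eq $Collection) {
        # Nothing passed in: create the initial collection.
        $Collection = [ordered]@{
            initialNull   = $null
            initialString = 'initial string'
        }
    } else {
        # Existing collection passed in: revise it in place.
        $Collection.initialNull = 'No longer null'
        $Collection.initialString = 'New String'
    }

    # Both branches return the collection, so the caller always does the same thing.
    return $Collection
}

$parentContainer = [ordered]@{}

# First call initializes, second call updates; the call site is identical.
$parentContainer.data = Update-Data -Collection $parentContainer.data
$parentContainer.data = Update-Data -Collection $parentContainer.data

$parentContainer.data.initialNull     # No longer null
$parentContainer.data.initialString   # New String
```

With this shape the caller no longer needs to know which branch ran.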

Potential problems
In other words, the potential issue in your Manage-Data function is that the parent function has to call it differently depending on the condition if (-not $Collection), which is actually evaluated inside the function. (What is the value of that condition, if the caller already needs to act differently on it?)
This leaves two pitfalls:

  • You call the function on the assumption that the argument is not a collection, but it actually is:

$parentContainer = [Ordered]@{ data = [Ordered]@{} }
$parentContainer.Add('data', (Manage-Data))

In this case you get an error:

MethodInvocationException: Exception calling "Add" with "2" argument(s): "Item has already been added. Key in dictionary: 'data' Key being added: 'data'"

  • And the opposite (which is less obvious): you call the function on the assumption that the argument is a collection, but it actually is not:

$parentContainer = [Ordered]@{}
Manage-Data $ParentContainer.Data

This will leave an unexpected object on the pipeline:
(See: PowerShell Pipeline Pollution)

Name                           Value
----                           -----
initialNull
initialString                  initial string

And it doesn't add anything to the $parentContainer object:

$parentContainer # doesn't return anything as it doesn't contain anything
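If Manage-Data is kept as it is, both pitfalls can be avoided on the caller's side by testing for the key before deciding how to call it. The following is only a sketch: `Sync-ParentData` is a hypothetical helper, and `Manage-Data` is copied in compact form from the question so that the block runs on its own:

```powershell
function Manage-Data {
    param (
        [System.Collections.Specialized.OrderedDictionary]$Collection
    )

    if (-not $Collection) {
        return [ordered]@{ initialNull = $null; initialString = 'initial string' }
    } else {
        $Collection.initialNull = 'No longer null'
        $Collection.initialString = 'New String'
    }
}

# Hypothetical helper: picks the right call form based on the parent's actual state.
function Sync-ParentData {
    param (
        [System.Collections.IDictionary]$Parent
    )

    if ($Parent.Contains('data')) {
        # Key exists: update in place; this branch returns nothing, so nothing leaks.
        Manage-Data -Collection $Parent['data']
    } else {
        # Key missing: capture the returned collection instead of leaving it on the pipeline.
        $Parent['data'] = Manage-Data
    }
}

$parentContainer = [ordered]@{}
Sync-ParentData -Parent $parentContainer   # initializes
$afterInit = $parentContainer.data.initialString

Sync-ParentData -Parent $parentContainer   # updates
$afterUpdate = $parentContainer.data.initialString
```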

Suggestions

  • See about scopes
  • Enable Set-StrictMode -Version Latest.
    This will show you that a property is potentially empty.
  • Use ([ref]$MyVar).Value = 'new value' to replace a value in a parent scope.
  • (not related to the question) Use the IDictionary interface: [Collections.IDictionary]$Collection to accept a more general collection type in your functions.
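
As an illustration of the `[ref]` suggestion, here is a minimal sketch in which a function replaces the object held by the caller's variable rather than mutating it. `Reset-Collection`, `$Target`, and `$data` are hypothetical names:

```powershell
function Reset-Collection {
    param (
        [ref]$Target
    )

    # Assigning to .Value replaces what the caller's variable refers to.
    $Target.Value = [ordered]@{ initialString = 'replaced' }
}

$data = [ordered]@{ initialString = 'initial string' }
$original = $data

Reset-Collection -Target ([ref]$data)

# $data now refers to a different object than before the call.
$replaced = -not [object]::ReferenceEquals($original, $data)
$replaced                # True
$data.initialString      # replaced
$original.initialString  # initial string
```

Without `[ref]`, an assignment to the parameter inside the function would leave `$data` untouched.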
iRon
  • So, what I THOUGHT was happening was that the initialization returns an OrderedDictionary, or more correctly a pointer to the memory location of that dictionary, which is saved in `$parentContainer.data`, which is then passed by reference to the function again, for use in the non-initialization aspect, and since it's by reference I would be directly modifying the single OrderedDictionary. I wanted this approach as I can then hand this OrderedDictionary to many functions that need to work with it. And I thought that if the variable is a by-reference type, then everything in it is by reference too. – Gordon Nov 27 '21 at 13:50
  • Good advice, but no copy of `$initialCollection` is being created. Instead, a new hashtable instance constructed from a literal is assigned to it. I'd also avoid the term _by reference_ in this context (except with respect to `[ref]`), because all normal parameter passing is _by value_ - even though in the case of .NET reference types that value happens to be an _object reference_. – mklement0 Nov 27 '21 at 20:21
  • As an aside: Good to see the link to the `$null` deep dive, but the term _empty null_ strikes me as an unfortunate name for `[System.Management.Automation.Internal.AutomationNull]::Value`. I've heard _Automation null_ being used (which is also unfortunate, but at least alludes to the class name). I myself have used _array-valued null_ / _collection null_ in the past, but was never quite happy with those either. I'm currently leaning toward _null enumerable_ or _enumerable null_. – mklement0 Nov 27 '21 at 20:48
  • @mklement0 I feel like I am still struggling with this concept. Am I correct in understanding that within the initial part of `Manage-Data` a new OrderedDictionary is created and the pointer to that structure is stored in `$initialCollection`? When that variable is returned and assigned to `$parentContainer.data` I now have another variable pointing to the same, single OrderedDictionary in memory. And if I then pass `$parentContainer.data` to the `-Collection` argument of `Manage-Data` I will again be working with the same single OrderedDictionary? – Gordon Nov 28 '21 at 09:03
  • @mklement0 My guess is I am exactly WRONG above, and I don't understand what is actually happening. Which is super frustrating since, at least in this minimal example, the results LOOK like that is what is happening, while in my production code it looks like something else is happening. Ugh. – Gordon Nov 28 '21 at 09:05
  • @Gordon, yes, your description in your penultimate comment is correct (except that it is an _object reference_ - the managed equivalent of an unmanaged _pointer_ - that is returned). Your production code must differ in some way, but without a reproducible example that is hard to diagnose. – mklement0 Nov 28 '21 at 14:40
  • @mklement0 Thanks for the verification. My code is a right mess, and I am sure somewhere in that mess is a single line that jacks everything up. Hopefully I can find that line at some point, just so I can sleep better. :) – Gordon Nov 28 '21 at 19:10
  • Gordon, @mklement0, I have updated the answer with an additional suggestion: enable [**`Set-StrictMode -Version Latest`**](https://learn.microsoft.com/powershell/module/microsoft.powershell.core/set-strictmode). This will show you that a property is potentially empty. – iRon Nov 29 '21 at 08:42