Please see this related question for some background information.

When I say "invalid reference" I mean a reference that points to no data.


Assume we have the following data structure containing cyclic references:

       +-----------------------------------------------------+
       |                                                     |
       +-->+============+    +==========+                    |
           [ Reference ----->[ Blessed  ]                    |
$parent -->+============+    [ Hash     ]                    |
                             [          ]   +==========+     |
                             [ children --->[ Array    ]     |
                             [          ]   [          ]     |
                             +==========+   [ 0: ---------+  |
                                            [          ]  |  |
                                            +==========+  |  |
                                                          |  |
       +--------------------------------------------------+  |
       |                                                     |
       +-->+============+    +==========+                    |
           [ Reference ----->[ Blessed  ]                    |
$child --->+============+    [ Hash     ]                    |
                             [          ]                    |
                             [ parent: ----------------------+
                             [          ]
                             +==========+

I understand that I can use Scalar::Util's weaken function to "weaken" references. But what happens if I weaken both the parent->child reference and the child->parent reference, and then either $child or $parent (but not the other) goes out of scope?

Example: $parent goes out of scope so the reference is gone.

       +-----------------------------------------------------+
       |                                                     |
       +-->+============+    +==========+                    |
           [ Reference ----->[ Blessed  ]                    |
           +============+    [ Hash     ]                    |
                             [          ]   +==========+     |
                             [ children --->[ Array    ]     |
                             [          ]   [          ]     |
                             +==========+   [ 0: ---------+  |
                                            [          ]  |  |
                                            +==========+  |  |
                                                          |  |
                 would this break the link? ------------> X  X
                                                          |  |
       +--------------------------------------------------+  |
       |                                                     |
       +-->+============+    +==========+                    |
           [ Reference ----->[ Blessed  ]                    |
$child --->+============+    [ Hash     ]                    |
                             [          ]                    |
                             [ parent: ----------------------+ <--- would this parent object pointer now be invalid?
                             [          ]
                             +==========+

If I did this and then $parent went out of scope, would the parent object be removed from memory because Perl's internal reference count for that object drops to 0? I ask because if $child still exists and needs some data from the parent object, the child object would now hold an invalid pointer to the parent.

brian d foy
tjwrona1992

2 Answers

I'll use the simpler structure created by the following code to answer your questions:

use Scalar::Util qw( weaken );

my $x = { };
my $y = $x;
weaken($y);

The following illustrates what this does step by step.

  1. my $x = { };

              +============+      +==========+
    $x -----> [ Reference ------->[ Hash     ]
              [ REFCNT=1   ]      [ REFCNT=1 ]
              +============+      [          ]
                                  +==========+
    
  2. my $y = $x;

              +============+      +==========+
    $x -----> [ Reference ------->[ Hash     ]
              [ REFCNT=1   ]  +-->[ REFCNT=2 ]
              +============+  |   [          ]
                              |   +==========+
              +============+  |
    $y -----> [ Reference ----+
              [ REFCNT=1   ]
              +============+
    
  3. weaken($y);

              +============+      +==========+
    $x -----> [ Reference ------->[ Hash     ]
              [ REFCNT=1   ]  +-->[ REFCNT=1 ]
              +============+  |   [ BACKREFS ---+
                              |   +==========+  |
              +============+  |                 |
    $y -----> [ Weak Ref -----+                 |
         +--> [ REFCNT=1   ]                    |
         |    +============+                    |
         +--------------------------------------+
    

    In addition to setting the WEAKREF flag in the reference, the referenced variable's reference count was lowered, and a backreference was created.
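
The three steps can be checked from Perl itself. The following is only a sketch: it assumes the core B module's svref_2object(...)->REFCNT as a way to read the hash's reference count and Scalar::Util's isweak to test the WEAKREF flag. The helper name refcnt is made up for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw( weaken isweak );
use B qw( svref_2object );

# Hypothetical helper: reference count of whatever $_[0] points to.
sub refcnt { return svref_2object($_[0])->REFCNT }

my $x = { };
my $after_step1 = refcnt($x);   # only $x refers to the hash

my $y = $x;
my $after_step2 = refcnt($x);   # $x and $y both hold strong references

weaken($y);
my $after_step3 = refcnt($x);   # the weak reference in $y no longer counts

print "step 1: $after_step1, step 2: $after_step2, step 3: $after_step3\n";
print "y is weak: ", (isweak($y) ? "yes" : "no"), "\n";
```

The counts printed for the three steps should line up with the hash's REFCNT values in the diagrams.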

Scenario 1

If $y goes out of scope or is set to a different value, the second reference's REFCNT will drop to zero, which will free the reference. This would normally drop the hash's reference count, except the freed reference was a weak reference. So it will simply remove itself from the list of backreferences instead.

          +============+      +==========+
$x -----> [ Reference ------->[ Hash     ]
          [ REFCNT=1   ]      [ REFCNT=1 ]
          +============+      [          ]
                              +==========+

Scenario 2

If $x goes out of scope or is set to a different value, the first reference's REFCNT will drop to zero, which will free the reference, which will drop the reference count of the hash to zero, which will cause the hash to be freed. As part of that, each backreferenced variable will be made undef.

          +============+
$y -----> [ Undefined  ]
          [ REFCNT=1   ]
          +============+

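At this point $y is plain undef, not a dangling pointer, so using it cannot cause a segmentation violation. Be aware that print("$y->{foo}\n"); does not croak: Perl autovivifies a new, empty hash into $y and prints an empty line (plus an "uninitialized value" warning if warnings are enabled). Check whether $y is defined before dereferencing it.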
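
Both scenarios can be reproduced with a few lines. This is a minimal sketch using plain hashes; the variable names other than $x and $y are made up for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw( weaken );

my $x = { foo => 'bar' };

# Scenario 1: the weak reference goes away first.
{
    my $y = $x;
    weaken($y);
}   # $y is gone; the hash is untouched
my $scenario1_value = $x->{foo};

# Scenario 2: the strong reference goes away first.
my $y = $x;
weaken($y);
undef $x;   # drops the only strong reference, so the hash is freed

my $y_is_defined = defined($y) ? 1 : 0;

# Guard every use of a weak reference with a defined() check.
if (defined $y) {
    print "foo is $y->{foo}\n";
}
else {
    print "the hash is gone\n";
}
```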

ikegami

It wouldn't be an "invalid reference", it would be undef. When the last non-weak reference to something goes out of scope (or is weakened), then all weak references to that something become undef.

But yes, if the parent has only weak references to its children, and the children have only a weak reference to their parent, then if the only strong reference $parent goes out of scope, any children you still have other references to will now have undef in their parent field.
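
The situation from the question, with both directions weakened, can be sketched as follows. Plain unblessed hashes stand in for the blessed objects in the diagram, and the name field is an assumption added only so there is something to print.

```perl
use strict;
use warnings;
use Scalar::Util qw( weaken );

my $child = { name => 'child' };
my $parent_name_inside;
{
    my $parent = { name => 'parent', children => [ $child ] };
    $child->{parent} = $parent;

    weaken($parent->{children}[0]);   # parent -> child is weak
    weaken($child->{parent});         # child -> parent is weak

    $parent_name_inside = $child->{parent}{name};
    print "inside: parent is $parent_name_inside\n";
}   # $parent, the only strong reference to the parent hash, is gone

my $parent_gone = defined($child->{parent}) ? 0 : 1;
print "outside: parent is ", ($parent_gone ? "undef" : "still alive"), "\n";
```

The child itself stays alive because $child is still a strong reference to it; only its parent field has been reset to undef.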

cjm
  • So then what is the best practice? Is it to make parents always have strong references to children and children always have weak references to the parent? I believe that would make it so if a child goes out of scope but the parent is still in scope, the child would still exist in memory. – tjwrona1992 Aug 14 '15 at 16:19
  • It depends on the exact structure you've got and the way you expect it to be used. I think the weak ref from child to parent is the most common, but there may be instances where the other direction would be better. – cjm Aug 14 '15 at 17:16
  • Funny . . . I was thinking to myself why would you ever do it the other way around, but as my problem unravels it seems like that might actually be the better way to go! – tjwrona1992 Aug 19 '15 at 15:27
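
The arrangement these comments call the most common one (strong references from parent to children, a weak reference from each child back to its parent) might look like the sketch below. The Node class and its new and add_child methods are hypothetical, invented for this illustration, and the package block syntax assumes Perl 5.14 or later.

```perl
use strict;
use warnings;
use Scalar::Util qw( weaken );

# Hypothetical Node class, made up for this sketch.
package Node {
    sub new {
        my ($class, %args) = @_;
        return bless { name => $args{name}, children => [] }, $class;
    }

    sub add_child {
        my ($self, $child) = @_;
        push @{ $self->{children} }, $child;      # strong: the parent owns the child
        $child->{parent} = $self;
        Scalar::Util::weaken($child->{parent});   # weak: the child does not own the parent
        return $child;
    }
}

my $parent = Node->new(name => 'parent');
{
    my $child = Node->new(name => 'child');
    $parent->add_child($child);
}   # $child is gone, but the parent still holds the child strongly

my $kept        = $parent->{children}[0];
my $child_name  = $kept->{name};
my $parent_name = $kept->{parent}{name};
print "$child_name belongs to $parent_name\n";

undef $parent;   # last strong reference to the parent; no cycle keeps it alive
my $parent_freed = defined($kept->{parent}) ? 0 : 1;
print "parent freed: ", ($parent_freed ? "yes" : "no"), "\n";
```

With this layout a child stays alive as long as its parent does, and dropping the parent frees the whole tree except for children that something else still references strongly.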