Why do Python objects have the same address across parent and child processes?

Question

I have an Agent class. I want objects of the agent class to be launched in a new process so I created an AgentLauncher class that extends multiprocessing.Process. The agent class looks like this:

class Agent(object):
  def __init__(self, env, id):
    self.env = env
    self.id = id
    self.x = 0

  def step(self):
    self.x += 1

  def __repr__(self):
    return ('[' + object.__repr__(self) + 
    '\nenv: ' + str(self.env) +  
    '\nid: ' +  str(self.id) +
    '\nx: ' + str(self.x) +
    ']')

while the launcher class is defined like this:

import os
from multiprocessing import Process
from time import sleep

class AgentLauncher(Process):
  def __init__(self, *args, **kwargs):
    super().__init__()
    self.agent = Agent(*args, **kwargs)
    print('pid: ',os.getpid())
    print('launcher: ', self)
    print('agent: ',self.agent)
    self.start()
    sleep(1)
    print('pid: ',os.getpid())
    print('agent: ', self.agent)
  def run(self):
    print("\n")
    print('pid: ',os.getpid())
    print('launcher: ',  self)
    print('agent: ',self.agent)
    self.agent.step()
    self.agent.env['d'] = 4
    print('agent: ',self.agent)
    print('\n')

All the print()'s are there for the sake of debugging.

When I create an object of AgentLauncher with:

launcher = AgentLauncher(env={'a':2,'b':3},id=0)

the __init__() method of AgentLauncher invokes start(). "It (start) arranges for the object’s run() method to be invoked in a separate process." as per the documentation.

I wanted to see which objects are being pickled and copied to the new process. So I added a bunch of print statements. Surprisingly, when printed, all objects (self, self.agent, self.agent.env, ...) seem to be at the same address in the parent and the child processes. Here's the output I see:

pid:  330
launcher:  <AgentLauncher name='AgentLauncher-1' parent=330 initial>
agent:  [<__main__.Agent object at 0x7f9d17f708e0>
env: {'a': 2, 'b': 3}
id: 0
x: 0]


pid:  336
launcher:  <AgentLauncher name='AgentLauncher-1' parent=330 started>
agent:  [<__main__.Agent object at 0x7f9d17f708e0>
env: {'a': 2, 'b': 3}
id: 0
x: 0]
agent:  [<__main__.Agent object at 0x7f9d17f708e0>
env: {'a': 2, 'b': 3, 'd': 4}
id: 0
x: 1]


pid:  330
agent:  [<__main__.Agent object at 0x7f9d17f708e0>
env: {'a': 2, 'b': 3}
id: 0
x: 0]

At first, I thought it had something to do with copy-on-write when the new process is forked. So I updated the env and x variables inside run. The changes don't get reflected back in the parent process so clearly the objects are copies of the parent process's objects.

I want to understand why the object addresses printed are the same across the two processes. I understand that these are not physical addresses but is there a way to print the 'true identity' of an object for the sake of debugging/demonstration?
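One way to check whether pickling is involved at all, as the question set out to do, is to hook the pickling protocol rather than compare addresses. The sketch below is only an illustration: TracedAgent and its pickled_in list are hypothetical names, not part of the original code. It records the PID of any process in which the object is pickled. With the fork start method, the child inherits a copy of the parent's memory and nothing is pickled, which would be consistent with the identical addresses in the output above; with spawn or forkserver, the Process object and its attributes are pickled and the hook would fire.

```python
import os
import pickle


class TracedAgent:
    """Hypothetical stand-in for Agent that records when it is pickled."""

    pickled_in = []  # PIDs of the processes in which pickling happened

    def __init__(self, env, id):
        self.env = env
        self.id = id
        self.x = 0

    def __getstate__(self):
        # Called by pickle when the object is serialized.
        TracedAgent.pickled_in.append(os.getpid())
        return self.__dict__.copy()


agent = TracedAgent(env={'a': 2, 'b': 3}, id=0)

# Round-trip through pickle, which is what a spawn-style start would do.
clone = pickle.loads(pickle.dumps(agent))

print('pickled in pids:', TracedAgent.pickled_in)
print('same object?', clone is agent)
print('equal env?', clone.env == agent.env)
```

Storing such an object on the launcher and checking pickled_in after start() should show whether the chosen start method serialized it.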

Remember that addresses don't map directly to physical memory locations. Just because two things in two different processes have address "0x7f9d17f708e0" does not mean they are pointing to the same area of memory. — larsks, Apr 28 '20 at 20:32
Because the virtual address space is the same, they aren't at the same physical address. — juanpa.arrivillaga, Apr 28 '20 at 20:32
What do you mean by "true identity"? That *is* the true identity, as far as the Python process is concerned. — juanpa.arrivillaga, Apr 28 '20 at 20:33
I understand that they aren't at the same physical address, @juanpa.arrivillaga. Suppose I want to demonstrate that they are distinct objects. Is there a way to do that other than manually appending the PID of the processes, which is ugly? — farhanhubble, Apr 28 '20 at 20:37
No. `id` is *simply a number guaranteed to be unique for a given object within a given process for the lifetime of the object* (the fact that it uses the address of the PyObject header *is a CPython implementation detail*). It would never show you that you have two processes to begin with. The most straightforward way to show that they are two different processes would be *to use the process id, of course*; that's what it's for. — juanpa.arrivillaga, Apr 28 '20 at 20:38
Sorry, I meant if I wanted to show that they are distinct objects. I fixed my comment above. — farhanhubble, Apr 28 '20 at 20:40
There probably is no better way. Using `id` to distinguish objects *within the same process* is tricky to begin with, let alone across different processes. Perhaps someone will have a better idea, but it seems like a totally sound approach to me — juanpa.arrivillaga, Apr 28 '20 at 20:41
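The approach discussed in these comments, pairing the process id with `id`, can be sketched as follows. This is a minimal illustration, not a definitive recipe: tagged_identity, child, and demo are hypothetical helper names, and the sketch assumes a POSIX system where the fork start method is available. Under fork, the `id` part of the pair would typically match in parent and child, because the child starts with a copy of the parent's virtual address space, while the PID part differs and the child's mutation never reaches the parent.

```python
import multiprocessing as mp
import os


def tagged_identity(obj):
    """Return (pid, id(obj)); id() alone is only unique within one process."""
    return (os.getpid(), id(obj))


def child(env, queue):
    # Runs in the child process: mutate the child's copy and report back.
    env['d'] = 4
    queue.put((tagged_identity(env), dict(env)))


def demo():
    env = {'a': 2, 'b': 3}
    ctx = mp.get_context('fork')  # assumption: fork is available (POSIX)
    queue = ctx.Queue()
    proc = ctx.Process(target=child, args=(env, queue))
    proc.start()
    child_identity, child_env = queue.get()  # read before join to avoid blocking
    proc.join()
    return tagged_identity(env), env, child_identity, child_env


if __name__ == '__main__':
    parent_identity, parent_env, child_identity, child_env = demo()
    print('parent:', parent_identity, parent_env)
    print('child: ', child_identity, child_env)
```

Comparing the two pairs rather than the bare `id` values makes it visible that the objects live in different processes, even when the address part is the same.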
