5

Take this simple Python code, which repeatedly runs the same match against a single re.compile instance. I noticed that even though I am matching the very same value every time, it seems to create two match objects and then alternates between them.

I wonder if anyone can explain the reason for this behavior:

  • Why does it create the second instance at all?
  • Why only two?
  • And why does it pick the other one each time, rather than a random one?

The interactive session:

>>> import re
>>>
>>> rec = re.compile(r"(?:[-a-z0-9]+\.)+[a-z]{2,6}(?:\s|$)")
>>>
>>> rec.match('www.example.com')
<_sre.SRE_Match object at 0x23cb238>
>>> rec.match('www.example.com')
<_sre.SRE_Match object at 0x23cb1d0>
>>> rec.match('www.example.com')
<_sre.SRE_Match object at 0x23cb238>
>>> rec.match('www.example.com')
<_sre.SRE_Match object at 0x23cb1d0>
>>> rec.match('www.example.com')
<_sre.SRE_Match object at 0x23cb238>
>>> rec.match('www.example.com')
<_sre.SRE_Match object at 0x23cb1d0>

Edit:

As @kimvais answered, the reason lies in _, which holds the result of the last expression evaluated in the interactive shell. If you print the result instead of letting the shell echo it, _ is not updated and the same address shows up every time.

>>> print rec.match('www.example.com')
<_sre.SRE_Match object at 0x23cb1d0>
>>> print rec.match('www.example.com')
<_sre.SRE_Match object at 0x23cb1d0>
>>> print rec.match('www.example.com')
<_sre.SRE_Match object at 0x23cb1d0>
>>> print rec.match('www.example.com')
<_sre.SRE_Match object at 0x23cb1d0>
>>> print rec.match('www.example.com')
<_sre.SRE_Match object at 0x23cb1d0>
>>> print rec.match('www.example.com')
<_sre.SRE_Match object at 0x23cb1d0>
Tzury Bar Yochay

2 Answers

8

My guess is that this has something to do with the return value being assigned to underscore (_) internally in the interactive Python shell. Since _ still points to <_sre.SRE_Match object at 0x23cb238> until the next rec.match call has completed, that memory location cannot be reused for the new object. Only once _ points to the new object can the old one be recycled.
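The effect can be reproduced outside the interactive shell. The following is a minimal sketch, assuming CPython (where id() is the object's memory address) and Python 3 syntax rather than the Python 2 session shown in the question. The function name addresses_with_underscore and the variable last are hypothetical; last merely stands in for the shell's _.

```python
import re

rec = re.compile(r"(?:[-a-z0-9]+\.)+[a-z]{2,6}(?:\s|$)")


def addresses_with_underscore(n):
    """Collect the id() of n match objects while always keeping the
    previous one alive, the way the shell's _ does."""
    last = None  # hypothetical stand-in for the interactive shell's _
    seen = []
    for _i in range(n):
        current = rec.match("www.example.com")
        seen.append(id(current))
        # Rebinding releases the older match object; only now can its
        # memory be handed out again for a later allocation.
        last = current
    return seen


if __name__ == "__main__":
    for address in addresses_with_underscore(6):
        print(hex(address))
```

Because the previous match object is still alive when the next one is created, two consecutive addresses can never be equal. Whether the allocator then alternates between exactly two addresses, as in the question, is an implementation detail and may differ between builds and platforms.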

Kimvais
  • Just what I was about to suggest. +1. Try putting the commands in a script and running that; then the allocator actually reuses the address (on my Linux box at least). – Fred Foo Feb 06 '12 at 12:58
  • That *sounds* like the right answer. I'll wait for some more references before marking it as the **correct** one. – Tzury Bar Yochay Feb 06 '12 at 13:00
1

What you are seeing is an implementation detail. You actually had six distinct <_sre.SRE_Match> objects.

Since you kept no explicit references to them, each one is freed once nothing refers to it any more (in CPython this happens through reference counting, as soon as the count drops to zero), allowing that same memory location to be reused.

The 0x23cb1d0 is essentially the memory location of the object, not a GUID.

Try assigning each result to its own variable, and you will see that since the objects are then not freed, a new memory location is used for each instance.
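As a rough illustration of that suggestion, here is a small sketch, again assuming CPython and Python 3 syntax. The names matches and addresses are hypothetical; a list is used in place of six separate variables, which keeps the objects alive in the same way.

```python
import re

rec = re.compile(r"(?:[-a-z0-9]+\.)+[a-z]{2,6}(?:\s|$)")

# Keep every match object alive by holding a reference to it in a list.
matches = [rec.match("www.example.com") for _ in range(6)]

# In CPython, id() is the memory address of the object.
addresses = [id(m) for m in matches]

for address in addresses:
    print(hex(address))

# Objects that are alive at the same time never share an id.
print(len(set(addresses)))  # 6
```

Once the list is deleted or goes out of scope, those addresses become available again, which is why the repr of a later match object may show one of them.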

gahooa