
Background: I am trying to build a y column that is 1 when there is an Exit Date, 0 when there is an Investment Date but no Exit Date, and -1 when both are missing. I need to do this for 88k rows and want to watch the progress as it runs.

Here's my code:

for i in range(0,len(merged_port_cos_firms)):
    inv = merged_port_cos_firms.iloc[i]['Investment_Date']
    exi = merged_port_cos_firms.iloc[i]['Exit_Date']
    print(i, len(exits), inv, exi, end = ' | ')
    if not pd.isnull(exi):
        exits.append(1)
        print('Exited')
    elif not pd.isnull(inv):
        exits.append(0)
        print('Not Exited')
    else:
        exits.append(-1)
        print('Missing')
    if i  is not len(exits)-1:
        print('Mismatch : ',i,len(exits)-1)
#         break

And the output at around 257:

251 251 NaT NaT | Missing
252 252 NaT NaT | Missing
253 253 NaT NaT | Missing
254 254 NaT 2002-11-01 00:00:00 | Exited
255 255 NaT NaT | Missing
256 256 NaT NaT | Missing
257 257 NaT NaT | Missing
Mismatch :  257 257
258 258 NaT NaT | Missing
Mismatch :  258 258 

All the other branches run many times before this point. I need the exits for all 88k rows, so I am running the code without the break. From row 257 onward it always enters the final if block, even though the two values it prints are equal. I don't understand why this is happening.

abhishah901
  • Have you tried using `!=` instead of `is not`? You really shouldn't be using `is` to compare integers (or strings, or lists, or really anything unless you know exactly what you're doing), and it's possible it's failing on sufficiently large inputs for whatever reason. – Green Cloak Guy Nov 01 '19 at 04:34
  • @GreenCloakGuy I'll try `!=`. However, why would it work for the rows up to 256? I'll have to look up what you mean by "failing on sufficiently large inputs". – abhishah901 Nov 01 '19 at 04:52
  • The fact that an 8-bit number corresponds to 256 possibilities is probably relevant, but the real answer would be in the source of the Python implementation, and you probably don't want to go in there. Just don't use `is` for comparing ints. – Samwise Nov 01 '19 at 05:09
  • @abhishah901 my guess is that your python would be substituting literals in for values less than 256, but when they exceed 256 would be allocating an actual variable, at which point the behavior of `is` would change (because `is` is about object identity, not about value). Even if I'm right, and I have no evidence for that, it would almost certainly be implementation-specific behavior that might not be replicable on other operating systems or versions of python or whatever. – Green Cloak Guy Nov 01 '19 at 18:18
  • Possible duplicate of [What's with the integer cache maintained by the interpreter?](https://stackoverflow.com/questions/15171695/whats-with-the-integer-cache-maintained-by-the-interpreter) – snakecharmerb Nov 02 '19 at 09:57
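
The behaviour discussed in the comments can be illustrated with a minimal sketch, assuming CPython, which keeps pre-allocated int objects for -5 through 256 (the subject of the linked question). `fresh_int` and `find_mismatches` are hypothetical helpers written for this illustration, not part of the original code. The identity results are implementation details and may differ on other interpreters, so no output is claimed for them beyond what CPython usually does.

```python
# Sketch: `is` compares object identity, `==` / `!=` compare values.
# Assumption: CPython, which caches int objects in the range -5..256.


def fresh_int(text):
    """Build an int at runtime so the compiler cannot share one constant."""
    return int(text)


small_a, small_b = fresh_int("256"), fresh_int("256")
large_a, large_b = fresh_int("257"), fresh_int("257")

# Value comparison is reliable for any size of integer.
print(small_a == small_b)  # True
print(large_a == large_b)  # True

# Identity comparison depends on interpreter internals.
print(small_a is small_b)  # usually True on CPython (same cached object)
print(large_a is large_b)  # usually False on CPython (two separate objects)


def find_mismatches(n):
    """Repeat the loop's consistency check, but with `!=` instead of `is not`."""
    exits = []
    mismatches = []
    for i in range(n):
        exits.append(-1)  # stand-in for the 1 / 0 / -1 label
        if i != len(exits) - 1:
            mismatches.append(i)
    return mismatches


print(find_mismatches(1000))  # [] : the index and the list length stay in step
```

In the loop from the question, `i` and `len(exits)-1` are both 257 at row 257, but they are two distinct int objects, so `is not` reports a mismatch that `!=` would not.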

0 Answers