
Pretty self-explanatory (I'm on Windows):

>>> import sys, numpy
>>> a = numpy.int_(sys.maxint)
>>> int(a).__class__
<type 'long'>
>>> int(int(a)).__class__
<type 'int'>

Why does calling int once give me a long, whereas calling it twice gives me an int?

Is this a bug or a feature?

user541686
  • Also gives me an int with Python 2.7.6 and numpy 1.8.2 on Ubuntu 14.04 LTS and Python 2.7.12 numpy 1.11.0 on Ubuntu 16.04 LTS. – marcolz May 24 '17 at 10:29
  • The results I mentioned were on Linux. I just tested on Windows with Numpy 1.12.1 and I do reproduce your results. For 2147483646 it gives an `int`, but for 2147483647 it's a `long`. – interjay May 24 '17 at 10:31
  • Related: [python handles long ints differently on Windows and Unix](https://stackoverflow.com/q/22513445/846892) – Ashwini Chaudhary May 24 '17 at 10:34
  • I get the same off-by-one error on Linux with `numpy.int64(9223372036854775807)`, which is converted to `long` while 1 less than that is converted to `int`. – interjay May 24 '17 at 10:38
  • @AshwiniChaudhary: Not really related to the question; see my edit. – user541686 May 24 '17 at 10:39
  • Explore `a.item()` and `int(sys.maxint)`. – hpaulj May 24 '17 at 10:59
  • @hpaulj: Huh, I didn't even know `item()` exists. Thanks for the tip! – user541686 May 24 '17 at 11:01

2 Answers


As proposed in the (now-deleted) other answer, this does seem to be a bug due to an incorrect use of < instead of <=, but it's not coming from the code cited in the other answer. That code is part of the printing logic, which isn't involved here.

I believe the code that handles int calls on NumPy scalars is generated from a template in numpy/core/src/umath/scalarmath.c.src, the relevant part for signed integer dtypes being

    if(LONG_MIN < x && x < LONG_MAX)
        return PyInt_FromLong(x);
#endif
    return @func@(x);

For integers strictly between LONG_MIN and LONG_MAX, this code produces an int. For an integer equal to LONG_MAX (or LONG_MIN), it falls through to return @func@(x);, where the template engine substitutes @func@ with an appropriate function from the PyLong_From* family.

Thus, calling int on a NumPy int with value LONG_MAX produces a long, but since the result is representable as an int, calling int on the result again produces an int.
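
The effect of the comparison can be modelled in a few lines of plain Python. This is only a sketch, not NumPy's code: LONG_MIN and LONG_MAX are hard-coded to the 32-bit values of a C long on Windows (an assumption), and the helper names buggy_kind and fixed_kind are hypothetical. Each function returns the name of the Python 2 type that would be produced.

```python
# Assumed 32-bit C long, as on Windows; most 64-bit Linux builds use 2**63.
LONG_MAX = 2**31 - 1
LONG_MIN = -2**31


def buggy_kind(x):
    """Model of the strict comparison quoted above."""
    if LONG_MIN < x < LONG_MAX:
        return "int"   # PyInt_FromLong branch
    return "long"      # fallback branch: return @func@(x)


def fixed_kind(x):
    """Same check with inclusive bounds."""
    if LONG_MIN <= x <= LONG_MAX:
        return "int"
    return "long"


for value in (LONG_MAX - 1, LONG_MAX, LONG_MIN):
    print(value, buggy_kind(value), fixed_kind(value))
```

Only the two boundary values differ between the two helpers, which matches the behaviour reported in the comments for 2147483646 versus 2147483647.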

user2357112

This question is specific to Numpy and Python 2. In Python 3 there are no separate int and long types.

The behaviour is due to an off-by-one error in NumPy. int(x) with one argument converts x to a number by calling PyNumber_Int(x). PyNumber_Int then takes the path for int subclasses, because the scalar type returned by numpy.int_ (a C long: int32 on Windows, int64 on most 64-bit Linux builds) is a subclass of int:

m = o->ob_type->tp_as_number;
if (m && m->nb_int) { /* This should include subclasses of int */
    /* Classic classes always take this branch. */
    PyObject *res = m->nb_int(o);
    if (res && (!PyInt_Check(res) && !PyLong_Check(res))) {
        PyErr_Format(PyExc_TypeError,
                     "__int__ returned non-int (type %.200s)",
                     res->ob_type->tp_name);
        Py_DECREF(res);
        return NULL;
    }
    return res;
}
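
The dispatch in the C excerpt above can be observed from Python itself: int(obj) calls the object's __int__ method (the nb_int slot) and rejects a result that is not an integer. The sketch below runs on Python 3, where int and long are unified, so it illustrates only the dispatch and the type check, not the int/long split. The class names Wrapped and Broken are hypothetical.

```python
class Wrapped:
    """Hypothetical stand-in for a scalar type that defines __int__."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        # int(obj) ends up here via the nb_int slot.
        return self.value


class Broken:
    """__int__ deliberately returns a non-integer."""

    def __int__(self):
        return "not a number"


print(int(Wrapped(7)))

try:
    int(Broken())
except TypeError as exc:
    # Corresponds to the PyInt_Check / PyLong_Check test in the C code.
    print("TypeError:", exc)
```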

Now, for a, PyNumber_Int calls a->ob_type->tp_as_number->nb_int, which is implemented in numpy/core/src/umath/scalarmath.c.src. That file holds code that is parametrized over the different types; the relevant part is the <typename>_int method used to fill the nb_int slot. It contains the following off-by-one error:

if(LONG_MIN < x && x < LONG_MAX)
    return PyInt_FromLong(x);

Both operators should be <= instead. With <, neither LONG_MIN nor LONG_MAX passes the condition, so they are instead converted into a PyLong at line 1432:

return @func@(x);

with @func@ being replaced by PyLong_FromLongLong in the case of int_. Thus, long(sys.maxint) is returned.

Since sys.maxint is still representable as an int, int(long(sys.maxint)) returns an int; likewise, int(sys.maxint + 1) returns a long.
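
A rough model of the normalization that Python 2's int() applies to a long, written in Python 3 for illustration. MAXINT is assumed to be the 32-bit sys.maxint seen on Windows, and int_of_long is a hypothetical helper name; it returns the name of the resulting Python 2 type.

```python
MAXINT = 2**31 - 1  # assumed value of sys.maxint on Windows


def int_of_long(value):
    """Type name Python 2's int() would produce for a long with this value."""
    if -MAXINT - 1 <= value <= MAXINT:
        return "int"
    return "long"


print(int_of_long(MAXINT))      # models int(long(sys.maxint))
print(int_of_long(MAXINT + 1))  # models int(sys.maxint + 1)
```

Unlike the NumPy check, the bounds here are inclusive, which is why the second int() call in the question undoes the unexpected long.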

user541686