I have a Series with integer entries, but also some null entries. It is represented as a Series with dtype=float64. I would like to convert it to a Series with dtype=object, where the integer entries are stored as Python ints and the null entries are stored as np.nans.

I have two attempts below. The first doesn't work, as the int is (unexpectedly?) still converted to a float. The second works as I would hope.

import numpy as np
import pandas as pd

s = pd.Series([1, np.nan])

s = s.astype(object)
i = s.notnull()
s[i] = s[i].astype(int)

type(s[0])

Above snippet returns float. :(

s = pd.Series([1, np.nan])

s = s.astype(object)
i = s.notnull()
s[i] = list(s[i].astype(int))

type(s[0])

Above snippet returns int. :)

Why does the first example not work, even though the Series has dtype=object? Converting to a list seems like a really weird hack to get this to work, but I couldn't find any other way to do it.

Is there a simpler way to do this in Pandas?

AJ Friend
  • 1
    I'm aware of the issue with `int` and `NaN`, and that `int`s typically get cast to `float` to handle missing values. I was hoping to get around it by having the dtype be `object`. The second example actually does give me what I want; I haven't removed any entries. The second example's result is equivalent to `pd.Series([1, np.nan], dtype=object)`. Edit: Parent I was responding to removed their comment. – AJ Friend Sep 07 '17 at 01:27
  • 1
    `s.loc[i] = s[i].astype(int)` also works. Not sure why though. – ayhan Sep 07 '17 at 01:29

1 Answer

Regarding whether or not there is a simpler way to do this in Pandas, as of version 0.24 (January 2019), you can use nullable integers in cases where you have Series with integer values and missing data:

In [120]: s.astype('Int64')
Out[120]:
0      1
1    NaN
dtype: Int64

In [121]: type(s.astype('Int64')[0])
Out[121]: numpy.int64

In [122]: type(s.astype('Int64')[1])
Out[122]: float
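
Note that `Int64` stores `numpy.int64` values rather than Python `int`s, so it does not give the `dtype=object` layout described in the question. If that layout is specifically what is needed, one possible approach is to build the values explicitly. This is only a sketch under that assumption; the names `s` and `obj` are illustrative and not part of any pandas API:

```python
import numpy as np
import pandas as pd

# Float Series with a missing value, as in the question.
s = pd.Series([1, np.nan])

# Convert each non-null entry to a Python int and keep nulls as np.nan.
# Passing dtype=object stops pandas from casting the result back to float64.
obj = pd.Series(
    [int(x) if pd.notna(x) else np.nan for x in s],
    index=s.index,
    dtype=object,
)

print(obj.dtype)          # object
print(type(obj.iloc[0]))  # <class 'int'>
print(type(obj.iloc[1]))  # <class 'float'>
```

Because the values are built as a plain Python list and the dtype is fixed at construction, this avoids the in-place assignment step that behaved unexpectedly in the question's first attempt.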
fuglede
  • 2
    @jpp the answers are not identical, the op has made an effort to tailor each answer. Instead you can close vote the questions as duplicates. You also only needed to post one comment, not replicate it under every question. It borders on harassment. –  Jan 27 '19 at 02:43
  • @YvetteColomb, Fair enough, I've close-voted these to a canonical which already deals with older versions, will ping fuglede to put a detailed answer on that post. Since this functionality is experimental, we *hope* API doesn't change and, if it does, the user will happily change their half-dozen posts. – jpp Jan 27 '19 at 03:46
  • @fuglede, I've close-voted several posts as duplicates of [this canonical](https://stackoverflow.com/questions/11548005/numpy-or-pandas-keeping-array-type-as-integer-while-having-a-nan-value). Please do post an answer on that duplicate target. In particular, [this one](https://stackoverflow.com/a/54380567/9209546) has good detail and should attract more attention on the canonical. – jpp Jan 27 '19 at 03:48
  • @jpp I'm not sure if you're aware, but posting any comment under a post gives the OP of that post a notification. Also if the API changes the answers do not necessarily need to be updated, there can be a new updated post made. The site is not set in stone. –  Jan 27 '19 at 05:10
  • @YvetteColomb, I'm aware. But I find it slightly odd (and counterproductive) to write a post on a new feature on every possible Q&A *except* the canonical which is the one people are going to reach/read. – jpp Jan 27 '19 at 10:42
  • I actually linked to [a different one as being canonical](https://stackoverflow.com/q/21287624/5085211) (but it looks like that comment was deleted?). That one already had a relevant answer but so did the one you mention. FWIW, those two do look like duplicates to me (where I'd still hold that the others didn't; for instance, this one asks two different questions, where IntegerNA is only really relevant for the second one). Anyway, the logic was that people who came across these other questions would also want to know about IntegerNA. Isn't that productive enough? (And if not, why not?) – fuglede Jan 27 '19 at 11:43