I cannot figure out why a simple function:
def to_integer(value):
    if value == "":
        return None
    return int(value)
changes values from str to int only if there's no empty string "" in the dataframe, i.e. only if no value is to be returned as None.
If I run:
type(to_integer('1')) == int
it returns True.
Now, using apply and to_integer with df1:
import pandas as pd
df1 = pd.DataFrame(['1', '2', '3'], columns=['integer'])
result = df1['integer'].apply(to_integer)
gives a column of integers (np.int64).
But if I apply it to this df2:
df2 = pd.DataFrame(['1', '', '3'], columns=['integer'])
result = df2['integer'].apply(to_integer)
it returns a column of floats (np.float64).
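For convenience, here is a self-contained sketch that puts both cases side by side. The names result1 and result2 are only introduced here for illustration. The script prints the dtypes instead of hard-coding them, since the dtype of the second result is what I observe on my setup and may well differ between pandas versions.

```python
import pandas as pd


def to_integer(value):
    # Empty strings become None; everything else is parsed as an int.
    if value == "":
        return None
    return int(value)


# Case 1: no empty strings, so every returned value is an int.
df1 = pd.DataFrame(['1', '2', '3'], columns=['integer'])
result1 = df1['integer'].apply(to_integer)

# Case 2: one empty string, so one returned value is None.
df2 = pd.DataFrame(['1', '', '3'], columns=['integer'])
result2 = df2['integer'].apply(to_integer)

# Inspect the dtypes rather than assuming them.
print(result1.dtype)
print(result2.dtype)
print(result2.tolist())
```

The only difference between the two cases is the single empty string in df2, so whatever changes the dtype has to be tied to that one None.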
Isn't it possible to have a dataframe with integers and None at the same time?
I use Python 3.3 and Pandas 0.12.