When I insert pandas Series into dataframe, all values become NaN

Question

I have a pandas series that looks like this:

>>> myseries
2012-01-01 15:20:00-05:00    2
2012-01-01 15:30:00-05:00    1
2012-01-01 15:40:00-05:00    0
...

I try to put it into a DataFrame like so:

>>> mydf = pd.DataFrame(myseries, columns=["myseries"], index = myseries.index)

and all the values become NaN for some reason:

>>> mydf
2012-01-01 15:20:00-05:00    NaN
2012-01-01 15:30:00-05:00    NaN
2012-01-01 15:40:00-05:00    NaN

I'm pretty confused. This seems like a really simple application. What am I doing wrong? By the way, replacing with pd.DataFrame(myseries.values, columns=...) fixes the problem, but why is it necessary? Thank you.

Can you post the df you are using? Initializing a DataFrame using a Series works for me. — Alex, Mar 09 '15 at 03:57
I can't post all the data if that's what you mean.. it's 200,000 rows. Its type is `` — user, Mar 09 '15 at 05:50
If you create the df without specifying the index, and then redefine the index, does it work? — cphlewis, Mar 09 '15 at 08:35
It does if I don't specify a column name, but I need to do that because the name needs to change. At that point, I must also define indexes to not get an empty dataframe. — user, Mar 09 '15 at 16:06
Possible duplicate of [Adding new column to existing DataFrame in Python pandas](http://stackoverflow.com/questions/12555323/adding-new-column-to-existing-dataframe-in-python-pandas) — thleo, Mar 30 '17 at 14:57

score 1 · Answer 1 · answered Mar 09 '15 at 04:50


Even simpler:

s = pd.Series([0,1,2,3], index=pd.date_range('2014-01-01', periods=4), name='s')
df = pd.DataFrame(s)
print(df)

yields

            s
2014-01-01  0
2014-01-02  1
2014-01-03  2
2014-01-04  3

Alexander

Yeah, I agree that works, but I guess I should have expanded a little. The initial column name of myseries needs to be dropped, and the new column name in my dataframe must be the name myseries. This is because I will subsequently fill in other columns as I calculate them. And once you specify columns=[], you seem to have to also specify index to get a non-null dataframe. – user Mar 09 '15 at 05:44

This is a simpler way to do it, but the OP's question is about why `mydf = pd.DataFrame(myseries, columns=["myseries"], index = myseries.index)` doesn't work – Alex Mar 09 '15 at 15:23
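One possible explanation, offered as a guess since the original data was never posted: the comments suggest the Series already carries a name that differs from "myseries". When a named Series is passed to pd.DataFrame together with columns=, pandas treats the Series as a column labelled with its own name and then selects the requested labels, so a label that does not match comes back as all NaN. The sketch below reproduces that under this assumption; the name "old_name" and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the real data: a Series that already has a name.
idx = pd.date_range("2012-01-01 15:20", periods=3, freq="10min")
myseries = pd.Series([2, 1, 0], index=idx, name="old_name")

# columns= selects by label. "myseries" does not match "old_name",
# so the resulting column contains only NaN.
mydf_nan = pd.DataFrame(myseries, columns=["myseries"], index=myseries.index)

# Renaming the Series first keeps the values and the index.
mydf_ok = myseries.rename("myseries").to_frame()

# .values hands over a bare NumPy array with no name attached,
# which is why the workaround from the question works.
mydf_values = pd.DataFrame(
    myseries.values, columns=["myseries"], index=myseries.index
)

print(mydf_nan)
print(mydf_ok)
print(mydf_values)
```

If this is the cause, printing `myseries.name` on the real data should show something other than None or "myseries". Answer 2 would then succeed only because its example Series has no name.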

score 0 · Answer 2 · answered Mar 09 '15 at 04:01


s = pd.Series([0,1,2,3], index=pd.date_range('2014-01-01', periods=4))
df = pd.DataFrame(s, columns=['s'], index=s.index)
print(df)

yields

            s
2014-01-01  0
2014-01-02  1
2014-01-03  2
2014-01-04  3

Alex

This works fine for me too, but my own data doesn't, unless I append the ".values". It's a mystery to me why. – user Mar 09 '15 at 05:47
What are the `dtypes` of the index and the values? – Alex Mar 09 '15 at 15:25
"dtype" is not allowed but "type" is ``. Values are `` and indexes are ``. – user Mar 09 '15 at 15:53
And, if something seems strange to you about that index, please see a workaround for an earlier line of the code I posted [here](http://stackoverflow.com/questions/28910231/failing-to-convert-pandas-dataframe-timestamp). – user Mar 09 '15 at 16:02
Not sure how to help unless you post some reproducible code. – Alex Mar 09 '15 at 17:52
I understand. The thing is, I can't reproduce it with a sample case either. And I can't post 3 GB of data. This example above uses a series with all the same characteristics as mine, as far as I can tell, but results in different behavior when entering a dataframe. I'm stumped. – user Mar 09 '15 at 18:04
What do you mean dtype is not allowed? What are the results from `print(myseries.dtype, myseries.index.dtype)`? – Alex Mar 09 '15 at 18:07
Ah, I was trying `dtype(myseries.values)`. For your line, I get `(dtype('float64'), dtype(' – user Mar 09 '15 at 18:09
