74

I have some data I'm trying to organize into a DataFrame in Pandas. I was trying to make each row a Series and append it to the DataFrame. I found a way to do it by appending each Series to an empty list and then converting the list of Series to a DataFrame.

e.g. DF = DataFrame([series1,series2],columns=series1.index)

This list-to-DataFrame step seems excessive. I've looked at a few examples on here, but none of them preserved the index labels of the Series so they could be used as column labels.

My long way, where columns are id_names and rows are type_names: [image]

Is it possible to append Series to rows of DataFrame without making a list first?

#!/usr/bin/python

DF = DataFrame()
for sample,data in D_sample_data.items():
    SR_row = pd.Series(data.D_key_value)
    DF.append(SR_row)
DF.head()

TypeError: Can only append a Series if ignore_index=True or if the Series has a name

Then I tried

DF = DataFrame()
for sample,data in D_sample_data.items():
    SR_row = pd.Series(data.D_key_value,name=sample)
    DF.append(SR_row)
DF.head()

Empty DataFrame

I also tried the approach from "Insert a row to pandas dataframe", but I'm still getting an empty DataFrame :/

I am trying to get the Series to be the rows, where the index of the Series becomes the column labels of the DataFrame

O.rka
  • I'm trying to add rows. The index of the Series should be the columns of the DataFrame. So rows would be samples and columns would be features. – O.rka Oct 13 '15 at 04:26
  • Did you try adding a name to the Series, as the error message suggests? – BrenBarn Oct 13 '15 at 04:29
  • You need to read the error message. It tells you to add a name to the Series, or use `ignore_index=True`. If you do either of those, it works fine. – BrenBarn Oct 13 '15 at 04:45
  • There is no error message, it just gives me an empty dataframe – O.rka Oct 13 '15 at 04:58

7 Answers

95

An easier way may be to add the pandas.Series to the pandas.DataFrame by passing the ignore_index=True argument to DataFrame.append(). Example -

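# Note: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0
DF = pd.DataFrame()
for sample, data in D_sample_data.items():
    SR_row = pd.Series(data.D_key_value)
    # append returns a new DataFrame, so assign the result back
    DF = DF.append(SR_row, ignore_index=True)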

Demo -

In [1]: import pandas as pd

In [2]: df = pd.DataFrame([[1,2],[3,4]],columns=['A','B'])

In [3]: df
Out[3]:
   A  B
0  1  2
1  3  4

In [5]: s = pd.Series([5,6],index=['A','B'])

In [6]: s
Out[6]:
A    5
B    6
dtype: int64

In [36]: df.append(s,ignore_index=True)
Out[36]:
   A  B
0  1  2
1  3  4
2  5  6

Another issue in your code is that DataFrame.append() does not work in place. It returns a new DataFrame with the row appended, so you need to assign the result back to your original variable for it to have any effect. Example -

DF = DF.append(SR_row,ignore_index=True)

To preserve the labels, you can keep your approach of giving each Series a name, and assign the appended DataFrame back to DF. The name of each Series then becomes its row label. Example -

DF = DataFrame()
for sample,data in D_sample_data.items():
    SR_row = pd.Series(data.D_key_value,name=sample)
    DF = DF.append(SR_row)
DF.head()
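
For pandas 2.0 and later, where DataFrame.append no longer exists, the loop above can be approximated with pd.concat. This is only a sketch: sample_data is a hypothetical stand-in for D_sample_data, with made-up sample and feature names.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data: sample name -> {feature: value}
sample_data = {
    "sample_a": {"feat_1": 1.0, "feat_2": 2.0},
    "sample_b": {"feat_1": 3.0, "feat_2": 4.0},
}

# One named Series per sample; the name will become the row label
rows = [pd.Series(values, name=sample) for sample, values in sample_data.items()]

# Concatenate side by side (one column per Series), then transpose
# so that samples are rows and the Series index becomes the columns
df = pd.concat(rows, axis=1).T
print(df)
```

Concatenating once at the end also avoids copying the whole DataFrame on every loop iteration.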
Anand S Kumar
  • I saw that on "Insert a row to pandas dataframe" link above. I'm trying to mess around with it. Maybe there is something that I'm not doing correctly. – O.rka Oct 13 '15 at 04:55
  • Ah man, thanks! I didn't catch the DF = DF.append(). That's way different than list appending. Sorry I missed that. – O.rka Oct 13 '15 at 05:10
  • I lost the Index labels. Is there any way to preserve this? – O.rka Oct 13 '15 at 05:12
  • You can use your `name` solution with `DF = DF.append(SR_row)`. Updated the answer with that example. – Anand S Kumar Oct 13 '15 at 05:12
  • **Warning**: df.append is now deprecated and you should try using pd.concat instead – David Davó Feb 28 '22 at 12:10
30

DataFrame.append does not modify the DataFrame in place. You need to do df = df.append(...) if you want to reassign it back to the original variable.
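
The same rule holds for pd.concat, which replaces append in current pandas: it returns a new object and leaves its inputs untouched. A minimal sketch with made-up data:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
new_row = pd.DataFrame({"A": [5], "B": [6]})

# The result is discarded here, so df still has two rows
pd.concat([df, new_row], ignore_index=True)
rows_without_assignment = len(df)

# Rebinding the name keeps the longer DataFrame
df = pd.concat([df, new_row], ignore_index=True)
rows_with_assignment = len(df)

print(rows_without_assignment, rows_with_assignment)
```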

jezrael
BrenBarn
  • This is a deviation from Python's normal behavior and is worthwhile to always keep in mind. – Adnan Y Jun 10 '20 at 00:30
  • Using `df.append` is deprecated since [pandas 1.4](https://pandas.pydata.org/docs/whatsnew/v1.4.0.html#deprecated-dataframe-append-and-series-append) and should be replaced by [`pd.concat`](https://pandas.pydata.org/docs/reference/api/pandas.concat.html) – ascripter Aug 29 '23 at 07:56
13

Something like this could work...

mydf.loc['newindex'] = myseries

Here is an example where I used it...

stats = df[['bp_prob', 'ICD9_prob', 'meds_prob', 'regex_prob']].describe()

stats
Out[32]: 
          bp_prob   ICD9_prob   meds_prob  regex_prob
count  171.000000  171.000000  171.000000  171.000000
mean     0.179946    0.059071    0.067020    0.126812
std      0.271546    0.142681    0.152560    0.207014
min      0.000000    0.000000    0.000000    0.000000
25%      0.000000    0.000000    0.000000    0.000000
50%      0.000000    0.000000    0.000000    0.013116
75%      0.309019    0.065248    0.066667    0.192954
max      1.000000    1.000000    1.000000    1.000000

medians = df[['bp_prob', 'ICD9_prob', 'meds_prob', 'regex_prob']].median()

stats.loc['median'] = medians

stats
Out[36]: 
           bp_prob   ICD9_prob   meds_prob  regex_prob
count   171.000000  171.000000  171.000000  171.000000
mean      0.179946    0.059071    0.067020    0.126812
std       0.271546    0.142681    0.152560    0.207014
min       0.000000    0.000000    0.000000    0.000000
25%       0.000000    0.000000    0.000000    0.000000
50%       0.000000    0.000000    0.000000    0.013116
75%       0.309019    0.065248    0.066667    0.192954
max       1.000000    1.000000    1.000000    1.000000
median    0.000000    0.000000    0.000000    0.013116
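
A smaller sketch of the same idea, using made-up labels, shows that assigning a Series to a new .loc label adds a row and matches values to columns by label rather than by position:

```python
import pandas as pd

df = pd.DataFrame({"A": [1], "B": [2]}, index=["s0"])

# The Series index is deliberately in a different order than the columns
new_row = pd.Series({"B": 4, "A": 3})

# "s1" is not an existing label, so this enlarges the DataFrame by one row
df.loc["s1"] = new_row
print(df)
```

Unlike append or pd.concat, this modifies the DataFrame in place, so no reassignment is needed.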
Selah
11

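DataFrame.append is deprecated (and was removed in pandas 2.0), so a good alternative is to convert the Series to a one-row DataFrame with to_frame().T and pass it to pd.concat: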

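import pandas as pd

df1 = pd.DataFrame({'name':['john','mark'],'job':['manager','salesman'],'age':[43,23]})
ser1 = df1.iloc[-1]  # last row as a Series
# to_frame().T turns the Series into a one-row DataFrame that pd.concat can stack
pd.concat([df1,ser1.to_frame().T],ignore_index=True)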

   name       job age
0  john   manager  43
1  mark  salesman  23
2  mark  salesman  23
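
If the row label matters, as in the question, a variation of this approach is to give the Series a name and leave out ignore_index. The sketch below uses made-up labels; the Series name becomes the index label of the new row.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 3], "B": [2, 4]}, index=["row0", "row1"])

# The name of the Series is what ends up as the row label
s = pd.Series({"A": 5, "B": 6}, name="row2")

# to_frame() gives one column named "row2"; .T turns it into one row
result = pd.concat([df, s.to_frame().T])
print(result)
```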
ferrabras
9

Convert the series to a dataframe and transpose it, then append normally.

srs = srs.to_frame().T
df = df.append(srs)
tmldwn
4

Try assigning a list of values to df.loc[len(df)]. See the example given below:

[Before image]

df.loc[len(df)] = ['Product 9',99,9.99,8.88,1.11]

df

[After image]
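
Since the screenshots are missing, here is a rough text version of the idea. The column names and values are hypothetical, not the ones from the original images.

```python
import pandas as pd

# Hypothetical columns; the original screenshots are not available
df = pd.DataFrame({"product": ["Product 7", "Product 8"], "price": [7.77, 8.88]})

# With the default 0..n-1 index, label len(df) does not exist yet, so a row is added
df.loc[len(df)] = ["Product 9", 9.99]

# Caveat: with a non-default index, len(df) may already be a label
shifted = pd.DataFrame({"product": ["X", "Y"], "price": [1.0, 2.0]}, index=[1, 2])
shifted.loc[len(shifted)] = ["Z", 3.0]  # label 2 exists, so row "Y" is overwritten

print(df)
print(shifted)
```

The second half shows why this idiom is only safe when the DataFrame has a default integer index.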

Stephen Rauch
1

This would work as well:

df = pd.DataFrame()
new_line = pd.Series({'A2M': 4.059, 'A2ML1': 4.28}, name='HCC1419')
df = df.append(new_line, ignore_index=False)

The name of the Series becomes the index label in the DataFrame. ignore_index=False (the default) is the important flag in this case, because ignore_index=True would discard that label.
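
When the values already sit in a dict keyed by sample name, as they appear to in the question, another option is to skip the intermediate Series entirely with DataFrame.from_dict. In this sketch the HCC1419 values come from the answer above, while the second sample and its values are made up for illustration.

```python
import pandas as pd

# Outer keys become row labels, inner keys become column labels
nested = {
    "HCC1419": {"A2M": 4.059, "A2ML1": 4.28},
    "HCC1500": {"A2M": 3.5, "A2ML1": 2.75},  # hypothetical second sample
}

df = pd.DataFrame.from_dict(nested, orient="index")
print(df)
```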