Good alternative to Pandas .append() method, now that it is being deprecated?

Question

I use the following method a lot to append a single row to a dataframe. One thing I really like about it is that it allows you to append a simple dict object. For example:

# Creating an empty dataframe
df = pd.DataFrame(columns=['a', 'b'])

# Appending a row
df = df.append({ 'a': 1, 'b': 2 }, ignore_index=True)

Again, what I like most about this is that the code is very clean and requires very few lines. Now I suppose the recommended alternative is:

# Create the new row as its own dataframe
df_new_row = pd.DataFrame({ 'a': [1], 'b': [2] })
df = pd.concat([df, df_new_row])

So what was one line of code before is now two lines with a throwaway variable and extra cruft where I create the new dataframe. :( Is there a good way to do this that just uses a dict like I have in the past (that is not deprecated)?

[pandas issue 35407](https://github.com/pandas-dev/pandas/issues/35407) explains that `df.append` was deprecated because: "Series.append and DataFrame.append [are] making an analogy to list.append, but it's a poor analogy since the behavior isn't (and can't be) in place. The data for the index and values needs to be copied to create the result." — Paul Rougieux, Feb 10 '22 at 12:15
Came across this warning today. However when I used concat as the alternative I got "cannot concatenate object of type ''; only Series and DataFrame objs are valid". So frustrating..... — Ben Watson, Feb 13 '22 at 23:16

score 60 · Accepted Answer · answered Jan 24 '22 at 16:57

If you need to build the data row by row, collect your dictionaries in a list and create the dataframe once at the end with df = pd.DataFrame.from_records(your_list). A list's "append" method is very efficient and will never be deprecated. Dataframes, on the other hand, often have to be recreated, with all data copied over, on every append because of their design. That is why the method was deprecated.
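
A minimal sketch of this pattern, assuming the rows arrive one at a time inside a loop (the `rows` list and the sample values are made up for illustration):

```python
import pandas as pd

# Collect each row as a plain dict; list.append is cheap and in place.
rows = []
for i in range(3):
    rows.append({'a': i, 'b': i * 10})

# Build the dataframe once, after all rows have been collected.
df = pd.DataFrame.from_records(rows)
print(df)
```

The dataframe is created a single time, so no intermediate copies are made while the rows are being gathered.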

jsbueno

How do you know that it is deprecated? At https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.append.html#pandas.DataFrame.append (which currently shows version 1.4.0) I don't see any mention about that. Even at the dev tree I don't see any deprecation warning: https://pandas.pydata.org/docs/dev/reference/api/pandas.DataFrame.append.html – zby Feb 02 '22 at 10:55
I agree, though when you use the append method (with 1.4.0) you run into a "FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead". You will find the details on the ["what's new" page](https://pandas.pydata.org/docs/dev/whatsnew/v1.4.0.html#deprecated-frame-append-and-series-append) – tgrandje Feb 04 '22 at 13:20
@zby the update to the documentation is being dealt with in this pull request: https://github.com/pandas-dev/pandas/pull/45587 – Paul Rougieux Feb 10 '22 at 12:21
This made my code ten times faster, thanks so much, man – kush May 07 '22 at 13:37
That is actually the reason they are deprecating `df.append`. Thank the Pandas maintainers for that. Still, the "new way to do it" should be more prominent in their docs, for sure. – jsbueno May 07 '22 at 16:50
Creating a dataframe from a huge list took too much time compared to getting the data and appending it to the dataframe in chunks. However, the pd.concat method below worked just fine – Alex Punnen Jun 28 '22 at 05:30
I heard append() will insert dictionaries into their correct corresponding dataframe columns for an existing dataframe with named columns. Given that, will from_records() be able to do the same? – starmandeluxe Jul 30 '22 at 09:18
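
On the column question in the last comment: as far as I can tell, from_records matches dict keys to column names rather than relying on position, and pd.concat then aligns on column names as well. A small sketch with made-up data (the names `existing` and `rows` are illustrative only):

```python
import pandas as pd

existing = pd.DataFrame({'a': [0], 'b': [0]})

# Keys are in a different order, and the second dict has no 'b' at all.
rows = [{'b': 2, 'a': 1}, {'a': 3}]
new_rows = pd.DataFrame.from_records(rows)

# concat aligns on column names; the missing 'b' becomes NaN.
result = pd.concat([existing, new_rows], ignore_index=True)
print(result)
```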

Rafael Gaitan · Answer 2 · 2022-03-08T13:51:39.613

50

I also like the append method. But you can do it in one line with a list of dicts:

df = pd.concat([df, pd.DataFrame.from_records([{ 'a': 1, 'b': 2 }])])

or using loc and tuples for values on DataFrames with incremental ascending indexes:

df.loc[len(df), ['a','b']] = 1, 2

or maybe

df.loc[len(df), df.columns] = 3, 4
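
A hedged sketch of the loc approach and its main caveat: df.loc[len(df)] only appends when the index is the default 0..n-1 range. If a row labelled len(df) already exists, it is overwritten instead. The frames below are made-up examples:

```python
import pandas as pd

# Default RangeIndex: label len(df) does not exist yet, so a row is added.
df = pd.DataFrame({'a': [1], 'b': [2]})
df.loc[len(df)] = [3, 4]

# Custom integer index: label 2 already exists, so that row is overwritten.
other = pd.DataFrame({'a': [10, 20]}, index=[1, 2])
other.loc[len(other)] = [99]

print(df)
print(other)
```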

edited Mar 08 '22 at 13:51

answered Mar 08 '22 at 13:45

You can also use ignore_index `df = pd.concat([df, pd.DataFrame.from_records([{ 'a': 1, 'b': 2 }])], ignore_index=True)` – Rafael Gaitan Apr 22 '22 at 18:50
I get "first argument must be an iterable of pandas objects, you passed an object of type "DataFrame"". What can I do to solve this? – Helen Kapatsa Nov 28 '22 at 20:07
The arguments are actually inside a list. So the first argument is a list of pandas objects. First item is the original df, second item is the new df generated from records. – Rafael Gaitan Nov 29 '22 at 20:47
I guess this is the best answer out here, but intuitively, this feels wrong because the old dataframe is still overwritten with the new one. Appending in a sense should have greater data security. – Rivered Dec 04 '22 at 22:36

score 30 · Answer 3 · edited Apr 15 '23 at 13:17

If you want to use concat instead:

append

outputxlsx = outputxlsx.append(df, ignore_index=True)

concat

outputxlsx = pd.concat([outputxlsx, df], ignore_index=True)
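
To see what ignore_index changes, here is a small sketch with made-up frames (`left` and `right` are illustrative names). Without it, concat keeps each frame's original index labels, which can leave duplicates:

```python
import pandas as pd

left = pd.DataFrame({'a': [1, 2]})
right = pd.DataFrame({'a': [3]})

# Original labels are kept, so label 0 appears twice.
kept = pd.concat([left, right])

# Labels are renumbered from 0, as append(..., ignore_index=True) did.
renumbered = pd.concat([left, right], ignore_index=True)

print(kept.index.tolist(), renumbered.index.tolist())
```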

beltalowda · answered Jun 02 '22 at 19:55 · edited by afroditi

`outputxlsx = pd.concat([outputxlsx, df])` is enough since `df` is a data frame. – Paul Rougieux Jun 14 '22 at 19:09

Nico · Answer 4 · 2022-09-15T10:17:55.707

I was facing a similar issue. The other solutions weren't really working for me. I'm leaving this answer here as an additional way to deal with the issue, since this is the first Google result for certain searches and I myself have ended up here at least twice.

In my case the data is not a dict but just a list of values for a known set of parameters. I want to add the parameter values to a dataframe as rows because this way I can access a series of all the values for one parameter via df[parameter].

I start with an empty DataFrame:

parameters = ['a', 'b', 'c', 'd', 'e', 'f']
df = pd.DataFrame(columns=parameters)

df:

        a   b   c   d   e   f

With append I could add rows very conveniently, like so:

new_row = pd.Series([1,2,3,4,5,6], index=parameters, name='row1')
df = df.append(new_row)

df:

        a   b   c   d   e   f
row1    1   2   3   4   5   6

With pd.concat I found this to deliver the same result in a very similar way:

new_row = pd.DataFrame([1,2,3,4,5,6], columns=['row1'], index=parameters).T
df = pd.concat((df, new_row))

The key was to create an (n,1) dataframe from the 1d data and then transpose it into a (1,n) row that matches the other dataframe.
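
If the row already exists as a named Series, as in the append version above, Series.to_frame() followed by .T is another way to get the same one-row dataframe. A sketch, assuming a dataframe that already holds one row (`row0` and its values are made up):

```python
import pandas as pd

parameters = ['a', 'b', 'c', 'd', 'e', 'f']
df = pd.DataFrame([[0, 0, 0, 0, 0, 0]], columns=parameters, index=['row0'])

new_row = pd.Series([1, 2, 3, 4, 5, 6], index=parameters, name='row1')

# to_frame() gives an (n, 1) column named 'row1'; .T turns it into a (1, n) row.
df = pd.concat([df, new_row.to_frame().T])
print(df)
```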

Or you could create a dictionary out of your list: `new = {k: v for k, v in zip(parameters, [1,2,3,4,5,6])}` then `df = pd.concat([df, pd.DataFrame(new, index=['row1'])])` works — Omar AlSuwaidi, Sep 21 '22 at 06:30

score 7 · Answer 5 · answered Aug 31 '22 at 12:46

For those, like me, who want a descriptive function rather than lots of one-liners, here is an option based on @Rafael Gaitan above.

import pandas as pd

def appendDictToDF(df, dictToAppend):
    # Wrap the dict in a list so from_records builds a one-row dataframe
    df = pd.concat([df, pd.DataFrame.from_records([dictToAppend])])
    return df

# Creating an empty dataframe
df = pd.DataFrame(columns=['a', 'b'])

# Appending a row
df = appendDictToDF(df, { 'a': 1, 'b': 2 })
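
A possible variant of this helper, as a sketch only: `append_rows` is a hypothetical name, and it takes a list of dicts so that several rows can be added with a single concat. It also shows that concat returns a new dataframe and leaves the one passed in untouched.

```python
import pandas as pd

def append_rows(df, rows):
    """Return a new dataframe with the dicts in `rows` added as rows."""
    return pd.concat([df, pd.DataFrame.from_records(rows)], ignore_index=True)

original = pd.DataFrame({'a': [1], 'b': [2]})
extended = append_rows(original, [{'a': 3, 'b': 4}, {'a': 5, 'b': 6}])

print(len(original), len(extended))
```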

score 2 · Answer 6 · answered Aug 10 '22 at 09:07

# Deprecated issue has been resolved

# Creating an empty dataframe
df = pd.DataFrame(columns=['a', 'b'])
print("df columns:", df)

# Appending a row (DataFrame.append is deprecated since pandas 1.4 and was removed in pandas 2.0)
df = df.append({ 'a': 1, 'b': 2 }, ignore_index=True)
print("df column Values :", df)

# Create the new row as its own dataframe
df_new_row = pd.DataFrame.from_records({ 'a': [3], 'b': [4] })
df = pd.concat([df, df_new_row])
print("pd concat with two df's :", df)

score 1 · Answer 7 · answered Apr 11 '23 at 15:36

You can use the following command:

# If your index is a string
df.loc["name of the index"] = pd.Series({"Column 1" : Value1, "Column 2" : Value2,
"Column 3" : Value3, "Column 4" : Value4, ...})

# If your index is a number
df.loc[len(df)] = pd.Series({"Column 1" : Value1, "Column 2" : Value2,
"Column 3" : Value3, "Column 4" : Value4, ...})

Just keep in mind that this modifies the original dataframe in place.
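
A runnable sketch of the string-index case with made-up labels and values. As far as I know, the Series is matched to the columns by name, so key order does not matter, and the assignment changes df in place:

```python
import pandas as pd

df = pd.DataFrame({'a': [1], 'b': [2]}, index=['first'])

# Keys are deliberately out of order; they are aligned to the column names.
df.loc['second'] = pd.Series({'b': 4, 'a': 3})
print(df)
```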

jrosell · Answer 8 · 2023-07-28T13:53:32.077

If you want to use chained operations in pandas to append new rows, you could use something like this:

import pandas as pd

new_row = pd.DataFrame({"sex": "male", "age": 40, "survived": False, "name": "Alex"}, index=[0])

(df_dict
  .assign(name=['Alice', 'Bob', 'Charlie'])
  .drop("pclass", axis=1)
  .pipe(lambda df_: pd.concat([df_, new_row], ignore_index = True))
)

For your example, I would use a custom append_row function.

import pandas as pd

def append_row(df1, d):
    df2 = pd.DataFrame(d, index=[0])
    return pd.concat([df1, df2], ignore_index = True)

df = pd.DataFrame(columns=['a', 'b'])
(df
    .pipe(append_row, {'a': 1, 'b': 2 })
)
