
I am having trouble using pandas DataFrame.append(): it doesn't work the way it is described in help(pandas.DataFrame.append), or online in the various sites, blogs, answered questions, etc.

This is exactly what I am doing

import pandas as pd
import numpy as np
dataset = pd.DataFrame.from_dict({"0": [0,0,0,0]}, orient="index", columns=["time", "cost", "mult", "class"])
row= [3, 1, 3, 1]
dataset = dataset.append(row, sort=True )

Trying to get to this result

   time  cost  mult  class
0     0     0     0      0
1     3     1     3      1

what I am getting instead is

    0    class  cost  mult  time
0  NaN    0.0   0.0   0.0   0.0
0  3.0    NaN   NaN   NaN   NaN
1  1.0    NaN   NaN   NaN   NaN
2  3.0    NaN   NaN   NaN   NaN
3  1.0    NaN   NaN   NaN   NaN

I have tried all sorts of things, but some examples (online and in the documentation) can no longer be reproduced, since .append() no longer accepts a columns parameter:

append(self, other, ignore_index: 'bool' = False, verify_integrity: 'bool' = False, sort: 'bool' = False) -> 'DataFrame'

Append rows of other to the end of caller, returning a new object.

other : DataFrame or Series/dict-like object, or list of these
    The data to append.

ignore_index : bool, default False
    If True, the resulting axis will be labeled 0, 1, …, n - 1.

verify_integrity : bool, default False
    If True, raise ValueError on creating index with duplicates.

sort : bool, default False
    Sort columns if the columns of self and other are not aligned.

I have tried every combination of those parameters, but it keeps giving me that mess of new rows with the values in a separate new column, and it also changes the order of the columns that I defined in the initial dataset. (I have also tried various things with .concat, but it gave similar problems even with axis=0.)

Since even the examples in the documentation don't show this result despite having the same code structure, it would be great if anyone could enlighten me on what is happening and why, and how to fix it.

In response to the answer, I had already tried

row= pd.Series([3, 1, 3, 1])
row = row.to_frame()
dataset = dataset.append(row, ignore_index=True )
     0  class  cost  mult  time
0  NaN    0.0   0.0   0.0   0.0
1  3.0    NaN   NaN   NaN   NaN
2  1.0    NaN   NaN   NaN   NaN
3  3.0    NaN   NaN   NaN   NaN
4  1.0    NaN   NaN   NaN   NaN

alternatively

row= pd.Series([3, 1, 3, 1])
dataset = dataset.append(row, ignore_index=True )

   time  cost  mult  class    0    1    2    3
0   0.0   0.0   0.0    0.0  NaN  NaN  NaN  NaN
1   NaN   NaN   NaN    NaN  3.0  1.0  3.0  1.0

Without the ignore_index, this second case raises this error:

TypeError: Can only append a Series if ignore_index=True or if the Series has a name
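For completeness, the error message points at its own fix: giving the Series a name lets append accept it without ignore_index, because the name becomes the new row's index label. A minimal sketch of that alignment, written here with pd.concat so it also runs on pandas 2.x (where append was removed):

```python
import pandas as pd

dataset = pd.DataFrame.from_dict(
    {"0": [0, 0, 0, 0]}, orient="index",
    columns=["time", "cost", "mult", "class"]
)

# The Series index must match the columns, and the name ("1" here)
# becomes the new row's index label.
row = pd.Series([3, 1, 3, 1], index=dataset.columns, name="1")

# row.to_frame().T turns the Series into a one-row DataFrame whose
# columns already line up with dataset's columns.
dataset = pd.concat([dataset, row.to_frame().T])
print(dataset)
```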

1 Answer


One option is to just explicitly turn the list into a pd.Series:

In [46]: dataset.append(pd.Series(row, index=dataset.columns), ignore_index=True)
Out[46]:
   time  cost  mult  class
0     0     0     0      0
1     3     1     3      1

You can also do it natively with a dict:

In [47]: dataset.append(dict(zip(dataset.columns, row)), ignore_index=True)
Out[47]:
   time  cost  mult  class
0     0     0     0      0
1     3     1     3      1

The issue you're having is that other needs to be a DataFrame, a Series (or another dict-like object), or a list of DataFrames or Series objects, not a list of integers.
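A caveat worth adding: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so on current versions the same idea has to be written with pd.concat. A sketch of the equivalent:

```python
import pandas as pd

dataset = pd.DataFrame.from_dict(
    {"0": [0, 0, 0, 0]}, orient="index",
    columns=["time", "cost", "mult", "class"]
)
row = [3, 1, 3, 1]

# Wrap the list in a one-row DataFrame with matching columns, then
# concatenate; ignore_index relabels the rows 0..n-1.
new_row = pd.DataFrame([row], columns=dataset.columns)
dataset = pd.concat([dataset, new_row], ignore_index=True)
print(dataset)
```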

Randy