How to add columns to an empty pandas dataframe?

Question

I have an empty dataframe.

df=pd.DataFrame(columns=['a'])

For some reason I want to generate df2, another empty dataframe, with two columns 'a' and 'b'.

If I do

df.columns=df.columns+'b'

it does not work (I get the columns renamed to 'ab') and neither does the following

df.columns=df.columns.tolist()+['b']

How can I add a separate column 'b' to df so that df.empty stays True?

Using .loc is also not possible

   df.loc[:,'b']=None

as it returns

  Cannot set dataframe with no defined index and a scalar

Actually it does. But why is '' not adding one element to the index then? An empty string is still a string. — 00__00__00, May 16 '18 at 13:33
This is something I have been wondering myself...sorry but I don't know the answer! — famargar, May 16 '18 at 13:34
related: https://stackoverflow.com/questions/30926670/pandas-add-multiple-empty-columns-to-dataframe — EdChum, May 16 '18 at 13:44

Sumit Jha · Accepted Answer · 2018-05-16T13:57:46.477

48

Here are a few ways to add an empty column to an empty dataframe:

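import pandas as pd

df = pd.DataFrame(columns=['a'])
df['b'] = None                                      # plain assignment
df = df.assign(c=None)                              # assign returns a new frame
df = df.assign(d=df['a'])                           # reuse an existing (empty) column
df['e'] = pd.Series(index=df.index, dtype=object)   # explicit dtype silences the empty-Series dtype warning in pandas 1.x
df = pd.concat([df, pd.DataFrame(columns=list('f'))])
print(df)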

Output:

Empty DataFrame
Columns: [a, b, c, d, e, f]
Index: []
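As a quick check that these additions keep the frame empty, here is a minimal sketch using only the first two approaches from the list above. It only confirms that the columns appear and that no rows are created; it is not a statement about dtypes, which may differ between methods and pandas versions.

```python
import pandas as pd

df = pd.DataFrame(columns=['a'])

# Plain assignment of None adds a column without adding rows.
df['b'] = None

# assign returns a new frame, so the name has to be rebound.
df = df.assign(c=None)

print(list(df.columns))
print(df.empty, len(df))
```

Both `df.empty` and `len(df) == 0` look only at the shape, so they are unaffected by how many columns exist.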

I hope it helps.

edited May 16 '18 at 13:57

answered May 16 '18 at 13:49


See also `df2 = df.join(pd.DataFrame(columns=['b']))` as per answer below. – MrR May 26 '21 at 22:46

Ben.T · Answer 2 · 2018-05-16T14:02:23.133

19

If you just do df['b'] = None then df.empty is still True and df is:

Empty DataFrame
Columns: [a, b]
Index: []

EDIT: To create an empty df2 from the columns of df plus some new columns, you can do:

df2 = pd.DataFrame(columns = df.columns.tolist() + ['b', 'c', 'd'])
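If this is needed in several places, the one-liner could be wrapped in a small helper. The sketch below is only an illustration; the name `empty_like_with` is made up here and is not part of pandas.

```python
import pandas as pd

def empty_like_with(df, extra_cols):
    """Hypothetical helper: new empty frame with df's columns followed by extra_cols."""
    return pd.DataFrame(columns=df.columns.tolist() + list(extra_cols))

df = pd.DataFrame(columns=['a'])
df2 = empty_like_with(df, ['b', 'c', 'd'])

print(list(df2.columns), df2.empty)
print(list(df.columns))  # df itself is left untouched
```

Note that this builds df2 from the column labels alone, so df2 is empty even if df happens to contain rows.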

edited May 16 '18 at 14:02

answered May 16 '18 at 13:39


ALollz · Answer 3 · 2018-05-16T13:55:51.873

8

If you want to add multiple columns at the same time you can also reindex.

new_cols = ['c', 'd', 'e', 'f', 'g']
df2 = df.reindex(df.columns.union(new_cols), axis=1)

#Empty DataFrame
#Columns: [a, c, d, e, f, g]
#Index: []
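One thing to be aware of: `Index.union` drops labels that already exist and, by default, sorts the result, whereas plain list concatenation keeps duplicates and the original order. A small sketch of the difference, using made-up column names chosen so that one label overlaps:

```python
import pandas as pd

df = pd.DataFrame(columns=['b'])
new_cols = ['c', 'a', 'b']  # 'b' already exists in df

# union removes the duplicate 'b' and sorts the labels.
df2 = df.reindex(df.columns.union(new_cols), axis=1)
print(list(df2.columns))

# List concatenation keeps the duplicate label and the given order.
df3 = pd.DataFrame(columns=df.columns.tolist() + new_cols)
print(list(df3.columns))
```

If the column order matters, the sorting done by `union` may not be what you want.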

edited May 16 '18 at 13:55

answered May 16 '18 at 13:42


Yeah, I like `union` better. It avoids the possibility of having two similarly named columns in the `df` – ALollz May 16 '18 at 13:52
@piRSquared I think maybe using concat can combine the `reindex` and `union` – BENY May 16 '18 at 14:06
@Wen I'm sure you're right. However, that requires constructing a new dataframe simply to concat. I tend to avoid constructing new pandas objects if it isn't necessary. – piRSquared May 16 '18 at 14:09

jpp · Answer 4 · 2021-05-23T11:58:43.460

6

This is one way:

df2 = df.join(pd.DataFrame(columns=['b']))

The advantage of this method is you can add an arbitrary number of columns without explicit loops.

In addition, this satisfies your requirement of df.empty evaluating to True if no data exists.
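For example, a sketch of the same call with several new columns at once (the column names 'b', 'c', 'd' are just placeholders):

```python
import pandas as pd

df = pd.DataFrame(columns=['a'])

# join returns a new frame; df keeps its single column.
df2 = df.join(pd.DataFrame(columns=['b', 'c', 'd']))

print(list(df2.columns), df2.empty)
print(list(df.columns))
```

As far as I know, `join` complains about overlapping column names unless suffixes are passed, so this is best suited to adding columns that do not exist yet.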

edited May 23 '21 at 11:58

answered May 16 '18 at 13:42


Why do you have to copy? – MrR May 21 '21 at 22:37
@MrR, the question states: `for some reason I want to generate df2, another empty dataframe,`. – jpp May 22 '21 at 08:11
`df2 = df.join(pd.DataFrame(columns=['b']))` is sufficient. No need for `df2 = df.copy()` – MrR May 22 '21 at 18:10
Upvoted. PS: This should be added to the first answer - it's missing from that nice compendium presented there, and it's one of the most elegant ways (if not the most elegant). – MrR May 26 '21 at 22:43

BENY · Answer 5 · edited May 22 '21 at 05:24

4

You can use concat:

df=pd.DataFrame(columns=['a'])
df
Out[568]: 
Empty DataFrame
Columns: [a]
Index: []

df2=pd.DataFrame(columns=['b', 'c', 'd'])
pd.concat([df,df2])
Out[571]: 
Empty DataFrame
Columns: [a, b, c, d]
Index: []

edited May 22 '21 at 05:24

MrR


answered May 16 '18 at 14:05

BENY

