6

I want to create a new named column in a pandas DataFrame, insert a first value into it, and then add further values to the same column.

Something like:

import pandas

df = pandas.DataFrame()
df['New column'].append('a')
df['New column'].append('b')
df['New column'].append('c')

etc.

How do I do that?

barciewicz
  • Possible duplicate of https://stackoverflow.com/questions/12555323/adding-new-column-to-existing-dataframe-in-python-pandas – r3zaxd1 Jul 24 '18 at 13:13

3 Answers

9

If I understand correctly, you want to append a value to an existing column in a pandas DataFrame. The catch is that a DataFrame has to keep a matrix-like shape, so every column must have the same number of rows. What you can do is add a column with a default value and then update that value for each row with:

for index, row in df.iterrows():
    df.at[index, 'new_column'] = new_value
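
As a minimal sketch of the approach above, assuming the frame already has three rows (the `id` column, the empty-string default, and the `new_values` list are made up for illustration):

```python
import pandas as pd

# Hypothetical frame with three existing rows.
df = pd.DataFrame({'id': [1, 2, 3]})

# Add the column with a default value so every row has an entry.
df['new_column'] = ''

# Hypothetical values to store, one per existing row.
new_values = ['a', 'b', 'c']

# Update the cells one at a time; df.at sets a single value by row label.
for index, value in zip(df.index, new_values):
    df.at[index, 'new_column'] = value

print(df)
```

If all the values are known up front, `df['new_column'] = new_values` assigns the whole column in one step and avoids the loop.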
xxx
amo3tasem
7

Don't do it, because it's slow:

  1. updating an empty frame a-single-row-at-a-time. I have seen this method used WAY too much. It is by far the slowest. It is probably common place (and reasonably fast for some python structures), but a DataFrame does a fair number of checks on indexing, so this will always be very slow to update a row at a time. Much better to create new structures and concat.

It is better to collect the data in a list and create the DataFrame with the constructor:

vals = ['a','b','c']

df = pandas.DataFrame({'New column':vals})
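
A short sketch of how this might look when the values arrive one at a time, assuming they can be collected in a plain list first (the extra rows `'d'` and `'e'` and the name `more` are hypothetical):

```python
import pandas as pd

# Collect values in a plain Python list; list.append is cheap.
vals = []
for value in ['a', 'b', 'c']:
    vals.append(value)

# Build the DataFrame once, after all values are known.
df = pd.DataFrame({'New column': vals})

# If more rows arrive later, build a second frame and concatenate,
# rather than growing the existing frame row by row.
more = pd.DataFrame({'New column': ['d', 'e']})
df = pd.concat([df, more], ignore_index=True)

print(df)
```

Passing `ignore_index=True` renumbers the rows 0 through 4 instead of keeping each frame's original index.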
jezrael
0

If you need to add random values to the newly created column, you could also use:

import numpy as np
df['new_column'] = np.random.randint(1, 9, len(df))
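
For a self-contained illustration, here is a small sketch assuming a frame that already has four rows (the `id` column is made up for the example):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with four existing rows.
df = pd.DataFrame({'id': [1, 2, 3, 4]})

# One random integer per row; randint's upper bound is exclusive,
# so the values fall between 1 and 8.
df['new_column'] = np.random.randint(1, 9, len(df))

print(df)
```

Because the size argument is `len(df)`, the new column always matches the number of existing rows. On an empty frame it would produce an empty column.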
myeongkil kim