Appending a list to a column in Pandas while copying the rest of the values

Question

I have a Pandas dataframe read from a CSV file that is structured like this:

x_column    y_column    number_column  
---         ----        ----
---         ----        ----
xxx         yyyy        1
xxx         yyyy        2
xxx         yyyy        35
xxx         yyyy        42

The rows with the dashes represent some extra data at the start of the CSV file that I want to keep.

I have a list of numbers that I want to append to 'number_column'. The list is 500,000 values long. I want to append it to the column while keeping the existing values of number_column in place and unaltered.

I also want the values for x_column and y_column to be the same for every row that has just been added as shown in the example. My current approach is just a simple for loop that appends the values one at a time:

for num in number_list:
    data_df = data_df.append(pd.DataFrame({'x_column': 'xxx', 'y_column': 'yyy', 'number_column': num}, index=[0]), ignore_index=True)

Is there a faster way to do this? The current approach takes a long time to complete.

score 2 · Answer 1 · edited May 23 '17 at 12:24


Don't call data_df = data_df.append(...) in a loop: every call copies the entire DataFrame, so the total amount of copying grows quadratically with the number of rows, which is very bad for performance. Instead, build one DataFrame from the whole list, then concatenate it to your original DataFrame:

tmp = pd.DataFrame({'x_column': 'xxx', 'y_column': 'yyy', 'number_column': number_list})
data_df = pd.concat([data_df, tmp])
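
For readers who want to try this end to end, here is a minimal, self-contained sketch of the same idea. The small data_df and the five-element number_list are hypothetical stand-ins for the CSV data and the 500,000-value list in the question, and the name result is chosen only for illustration. Passing ignore_index=True is an assumption made here to mirror the ignore_index=True in the question's loop.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame read from the CSV file.
data_df = pd.DataFrame({
    'x_column': ['xxx'] * 4,
    'y_column': ['yyyy'] * 4,
    'number_column': [1, 2, 35, 42],
})

# Hypothetical stand-in for the 500,000-value list.
number_list = [100, 101, 102, 103, 104]

# The scalar strings are broadcast to the length of number_list,
# so every new row gets the same x_column and y_column values.
tmp = pd.DataFrame({
    'x_column': 'xxx',
    'y_column': 'yyyy',
    'number_column': number_list,
})

# One concat call copies each row once. ignore_index=True renumbers
# the combined rows 0..n-1 instead of keeping both original indexes.
result = pd.concat([data_df, tmp], ignore_index=True)
print(result)
```

With 500,000 values, the row-by-row loop copies on the order of 500,000 × 499,999 / 2 ≈ 1.25 × 10^11 rows in total, whereas a single concat copies each row once. Note also that DataFrame.append was removed in pandas 2.0, so pd.concat is the approach that still works on current versions.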


answered Feb 04 '17 at 14:37

unutbu

