I have a large DataFrame, about 160k rows by 24 columns. I also have a pandas Series of length 26 that I would like to add row-wise to my DataFrame, producing a final DataFrame of 160k rows by 50 columns, but my code is painfully slow.
Specifically, this works, but it is slow:
final = df.apply(lambda x: x.append(my_series), axis=1)
Which yields the correct final shape:
Out[49]: (163008, 50)
Where:
df.shape
Out[48]: (163008, 24)
my_series.shape
Out[47]: (26,)
This method performs fine for smaller DataFrames (under roughly 50k rows), but it clearly does not scale.
Update: Added Benchmarks For the Solutions Below
I did a few tests using %timeit with a test DataFrame and a test Series of the following sizes:
test_df.shape
Out[18]: (156108, 24)
test_series.shape
Out[20]: (26,)
Both the DataFrame and the Series contain a mix of strings, floats, integers, objects, etc.
Accepted Solution Using NumPy:
%timeit test_df.join(pd.DataFrame(np.tile(test_series.values, len(test_df.index)).reshape(-1, len(test_series)), index=test_df.index, columns=test_series.index))
10 loops, best of 3: 220 ms per loop
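For readability, here is the same tile-and-join idea unpacked into steps (a sketch of the approach benchmarked above):

import numpy as np
import pandas as pd

# Repeat the 26 series values once per row, reshape to (n_rows, 26),
# then join on test_df's index so row alignment is preserved.
tiled = np.tile(test_series.values, len(test_df)).reshape(-1, len(test_series))
extra = pd.DataFrame(tiled, index=test_df.index, columns=test_series.index)
final = test_df.join(extra)  # shape: (156108, 50)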
Using assign:
I keep receiving ValueError: Length of values does not match length of index with my test series, though it works when I use the simpler series provided in the answer. I am not sure what is going on here.
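For reference, this is the assign pattern as I understand it (a sketch, assuming the series index labels are strings usable as keyword arguments). Note that assign broadcasts scalar values down every row, while a list-like value must match the DataFrame's length, which might explain the error:

# Each series entry becomes a new column, broadcast to all rows.
# A list-like entry would instead need length len(test_df); otherwise
# pandas raises "Length of values does not match length of index".
final = test_df.assign(**test_series)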
Using Custom Function by @Divakar:
%timeit rowwise_concat_df_series(test_df, test_series)
1 loop, best of 3: 424 ms per loop
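The exact implementation of rowwise_concat_df_series is not reproduced here; below is a minimal sketch of one NumPy-based row-wise concat along these lines (my assumption, not necessarily @Divakar's actual code):

import numpy as np
import pandas as pd

def rowwise_concat_df_series(df, s):
    # Sketch only: repeat the series once per row, stack it next to the
    # DataFrame's values, and rebuild with the combined column labels.
    repeated = np.repeat(s.values[None, :], len(df), axis=0)
    data = np.column_stack([df.values, repeated])
    return pd.DataFrame(data, index=df.index,
                        columns=list(df.columns) + list(s.index))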