Never call pd.concat inside a for-loop. It leads to quadratic copying: concat returns a new DataFrame, so on every iteration space has to be allocated for the new DataFrame and the data from the old DataFrames has to be copied into it.
So if your final dataframe has N rows, building it one row at a time this way requires O(N^2) total copying to complete the loop.
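To make the difference concrete, here is a toy comparison of the two patterns (the column name `value` is made up for illustration; both produce the same result, but the first copies all accumulated rows on every iteration):

```python
import pandas as pd

# Anti-pattern: concat inside the loop re-copies all previous rows each time.
df_slow = pd.DataFrame({"value": [0]})
for i in range(1, 100):
    df_slow = pd.concat([df_slow, pd.DataFrame({"value": [i]})],
                        ignore_index=True)

# Preferred: accumulate plain Python objects, build the DataFrame once.
rows = [{"value": i} for i in range(100)]
df_fast = pd.DataFrame(rows)
```

With 100 rows the difference is invisible, but the slow version scales quadratically while the fast one scales linearly.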
Instead, accumulate the results in a list (of dictionaries, lists, or one-row DataFrames), and create the final DataFrame from that list once, outside the for-loop. This saves a lot of execution time; pandas is not designed for incremental row-by-row growth.
Here's how you could do it:
import pandas as pd

list_res = []
for index, row in p_data_df.iterrows():
    # Pull the matching log row as a one-row DataFrame
    test_df = log_df.loc[row['Mid-C']].to_frame().transpose()
    if 'S' not in test_df.columns:
        test_df.insert(0, 'S', row.loc['S'])
        test_df.insert(1, 'C #', row.loc['C #'])
        test_df.insert(2, 'Num', row.loc['Num'])
    list_res.append(test_df)

# A single concat outside the loop: one allocation, one copy
df = pd.concat(list_res, axis=0)
More tips to speed up your code
iterrows is the slowest possible way to iterate over a dataframe, since every row has to be converted into a Series. With itertuples that conversion doesn't happen: each row comes back as a lightweight namedtuple. You can switch to itertuples without changing your code much while still gaining performance.
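As a generic sketch of the itertuples pattern (with made-up simple column names; note that columns like 'Mid-C' or 'C #' are not valid Python identifiers, so itertuples would rename them and you would have to access those fields by position instead):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Each row is a namedtuple, so fields are read via fast attribute access
# instead of the per-row Series construction that iterrows performs.
totals = [row.a + row.b for row in df.itertuples(index=False)]
```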
There are other methods (vectorization, the apply function, Cython...) that would require somewhat wider modifications to your code, but would make it more efficient still. I leave you this link for a little more information.
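Of those options, vectorization is usually the biggest win: it operates on whole columns at once instead of looping in Python. A minimal sketch, again with made-up column names:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Vectorized: one column-wise operation, no Python-level loop at all.
df["total"] = df["a"] + df["b"]
```

When a computation can be expressed column-wise like this, it will generally beat both iterrows and itertuples by a wide margin.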