I have a dataframe loaded from a CSV in the following format:
stock_code price
20180827 001 10
20180827 002 11
20180827 003 12
20180827 004 13
20180826 001 14
20180826 002 15
20180826 003 11
20180826 004 10
20180826 005 19
I want to transform it to the following format:
001 002 003 004 005
20180827 10 11 12 13 nan
20180826 14 15 11 10 19
This is my function ( oracle_data
is the original data frame) that does the transformation, but it takes 7 minutes for 547500 row dataframe. Is there a way to speed it up?
def transform_data(oracle_data):
data_code = oracle_data[0]
data_date = oracle_data[1]
factor_date = sorted(data_date.unique())
stock_list = sorted(data_code.unique())
factor_data = pd.DataFrame(index = factor_date, columns = stock_list)
sort = oracle_data.sort_index()
for n in oracle_data.index:
factor_data.at[oracle_data.at[n,1],oracle_data.at[n,0]]=oracle_data.at[n,2]
return factor_data