
I have a Python array that I want to convert to a pandas DataFrame. The code I wrote for it works, but it is very slow. Any suggestions on how I can make it faster?

This is what the contents of the Python array generally look like:

[[1523937720000, '0.01000000', '0.01000000', '0.01000000', '0.01000000', '2278.69000000', 1523937779999, '22.78690000', 2, '346.19000000', '3.46190000', '0'], [1523937780000, '0.01000000', '0.02500000', '0.01000000', '0.01404000', '838.33000000', 1523937839999, '12.48726080', 6, '100.00000000', '2.50000000', '0']]

I created an empty DataFrame with the column names and dtypes I want:

pandaresult = pd.DataFrame(columns=[
    'open_time', 'open', 'close', 'high', 'low', 'volume',
    'close_time', 'quote_volume', 'trades', 'tbbav', 'tbqav',
]).astype({
    'open_time': 'datetime64[ms]', 'open': 'float64', 'close': 'float64',
    'high': 'float64', 'low': 'float64', 'volume': 'float64',
    'close_time': 'datetime64[ms]', 'quote_volume': 'float64',
    'trades': 'float64', 'tbbav': 'float64', 'tbqav': 'float64',
})

This is the code that fills the DataFrame:

for index, line in enumerate(result):
    pandaresult = pandaresult.append({
        'open_time': pd.to_datetime(line[0], unit='ms'),
        'open': float(line[1]),
        'close': float(line[2]),
        'high': float(line[3]),
        'low': float(line[4]),
        'volume': float(line[5]),
        'close_time': pd.to_datetime(line[6], unit='ms'),
        'quote_volume': float(line[7]),
        'trades': float(line[8]),
        'tbbav': float(line[9]),
        'tbqav': float(line[10]),
    }, ignore_index=True)

What can I do to make it faster?

Siem
  • So those arrays are in a file? Can't you just do `pd.DataFrame(result, columns=pandaresult.columns)`, then use `pd.to_datetime` and `.astype(float)` on the respective DataFrame columns? – Bharath M Shetty Apr 24 '18 at 13:05
  • The results are in memory and generated by another function. Sounds like a good idea, I will try it. – Siem Apr 24 '18 at 13:18
  • Have you looked at generating the DataFrame in one call instead of using iteration? See examples here: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html – Liquidgenius Apr 24 '18 at 13:32
  • Thanks, I looked at it, but I don't know how to do it in one call. – Siem Apr 24 '18 at 14:47
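
The one-call approach suggested in the comments could look roughly like the sketch below. It assumes `result` is the in-memory list shown in the question. The sample rows have 12 fields while only 11 column names are listed, so the placeholder name `unused` for the last field is an assumption and not part of the original code.

```python
import pandas as pd

# Sample rows copied from the question; each row has 12 fields.
result = [
    [1523937720000, '0.01000000', '0.01000000', '0.01000000', '0.01000000',
     '2278.69000000', 1523937779999, '22.78690000', 2, '346.19000000',
     '3.46190000', '0'],
    [1523937780000, '0.01000000', '0.02500000', '0.01000000', '0.01404000',
     '838.33000000', 1523937839999, '12.48726080', 6, '100.00000000',
     '2.50000000', '0'],
]

# Column names from the question, in the same order.
columns = ['open_time', 'open', 'close', 'high', 'low', 'volume',
           'close_time', 'quote_volume', 'trades', 'tbbav', 'tbqav']

# Build the whole frame in one call. Assumption: the 12th field is not
# needed, so it gets the hypothetical name 'unused' and is dropped.
df = pd.DataFrame(result, columns=columns + ['unused']).drop(columns='unused')

time_cols = ['open_time', 'close_time']
float_cols = [c for c in columns if c not in time_cols]

# Convert whole columns at once instead of converting value by value.
for col in time_cols:
    df[col] = pd.to_datetime(df[col], unit='ms')
df = df.astype({col: float for col in float_cols})
```

The loop in the question is slow mainly because `DataFrame.append` returns a new DataFrame on every call, so all existing rows are copied once per appended row. Note also that `DataFrame.append` was removed in pandas 2.0, so building the frame in one call avoids that problem as well.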

0 Answers