
I am using a dataset of 60,000 rows. Reading the xlsx file and then converting it into a CSV takes 6.4 seconds. How can I reduce the time? My code:

import pandas as pd
import time


def read_xlsx(path):
    df = pd.read_excel(path)
    return df


def convert_to_csv(df):
    # note: by default to_csv also writes the DataFrame index as an extra column
    df.to_csv('orders_csv_file.csv')


if __name__ == '__main__':
    start = time.perf_counter()   # time.clock() was removed in Python 3.8; perf_counter() is the replacement
    df = read_xlsx("/home/arima/sublime_workspace/orders.xlsx")
    print(time.perf_counter() - start)   # time spent parsing the workbook

    start = time.perf_counter()
    convert_to_csv(df)
    print(time.perf_counter() - start)   # time spent writing the CSV

Reading the Excel file is where the time goes (about 6 seconds); converting it to CSV takes only about 0.3 seconds.
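
If the same workbook has to be loaded repeatedly, one option is to pay the ~6-second .xlsx parse only once and cache the parsed DataFrame in a faster on-disk format using pandas' pickle helpers. This is only a minimal sketch, not something from the question; the cache file name orders_cache.pkl is hypothetical, and the xlsx path is the one from the question.

import os
import pandas as pd

XLSX_PATH = "/home/arima/sublime_workspace/orders.xlsx"  # path from the question
CACHE_PATH = "orders_cache.pkl"                          # hypothetical cache file


def load_orders(xlsx_path=XLSX_PATH, cache_path=CACHE_PATH):
    # Reuse the cached DataFrame if it is newer than the workbook,
    # so the expensive .xlsx parse only happens on the first run
    # (or after the workbook changes).
    if os.path.exists(cache_path) and os.path.getmtime(cache_path) >= os.path.getmtime(xlsx_path):
        return pd.read_pickle(cache_path)

    df = pd.read_excel(xlsx_path)   # the slow step (~6 s in the question)
    df.to_pickle(cache_path)        # later loads read this instead
    return df

This does not make the Excel parse itself faster; it only avoids repeating it.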

  • Tough question. Both reading & writing Excel files are slow, and for a reason: .xlsx files are compressed and require decoding. I'm not sure you're going to find a faster solution. – jpp Apr 09 '18 at 11:18
  • Reading the Excel file itself is what takes more time; I have updated the question. – Sidhartha Apr 09 '18 at 11:22
  • Is Python a requirement? I think it's unlikely you'll find a faster solution. – jpp Apr 09 '18 at 11:27
  • I need to speed up the process if it is possible with Python or pandas (see the sketch after these comments). – Sidhartha Apr 09 '18 at 11:31
  • In that case, this is a duplicate. If you are not happy with the answer provided in the dup, consider offering a bounty. But I don't think you will find better answers. – jpp Apr 09 '18 at 11:34
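
As jpp notes, the decompression and parsing step inside read_excel is the real cost, so there may be little headroom in pandas itself. If only part of the sheet is actually needed, a hedged option is to restrict what pandas parses via the usecols parameter of read_excel; whether this helps depends on how wide the sheet is. The column range "A:C" below is purely illustrative.

import pandas as pd

# Hypothetical: parse only the columns actually needed (here A through C),
# which can reduce work if the workbook has many unused columns.
df = pd.read_excel(
    "/home/arima/sublime_workspace/orders.xlsx",
    usecols="A:C",   # Excel-style column range; adjust to the real columns
)
df.to_csv("orders_csv_file.csv", index=False)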
