
I am new to Python and wrote some Python code to process Excel files. Here's my code:

import os

import pandas as pd

files = os.listdir("XXX")
os.chdir("XXX")

def getDF(xl, sh):
    print(sh)
    test= xl.parse(sh)
    test2=test.iloc[:, (list(range(8))+ list(range(8,len(test.columns),5))) + list(range(9,len(test.columns),5))]
    num=list((range(1440)))
    aCN = [str(x)+'w' for x in num]
    bCN = [str(x)+'r' for x in num]
    test2.columns=["a", "b", "c", "d", "e", "f", "g" , 'h']+aCN + bCN
    return(test2)

def prepareOneFile(path):
    fn = path
    xl = pd.ExcelFile(fn)
    newDF=[getDF(xl, x ) for x in xl.sheet_names]
    df = pd.concat(newDF)
    print(fn)
    return(df)


app_list= [prepareOneFile(x) for x in files]

The code runs very slowly. How can I speed it up? Many thanks.

user2146141
  • We are not for code review. If the code runs correctly, you should have a look at Code Review SE. But check their FAQ before posting! – too honest for this site Jul 06 '18 at 21:29
  • Unrelated: `return` is not a function; you should not make it look like one. Also use a consistent formatting style, e.g. spaces around operators. The Python homepage has a coding style recommendation (PEP 8), easy to find with a search, which should (mostly) be followed. – too honest for this site Jul 06 '18 at 21:44

1 Answer


Your code isn't easy to read, but I don't think you can do much in terms of efficiency. The `fn = path` assignment is unnecessary: change the following line to `xl = pd.ExcelFile(path)` and you save one step. You could also delete the `print` calls, which take up a (very small) amount of time.
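One way to check where the time actually goes before changing anything is to time the individual steps. The following is a minimal sketch using only the standard library; `timings` and `timed` are hypothetical helper names, not part of pandas or of the code in the question.

```python
import time
from collections import defaultdict

# Accumulated wall-clock seconds per label (hypothetical helper state).
timings = defaultdict(float)


def timed(label, func, *args, **kwargs):
    """Call func(*args, **kwargs) and add its elapsed time to timings[label]."""
    start = time.perf_counter()
    try:
        return func(*args, **kwargs)
    finally:
        # Runs even if func raises, so failed calls are still counted.
        timings[label] += time.perf_counter() - start
```

Assuming the functions from the question, you could write `test = timed("parse", xl.parse, sh)` inside `getDF` and `df = timed("concat", pd.concat, newDF)` inside `prepareOneFile`, then print `dict(timings)` at the end. If nearly all of the time lands under `"parse"`, reading the Excel files is the bottleneck and tidying the rest of the code will not help much.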

Other than that, you could look at using a VBA script to convert your Excel files to CSV. `pd.read_csv` is faster than reading from Excel, so I suppose this is your best option. That's all I can give you. Maybe someone wiser will come along and give you a better answer.
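If you would rather stay in Python than write VBA, the same idea can be sketched with pandas alone: parse each sheet from Excel once, save it as CSV, and read the CSV on later runs. This is only a sketch under assumptions: `read_sheet_cached`, the `csv_cache` directory, and the file naming scheme are hypothetical, and the first run is still as slow as before.

```python
from pathlib import Path

import pandas as pd


def read_sheet_cached(xlsx_path, sheet_name, cache_dir="csv_cache"):
    """Read one sheet, using a CSV copy when a fresh one exists.

    The cache directory and naming scheme are illustrative assumptions.
    """
    xlsx_path = Path(xlsx_path)
    cache_dir = Path(cache_dir)
    csv_path = cache_dir / f"{xlsx_path.stem}__{sheet_name}.csv"

    # Cache hit: the CSV exists and is at least as new as the workbook.
    if csv_path.exists() and csv_path.stat().st_mtime >= xlsx_path.stat().st_mtime:
        return pd.read_csv(csv_path)

    # Cache miss: pay for the slow Excel parse once, then store a CSV copy.
    df = pd.read_excel(xlsx_path, sheet_name=sheet_name)
    cache_dir.mkdir(parents=True, exist_ok=True)
    df.to_csv(csv_path, index=False)
    return df
```

Two caveats: a CSV round trip does not preserve everything (dates, for example, come back as strings unless you parse them again), and calling `pd.read_excel` once per sheet reopens the workbook each time, so for files with many sheets you may prefer to keep the `pd.ExcelFile` object from the question and cache the result of `xl.parse(sh)` instead.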

Wald
  • Answering "use a different question/tool" does not really count as an answer here. Additionally, the question is off-topic on SO anyway and should not be answered at all. – too honest for this site Jul 06 '18 at 21:32
  • @Olaf I was going to comment and point the link out, but then it said "Comments are used to ask for clarification or to point out problems in the post", and that didn't really fit my writing so I answered instead. Would a comment have been more appropriate? Also, why shouldn't I answer? – Wald Jul 06 '18 at 21:38
  • 1) A comment like the one I left would have been appropriate. Also you could have flagged the question, you have the privilege to, use it. 2) Off-topic questions are not to be answered. For more information, please re-take the [tour], read [ask] to see which questions are off-topic and for particular problems, meta should answer most of them. If not, feel free to post your own (but pad it well, because it's meta). – too honest for this site Jul 06 '18 at 21:41
  • @Olaf thanks for pointing out the flag option, I hadn't realised what it was for yet. I do have another question about this question being off-topic, though. I hope you don't mind, I am quite new. Reading through the guidance on what to ask, I haven't found anything saying you can't ask about efficiency. And looking around I found quite a few questions in that field, some of them very popular. So is this question off-topic because it was badly asked (in that a Google search would have shown results)? – Wald Jul 06 '18 at 21:58
  • The links I posted were just a starter. They are a brief intro to writing good questions. I also pointed you at meta, which provides a lot more detail if you are willing to dig a bit. This question is apparently "too broad", as it does not state a **specific** problem, but would require a detailed analysis of what OP considers "too slow" (for a start). In general, questions should be of interest to future users; OP getting an answer is in fact more of a bonus for posting the question. This one is not. Hence my advice to check Code Review, SO's sibling site. – too honest for this site Jul 06 '18 at 22:03