Understanding python cProfile output

Question

My Python script parses files sequentially, does some simple data cleaning, and writes the result to a new CSV file. I'm using the csv module. The script takes an awfully long time to run.

The cProfile output is as follows (posted as an image):

I have done a lot of googling before posting the question here.

Here is the function that is called:

def db_insert(coCode, bse):
    start = time()
    q = []
    print os.path.join(FILE_PATH, str(bse)+"_clean.csv")
    f1 = open(os.path.join(FILE_PATH, str(bse)+"_clean.csv"))
    reader = csv.reader(f1)
    reader.next()
    end = time()
    # print end-start
    for idx,row in enumerate(reader):
        ohlc = {}
        date = datetime.strptime( row[0], '%Y-%m-%d')
        date = row[0]
        row  = row[1:6]
        (op, high, low, close, volume) = row
        ohlc[date] = {}
        ohlc[date]['open'] = op
        ohlc[date]['high'] = high
        ohlc[date]['low'] = low
        ohlc[date]['close'] = close
        ohlc[date]['volume'] = volume
        q.append(ohlc)
    end1 = time()
    # print end1-end

    db.quotes.insert({'bse':str(bse), 'quotes':q})
    # print time()-end1
    f1.close()
    q = []
    print os.path.join(FILE_PATH, str(coCode)+".csv")
    f2 = open(os.path.join(FILE_PATH, str(bse)+"_clean.csv"))
    reader = csv.reader(f2)
    reader.next()
    for idx,row in enumerate(reader):
        ohlc = {}
        date = datetime.strptime( row[0], '%Y-%m-%d')
        date = row[0]
        try:
            extra = row[7]+row[8]+row[9]
        except:
            try:
                extra = row[7]
            except:
                extra = ''
        row  = row[1:6]
        (op, high, low, close, volume) = row
        ohlc[date] = {}
        ohlc[date]['open'] = op
        ohlc[date]['high'] = high
        ohlc[date]['low'] = low
        ohlc[date]['close'] = close
        ohlc[date]['volume'] = volume
        ohlc[date]['extra'] = extra
        q.append(ohlc)
    db.quotes_unadjusted.insert({'bse':str(bse), 'quotes':q})
    f2.close()

Why post an image at all if you can just copy the text? As for why your script takes so long to run, that's a difficult question to answer if we can't see the script. - Martin Tournoij, Dec 02 '14 at 22:48

score 8 · Answer 1 · edited May 23 '17 at 11:58

I found this explanation in an answer by John Machin.

ncalls is relevant only to the extent that comparing the numbers against other counts, such as the number of chars/fields/lines in a file, may highlight anomalies; tottime and cumtime are what really matter. cumtime is the time spent in the function/method including the time spent in the functions/methods that it calls; tottime is the time spent in the function/method excluding the time spent in the functions/methods that it calls.
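To make the tottime/cumtime distinction concrete, here is a minimal sketch. It is written for Python 3 (the question's code is Python 2), and the functions inner and outer are made up for illustration; they are not taken from the question. outer does almost no work itself, so it should show a small tottime but a large cumtime, while inner's tottime should account for most of the run.

```python
import cProfile
import io
import pstats


def inner():
    # Hypothetical busy work: this loop is charged to inner's own tottime.
    total = 0
    for i in range(200000):
        total += i * i
    return total


def outer():
    # outer only delegates, so its tottime stays small while its
    # cumtime includes everything spent inside the inner() calls.
    return [inner() for _ in range(5)]


profiler = cProfile.Profile()
profiler.enable()
outer()
profiler.disable()

# Collect the report as text (easy to paste into a question)
# and sort by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Sorting with sort_stats("tottime") instead puts the functions that burn the most time in their own bodies at the top, which is usually where to look first. In a profile of the question's function, that would show whether the time goes to the datetime.strptime calls, the csv reading, or the database inserts.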
