I have read in many documents that itertuples is faster than iterrows when iterating over a DataFrame. But when I implemented it, itertuples turned out to be slower. I wrote some test code for this. Could someone explain why?

import time

# result6 and row are defined elsewhere (not shown).
statusMarked = result6[result6.mapping_id_id == row.id]

# Time one full pass with iterrows.
time1_tup = time.time()
for index, row in statusMarked.iterrows():
    ap_1 = 0
time2_tup = time.time()

# Time one full pass with itertuples.
for row in statusMarked.itertuples():
    ap_2 = 0
time3_tup = time.time()

print("row time")
print(time2_tup - time1_tup)
print("tuple time")
print(time3_tup - time2_tup)
# iterrows took .00099 seconds but itertuples took .002 seconds
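
A single pass over a small frame timed with time.time() can be dominated by one-off setup costs and timer resolution, so the numbers above may not reflect per-row speed. As a rough sketch only, the comparison could be repeated with the standard-library timeit module on a larger frame. The frame `df` and the helpers `loop_iterrows` and `loop_itertuples` are hypothetical stand-ins, not part of the original code, and the sketch assumes Python 3 with pandas installed.

```python
import timeit

import pandas as pd

# Hypothetical stand-in for statusMarked: 1,000 rows, 3 columns.
df = pd.DataFrame({
    "mapping_id_id": range(1000),
    "status": ["ok"] * 1000,
    "value": [0.5] * 1000,
})


def loop_iterrows(frame):
    """Iterate with iterrows; each row is yielded as (index, Series)."""
    count = 0
    for _index, _row in frame.iterrows():
        count += 1
    return count


def loop_itertuples(frame):
    """Iterate with itertuples; each row is yielded as a namedtuple."""
    count = 0
    for _row in frame.itertuples():
        count += 1
    return count


# Run each loop several times and keep the best total to reduce noise.
t_rows = min(timeit.repeat(lambda: loop_iterrows(df), number=5, repeat=3))
t_tuples = min(timeit.repeat(lambda: loop_itertuples(df), number=5, repeat=3))

print("iterrows:   %.5f s" % t_rows)
print("itertuples: %.5f s" % t_tuples)
```

The actual timings depend on the machine, the pandas version, and the shape of the frame, so they are worth checking against the real statusMarked data.
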
eyllanesc
Aparna Rajan
  • Please link to the documents which say itertuples is faster. – John Zwinck Sep 27 '17 at 11:10
  • @JohnZwinck This is stated in the official pandas documentation for DataFrames. I also saw some answers in another Stack Overflow thread which emphasize that itertuples is faster. https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.iterrows.html https://stackoverflow.com/questions/24870953/does-iterrows-have-performance-issues – Aparna Rajan Oct 08 '17 at 15:56

0 Answers