When I profile my code with cProfile, I get the following line:

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     39   12.486    0.320   12.486    0.320 {method 'acquire' of 'thread.lock' objects}

From what I understand, yappi is the better tool for profiling threaded code.

So I call:

yappi.get_func_stats().print_all()

and I get far too many lines to read.

How can I retrieve only the 10 entries that consume the most time?

Equivalent to:

p.sort_stats('time').print_stats(10)
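
For context, here is a minimal sketch of the cProfile/pstats workflow that the line above comes from. The `busy` function is a hypothetical workload added only so the profiler has something to measure; it is not part of my real code.

```python
import cProfile
import io
import pstats


def busy(n):
    # Hypothetical workload, present only to give the profiler something to time.
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
busy(200_000)
profiler.disable()

# Collect the report in a string buffer instead of printing straight to stdout.
buffer = io.StringIO()
p = pstats.Stats(profiler, stream=buffer)

# Sort by internal time and print at most 10 rows.
p.sort_stats('time').print_stats(10)
report = buffer.getvalue()
print(report)
```

This is the behaviour I would like to reproduce with yappi: sorted output, cut off after a fixed number of rows.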

I basically want to know what consumes the most time.

My code runs threads via ThreadPoolExecutor.

Dejell
  • Check [*this*](http://stackoverflow.com/a/4299378/23771), or you can also do it with GDB. – Mike Dunlavey Feb 14 '17 at 14:28
  • Hi Mike, can you please explain more? I am not sure how it helps with my problem. There are thousands of lines to explore, and I know the code very well. – Dejell Feb 14 '17 at 14:41
  • It could be billions of lines. The bigger it is, the better the hunting. It's great that you know it very well, because you will see it doing something on two or more samples that could be done better. That's your (first) speedup. Fix it and get the speedup, the size of which you don't know in advance, but follows an [*inverse beta distribution*](http://scicomp.stackexchange.com/a/2719/1262). The fewer samples it takes to find it, the bigger the speedup. Rinse, repeat. It finds a superset of the speedups profilers find. – Mike Dunlavey Feb 14 '17 at 15:08

1 Answer

Out of the box, you can only change the sorting. If you want to limit the number of results, you'll have to write a modified version of the print_all method.

For sorting stats

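import sys
from yappi import get_func_stats, COLUMNS_FUNCSTATS

# Stats sorted by total time, largest first.
# Note that get_func_stats must be called before .sort() is applied.
stats = get_func_stats().sort(
        sort_type='totaltime', sort_order='desc')
# sort() returns all stats with the sorting applied;
# print_all here is the modified function, not the yappi method.
print_all(stats, sys.stdout, limit=10)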

Modified print

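import os
from yappi import COLUMNS_FUNCSTATS

def print_all(stats, out, limit=None):
    if stats.empty():
        return
    # Column widths, paired in order with the names in COLUMNS_FUNCSTATS
    sizes = [36, 5, 8, 8, 8]
    columns = dict(zip(range(len(COLUMNS_FUNCSTATS)), zip(COLUMNS_FUNCSTATS, sizes)))
    show_stats = stats
    if limit:
        # Keep only the first `limit` entries of the already sorted stats
        show_stats = stats[:limit]
    out.write(os.linesep)
    # No header row is written here; only the stat rows are printed.
    # _print is a private yappi method, so this may break between versions.
    for stat in show_stats:
        stat._print(out, columns)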
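
If you would rather not depend on the private `_print` method, an alternative is to pick the top entries yourself and format them by hand. The sketch below is only an illustration: `top_n` and `format_rows` are hypothetical helpers, and the attribute names (`full_name`, `ncall`, `tsub`, `ttot`) are my assumption about what yappi's function stat objects expose, so check them against your installed version. Stand-in objects are used so it runs without yappi.

```python
import heapq
from types import SimpleNamespace


def top_n(stats, n=10, key="ttot"):
    """Return the n entries with the largest value of attribute `key`."""
    return heapq.nlargest(n, stats, key=lambda s: getattr(s, key))


def format_rows(rows):
    """Format the selected entries as fixed-width text lines, header first."""
    lines = ["%-36s %8s %10s %10s" % ("name", "ncall", "tsub", "ttot")]
    for s in rows:
        lines.append("%-36s %8d %10.6f %10.6f"
                     % (s.full_name[:36], s.ncall, s.tsub, s.ttot))
    return "\n".join(lines)


# Stand-in data (assumed attribute names) so the sketch runs without yappi.
fake_stats = [
    SimpleNamespace(full_name="a.py:1 fast", ncall=100, tsub=0.01, ttot=0.02),
    SimpleNamespace(full_name="a.py:9 slow", ncall=3, tsub=1.50, ttot=2.75),
    SimpleNamespace(full_name="a.py:20 medium", ncall=10, tsub=0.20, ttot=0.40),
]

top = top_n(fake_stats, n=2)
print(format_rows(top))
```

With real data you would presumably pass `yappi.get_func_stats()` in place of `fake_stats`, since the stats object can be iterated. Using `heapq.nlargest` also means the stats do not need to be sorted beforehand.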
jackotonye