We have a multithreaded application that does heavy packet processing across multiple pipeline stages. It is written in C and runs on Linux.
The application works correctly and has no known memory leaks or thread-safety issues. However, we would like to analyse its runtime behaviour: how can we profile the individual threads?
In particular, here is what we are interested in:
- the resource usage (CPU time, memory) of each thread
- how often, and for how long, threads contend to acquire locks
- the amount of overhead due to synchronization
- any bottlenecks in the system
- the best system throughput we can achieve
What are the best techniques and tools available for this?
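
For context, the kind of manual instrumentation we could add ourselves looks roughly like the sketch below. It assumes POSIX threads and the Linux `clock_gettime` clocks; the names `profiled_mutex_t`, `profiled_mutex_lock` and `thread_cpu_ns` are hypothetical helpers made up for illustration, not part of any library or of our actual code. The idea is to try the lock first with `pthread_mutex_trylock`, and only when that fails to time the blocking `pthread_mutex_lock` call, so that contention count and wait time can be recorded per lock.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical wrapper: a mutex plus contention statistics.
 * The counters are only modified while the mutex is held. */
typedef struct {
    pthread_mutex_t mutex;
    uint64_t acquisitions;  /* successful lock calls */
    uint64_t contended;     /* lock calls that had to block */
    uint64_t wait_ns;       /* total time spent blocked, in ns */
} profiled_mutex_t;

/* Read the given clock and return its value in nanoseconds. */
static uint64_t now_ns(clockid_t clk)
{
    struct timespec ts;
    clock_gettime(clk, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

static int profiled_mutex_init(profiled_mutex_t *pm)
{
    pm->acquisitions = 0;
    pm->contended = 0;
    pm->wait_ns = 0;
    return pthread_mutex_init(&pm->mutex, NULL);
}

/* Lock the mutex; if it is already held, measure how long we block. */
static int profiled_mutex_lock(profiled_mutex_t *pm)
{
    int rc = pthread_mutex_trylock(&pm->mutex);
    if (rc == EBUSY) {
        uint64_t start = now_ns(CLOCK_MONOTONIC);
        rc = pthread_mutex_lock(&pm->mutex);
        if (rc == 0) {
            pm->contended++;
            pm->wait_ns += now_ns(CLOCK_MONOTONIC) - start;
        }
    }
    if (rc == 0)
        pm->acquisitions++;
    return rc;
}

static int profiled_mutex_unlock(profiled_mutex_t *pm)
{
    return pthread_mutex_unlock(&pm->mutex);
}

/* CPU time consumed so far by the calling thread, in nanoseconds. */
static uint64_t thread_cpu_ns(void)
{
    return now_ns(CLOCK_THREAD_CPUTIME_ID);
}
```

Each pipeline stage would call `thread_cpu_ns()` at the start and end of a run to get its own CPU time, and the per-lock counters could be dumped at shutdown. The extra `trylock` adds some cost to every acquisition, so this approach perturbs the very thing it measures, which is one reason we are asking about dedicated tools.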