I'm part of a small team developing a tool that measures the performance of HPC applications. The tool is mostly written in Python, although the run-time collector is a C program.