I am working on a project where I have a .so (shared object) containing a bunch of functions that deal with matrix-matrix addition and matrix-matrix multiplication. I "LD_PRELOAD" this .so file while running a Python script that uses TensorFlow. In a nutshell, instead of using the matrix multiplication functions built into TensorFlow, I am using the matrix multiply functions from this .so file (MKL-enabled TensorFlow uses the libmklml_intel.so file for its matrix multiply functions). This is the format in which MKL takes matrix multiplication.
Now, the problem at hand is that I need to monitor three things:
- What are the dimensions of matrices that are multiplied?
- How many times is this function called?
- What percentage of the script's run time is spent in this function?
I can think of two approaches:
- Trace where these function calls are made from within the TensorFlow framework and then modify those call sites to collect the required information. But I feel that this approach involves changes in multiple places, as there is not just one single function that calls cblas. Additionally, I would have to dig through the TensorFlow source code, which is a huge task, and it may also require building TF from source again.
- Monitor the interaction between the .so file and TF from the outside and record all the data passed in on each call, maybe with a clever shell script or with a tool that I am unaware of.
What are my best options? Would perf be of any help here?
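To make the second idea more concrete: since the preloaded .so is my own code, I imagine the logging could live inside it. Below is a rough sketch of what I have in mind, not working project code. The names `my_gemm` and `my_gemm_impl`, the simplified signature, and the naive row-major kernel are all placeholders I made up for illustration; the real cblas-style entry point takes more arguments. It also relies on GCC/Clang extensions (`__attribute__((constructor))`, `__atomic_*` builtins).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for clock_gettime under strict -std modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical counters, updated atomically because the library may be
   called from several threads at once. */
static uint64_t gemm_calls = 0; /* number of calls so far */
static uint64_t gemm_ns = 0;    /* total nanoseconds spent inside the kernel */
static uint64_t load_ns = 0;    /* timestamp taken when the .so was loaded */

static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Placeholder for the real kernel: naive row-major C = A * B,
   where A is m x k, B is k x n and C is m x n. */
static void my_gemm_impl(int m, int n, int k,
                         const float *a, const float *b, float *c) {
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            float sum = 0.0f;
            for (int p = 0; p < k; p++)
                sum += a[i * k + p] * b[p * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Instrumented entry point (hypothetical name and signature):
   logs the dimensions, counts the call and accumulates the elapsed time. */
void my_gemm(int m, int n, int k, const float *a, const float *b, float *c) {
    assert(m > 0 && n > 0 && k > 0);
    uint64_t t0 = now_ns();
    my_gemm_impl(m, n, k, a, b, c);
    uint64_t dt = now_ns() - t0;
    uint64_t call = __atomic_add_fetch(&gemm_calls, 1, __ATOMIC_RELAXED);
    __atomic_fetch_add(&gemm_ns, dt, __ATOMIC_RELAXED);
    fprintf(stderr, "gemm call %llu: (%d x %d) * (%d x %d)\n",
            (unsigned long long)call, m, k, k, n);
}

/* Runs when the shared object is loaded: remember the start time. */
__attribute__((constructor)) static void stats_init(void) {
    load_ns = now_ns();
}

/* Runs at normal process exit: print the call count and time share. */
__attribute__((destructor)) static void stats_report(void) {
    uint64_t total = now_ns() - load_ns;
    double pct = total ? 100.0 * (double)gemm_ns / (double)total : 0.0;
    fprintf(stderr, "gemm: %llu calls, %.3f ms, %.1f%% of time since load\n",
            (unsigned long long)gemm_calls, (double)gemm_ns / 1e6, pct);
}
```

I would build this into the .so with something like `gcc -shared -fPIC` and preload it as before. One caveat I can already see: the percentage is measured against wall-clock time since the library was loaded, so if calls overlap across threads the summed time could exceed 100%.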
Thanks in advance. Newbie here.