I am trying to determine what is necessary to write a line profiler for a language, like those available for Python and Matlab.
A naive reading of "line profiler" is that one can simply insert time logging around every line of source. However, what counts as a line depends on how the parser handles whitespace and line breaks, and that is only the first problem. It seems that one instead needs to use the parse tree and insert timings around individual nodes.
Is this conclusion correct? Does a line profiler require the parse tree, and is that all that is needed (beyond time logging)?
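To make the idea concrete, here is a minimal sketch of what I mean by inserting timings around parse tree nodes. It uses Python and its standard `ast` module purely for illustration. The names `_LineTimer`, `profile_source`, `_lp_clock` and `_lp_record` are my own hypothetical helpers, not part of any existing profiler, and I am not claiming this is how the Python or Matlab tools work.

```python
import ast
import time
from collections import defaultdict


class _LineTimer(ast.NodeTransformer):
    """Hypothetical transformer: wrap every statement node in timing code."""

    def visit(self, node):
        # Recurse into children first, so nested statements are wrapped too.
        node = super().visit(node)
        if isinstance(node, ast.stmt):
            return self._timed(node)
        return node

    def _timed(self, stmt):
        line = stmt.lineno
        var = f"_lp_t0_{line}"  # one start-time variable per source line
        start = ast.parse(f"{var} = _lp_clock()").body[0]
        stop = ast.parse(f"_lp_record({line}, _lp_clock() - {var})").body[0]
        # try/finally so the time is recorded even on return/break/raise.
        wrapped = ast.Try(body=[stmt], handlers=[], orelse=[], finalbody=[stop])
        return [start, wrapped]


def profile_source(source):
    """Run `source` with instrumentation; return {line number: seconds}."""
    totals = defaultdict(float)

    def record(line, elapsed):
        totals[line] += elapsed

    tree = _LineTimer().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    env = {"_lp_clock": time.perf_counter, "_lp_record": record}
    exec(compile(tree, "<profiled>", "exec"), env)
    return dict(totals)


if __name__ == "__main__":
    code = "total = 0\nfor i in range(100000):\n    total += i\n"
    for line, seconds in sorted(profile_source(code).items()):
        print(f"line {line}: {seconds:.6f} s")
```

Even this toy version exposes some of the difficulties: the time for a compound statement such as the `for` loop includes the time of the lines nested inside it, the instrumentation itself adds overhead to every measurement, and wrapping statements changes some semantics (for example, a docstring or a `from __future__` import no longer sits where the language expects it).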
Update 1: Offering a bounty on this because the question is still unresolved.
Update 2: Here is a link for a well-known Python line profiler in case it is helpful for answering this question. I've not yet been able to make heads or tails of its behavior relative to parsing. I'm afraid that the code for the Matlab profiler is not accessible.
Also note that one could say that manually decorating the input code would eliminate a need for a parse tree, but that's not an automatic profiler.
Update 3: Although this question is language agnostic, this arose because I am thinking of creating such a tool for R (unless it exists and I haven't found it).
Update 4: Regarding use of a line profiler versus a call stack profiler - this post relating to using a call stack profiler (Rprof() in this case) exemplifies why it can be painful to work with the call stack rather than directly analyze things via a line profiler.