I'm porting a Python metrics library to C++. One API the Python library provides is a decorator that makes it easy to record timing data for a function. By modifying the function definition to be
@timed('timing.foo')
def foo():
    ...
a call such as
foo_result = foo()
is essentially turned into
start = time.time()
foo_result = foo()
post_metric_to_endpoint('timing.foo', time.time() - start)
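For context, the hand-written C++ equivalent of that expansion would look something like the sketch below (std::chrono::steady_clock for the clock; post_metric_to_endpoint here is just a stub standing in for whatever the real reporting call ends up being):

#include <chrono>
#include <iostream>
#include <string>

// Stub standing in for the real metrics backend; the name and signature are assumptions.
void post_metric_to_endpoint(const std::string& name,
                             std::chrono::steady_clock::duration elapsed) {
    std::cout << name << ' '
              << std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count()
              << "us\n";
}

int foo() { return 42; }  // stand-in for the function being timed

int main() {
    // The caller has to write the timing boilerplate by hand at every call site.
    auto start = std::chrono::steady_clock::now();
    int foo_result = foo();
    post_metric_to_endpoint("timing.foo", std::chrono::steady_clock::now() - start);
    return foo_result;
}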
In Function hooking in C++, we find this answer that wraps instances and places the burden of calling the timing function on the caller, which means we won't get performance data across the codebase (it's annoying to update in the first place, and easy to forget later). Similarly, this answer to Timing in an elegant way in c++ requires altering the call sites. This other answer to that question provides a way of wrapping arbitrary blocks of code, which in theory means we could indent the entire body of a function we want to time and nest it inside a scope; a rough sketch of what I mean is below. This is the closest I've found to what I want, but it's pretty ugly, fairly intrusive, and I'm not sure about the performance impact.
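To make the scope-wrapping idea concrete, here is a minimal sketch of the kind of RAII wrapper I have in mind (the ScopedTimer name is mine, and it reuses the assumed post_metric_to_endpoint stub from the snippet above): a guard object captures the start time in its constructor and reports the elapsed time from its destructor, so timing a function means adding one declaration at the top of its body.

#include <chrono>
#include <string>
#include <utility>

// Declared here; defined as in the stub above (the real backend call is still an assumption).
void post_metric_to_endpoint(const std::string& name,
                             std::chrono::steady_clock::duration elapsed);

class ScopedTimer {
public:
    explicit ScopedTimer(std::string name)
        : name_(std::move(name)), start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() {
        // Report the elapsed time when the enclosing scope exits.
        post_metric_to_endpoint(name_, std::chrono::steady_clock::now() - start_);
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;
private:
    std::string name_;
    std::chrono::steady_clock::time_point start_;
};

int foo() {
    ScopedTimer timer("timing.foo");  // destructor reports when foo() returns
    // ... original body of foo ...
    return 42;
}

When the guard is the first declaration in the function, the function's own scope serves as the timed block, so no extra nesting is strictly needed; still, every function body has to be edited by hand, and the per-call std::string construction is part of why I'm unsure about the overhead in an always-on production build.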
Since this is meant to be used as a library, we can modify the source of the functions we want to time; in fact, that's preferable, since I'd like every call to those functions to be timed. Most discussions seem to focus on ad-hoc profiling during development, whereas I'm trying to build a system that is always active in a production environment.