I have a project with more than 1000 C++ source files stored in different folders under the same directory. I want to add logging code to the beginning and end of every function defined in these .cpp files. See the example below:

    //SomeSrcFile.cpp      
    //Sample  
    ReturnType SomeClass::SomeFunc1(InputParameters)        
    {
         ......
    }

    ReturnType SomeClass::SomeFunc2(InputParameters) {
         ......
    }

Desired output:

    ReturnType SomeClass::SomeFunc1(InputParameters)     
    {
         char FuncName[] = "SomeClass::SomeFunc1()";      //line added 
         printf("%s begins\n", FuncName);      //line added 

         ......

         printf("%s ends\n", FuncName);        //line added 
    } 

    ReturnType SomeClass::SomeFunc2(InputParameters) {
         char FuncName[] = "SomeClass::SomeFunc2()";      //line added 
         printf("%s begins\n", FuncName);      //line added 

         ......

         printf("%s ends\n", FuncName);        //line added
    }      

How can I write a shell script for this kind of work? Can awk be used here?

UPDATE

  1. I think it is possible to do this with awk, as this post does, but I haven't figured out how to adapt it to my case.

  2. I only want to use Bash or Python for this, because they are the only tools I am familiar with at the moment.

UPDATE 2
Perhaps this task is much harder than I expected. But what if I don't care about accuracy? What if I ignore cases such as function definitions that appear inside comments? Is there a simple way to do it then?

Wallace
  • May I ask what you're trying to accomplish? There's probably something more useful than log statements. – Dark Falcon Dec 07 '14 at 02:06
  • @DarkFalcon I want to see the code flow when executing some operation as well as for debugging purpose. Of course, this is just an example, I can insert other statements there but the first thing is to know how to do that :_) – Wallace Dec 07 '14 at 02:17
  • Had you considered using `const char *FuncName` instead? I don't think you'll need to modify the string or address it by character. – Borodin Dec 07 '14 at 05:53
  • Such a program would in general also need to parse syntax (and that is not straightforward to write). For example, the string definition `char greeting[] = "void SomeFunc1";` could lead a simple parser to believe that it was the start of a real function. Comments could also contain strings that would confuse the parser, so it would need to parse comments too. – Håkon Hægland Dec 07 '14 at 09:22
  • In general, I think this question is too broad. – Håkon Hægland Dec 07 '14 at 09:31
  • @HåkonHægland is right, you cannot do this job reliably without a C parser. How would a shell script know when it sees `void SomeFunc2(...) {` whether or not it was inside C-style comment delimiters, for example? – Ed Morton Dec 07 '14 at 15:11
  • Debugging print statements like this are a bad idea. But if you're going to do it, consider something like `PRINT("%s begins\n", __func__)` where you use the preprocessor to get the function name via `__func__` and you define the macro PRINT which you can undefine to remove the printing statements. – William Pursell Dec 07 '14 at 15:32
  • @WilliamPursell thanks for the suggestion regarding the print statements. – Wallace Dec 09 '14 at 11:10
  • @WilliamPursell is right but make the macro names something specific like `#define ENTER(func) printf("%s begins\n", func)` so you can then make it `#define ENTER(func) if (prtOnEnter) printf("%s begins\n", func)` and that way you just have to set a global variable named `prtOnEnter` to turn your `ENTER()` debugging statements on/off across all your functions. – Ed Morton Dec 09 '14 at 17:37
  • Instead of modifying your code, what about [`valgrind --tool=callgrind ...`](http://valgrind.org/docs/manual/cl-manual.html)? Though I haven't used them, there appear to be visualizer tools for the [callgrind output format](http://valgrind.org/docs/manual/cl-format.html) to make its dense format easier to comprehend. [KCachegrind](http://kcachegrind.sourceforge.net/html/Home.html) shows up first in a quick search. – n0741337 Dec 09 '14 at 21:59
  • I agree with @WilliamPursell. I have posted an answer below that takes such an approach to the next level. However, if you can get the valgrind-based approach working, it might fit your needs and timeline better and will require no code changes. – arr_sea Dec 09 '14 at 23:18
  • If you don't need to see the entire flow, but just want to know how you got to certain places, you can add code in strategic locations to print stack traces. If you think that would be useful, see this: http://stackoverflow.com/questions/77005/how-to-generate-a-stacktrace-when-my-gcc-c-app-crashes (Of course, you could also just attach a debugger and put a breakpoint in the place you want a stack trace for.) – arr_sea Dec 09 '14 at 23:22
  • Did you try my code, any updates? – BMW Dec 14 '14 at 05:22
  • In my experience, having a printout inside every function will create far too much verbosity because some functions might get called thousands of times in a small loop. I've found it very useful to be able to disable printouts below any particular point in the call stack. (See `TRACE_OFF__` in my answer below, for an example.) – arr_sea Dec 15 '14 at 23:59
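
The macro suggestions in the comments above can be combined into one small sketch. It assumes the global flag `prtOnEnter` named in Ed Morton's comment; the `LEAVE` counterpart and the `add` function are hypothetical names added for illustration. The `do { ... } while (0)` wrapper is a deliberate change from the comment's bare `if`, because a bare `if` inside a macro can capture an `else` that follows the macro call.

```cpp
#include <cstdio>

// Global switch, as suggested in the comments; flip it at run time
// to turn every ENTER()/LEAVE() line on or off.
bool prtOnEnter = true;

// do { } while (0) keeps each macro safe inside an unbraced if/else.
#define ENTER(func) \
    do { if (prtOnEnter) std::printf("%s begins\n", (func)); } while (0)
#define LEAVE(func) \
    do { if (prtOnEnter) std::printf("%s ends\n", (func)); } while (0)

// Hypothetical function showing where the macros would be inserted.
int add(int a, int b) {
    ENTER(__func__);
    int sum = a + b;
    LEAVE(__func__);  // has to be repeated before every return statement
    return sum;
}
```

Note that the exit macro still has to be placed before each `return`, which is the part a text-based script is least likely to get right.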

3 Answers

1

Based on the post you linked, here is the code:

awk 'BEGIN{X=FS}
    { if ($0~/void/ && $0 ~/\(/) split($0,a,FS);split(a[2],b,"(")
      FS="";OFS="";
         for (i=1; i<=NF; i++)
             if ($i == "{") {
                 if (++d == "1") $i=sprintf("{\n\tchar FuncName[] = \"%s()\";\n\tprintf(\"%%s begins\\n\", FuncName);\n",b[1]);
             } else {if ($i == "}") {
                 if (d-- == "1") $i=sprintf("\n\tprintf(\"%%s ends\\n\", FuncName);\n\t} ",b[1]);
               }
             } 
       FS=X;OFS=X
     }1' infile.cpp

Notes:

The keyword matched is void. If your functions are declared with other return types, change

$0~/void/

to

$0~/(void|int|string)/
BMW
  • This will treat a comment line like `// cats avoid dogs (and wolves)` as if it were a function definition and the approach of identifying keywords for function return types can never work for general code since functions can return user-defined types. Also you don't need to say `$0~/void/`, just `/void/` is the same thing. – Ed Morton Dec 15 '14 at 13:37
  • Thanks for the suggestion. Since you have posted a more complete one, I'll leave mine for others; it may still be useful in some situations. – BMW Dec 15 '14 at 20:47
1

If you don't mind it not being robust, this will do what you want for simple consistent cases using GNU awk for the 3rd arg to match() and abbreviations for character classes (e.g. \w):

$ cat tst.awk
BEGIN {
    beg = "\tchar FuncName[] = \"%s()\";\n\tprintf(\"%%s begins\\n\", FuncName);\n"
    end = "\n\tprintf(\"%%s ends\\n\", FuncName);"
}
match($0,/^\s*\w+\s+(\w+::\w+)[(][^)]*[)]/,arr) { funcName  = arr[1] }
/{/ && (++braceCnt == 1) { $0 = $0 ORS sprintf(beg,funcName) }
/}/ && (--braceCnt == 0) { $0 = sprintf(end,funcName) ORS $0 }
{ print }

$ awk -f tst.awk file
    //SomeSrcFile.cpp
    //Sample
    ReturnType SomeClass::SomeFunc1(InputParameters)
    {
        char FuncName[] = "SomeClass::SomeFunc1()";
        printf("%s begins\n", FuncName);

         ......

        printf("%s ends\n", FuncName);
    }

    ReturnType SomeClass::SomeFunc2(InputParameters) {
        char FuncName[] = "SomeClass::SomeFunc2()";
        printf("%s begins\n", FuncName);

         ......

        printf("%s ends\n", FuncName);
    }

With other awks just use [[:space:]] instead of \s and [[:alnum:]_] instead of \w in the regexp and use a combination of match() with substr() and/or sub()s to extract the function name from the string that matches the regexp, e.g.:

$ cat tst2.awk
BEGIN {
    beg = "\tchar FuncName[] = \"%s()\";\n\tprintf(\"%%s begins\\n\", FuncName);\n"
    end = "\n\tprintf(\"%%s ends\\n\", FuncName);"
}
/^[[:space:]]*[[:alnum:]_]+[[:space:]]+([[:alnum:]_]+::[[:alnum:]_]+)[(][^)]*[)]/ {
    funcName = $0
    gsub(/^[[:space:]]*[[:alnum:]_]+[[:space:]]+|[(][^)]*[)].*/,"",funcName)
}
/{/ && (++braceCnt == 1) { $0 = $0 ORS sprintf(beg,funcName) }
/}/ && (--braceCnt == 0) { $0 = sprintf(end,funcName) ORS $0 }
{ print }
Ed Morton
0

I suggest a slightly different approach within your C++ code.

This will only simplify your shell script needs slightly, but will result in much better C++ code.

In short, I suggest creating a Trace class and a set of macros to deploy it. This has a lot of advantages:

  • Easy to add, remove, or change printouts.
  • Automatically takes care of filename, function name, line number, and much more.
  • Works no matter where the function exits (this is a big improvement over your approach)

Create a Trace class which accepts arguments such as function name, file name, line number, etc. (whatever you want to be able to print out). Whenever a Trace object is constructed, it prints a function entry message. Whenever a Trace object goes out of scope and is destroyed, it prints a function exit message.

So here's the class declaration:

// File: Trace.h

#include <ctime>   // clock_t
#include <list>
#include <string>

class Trace
{
public:
   // print function entry message
   Trace(const char *name,                                                                                                   
         const char *file,                                                                                                   
         const int line, 
         int traceOn = -1, // used for enabling/disabling Trace printing 
                           // within a particular call chain.
         const void* thisPtr = 0);                                                                                           

   // print function exit message
   ~Trace();                                                                                                                 

   static bool active;                                                                                                       

private:
   std::string* theFunctionName;                                                                                             
   std::string* theFileName;
   int          theLineNumber;                                                                                               
   const void*  theThisPtr;                                                                                                  
   static int   layer; // used for printout indenting
   static std::list<clock_t> startTimeList; // used for printing timing information                                                                                 
   bool         previouslyActive;                                                                                            
};

Now let's look at the corresponding macros (these also go in Trace.h):

#ifdef TRACE_ENABLED // define the real macros
#undef TRACE_ENABLED // must be re #defined preceding each '#include "Trace.h" '    
#define TRACE__     Trace tr_( __FUNCTION__ , __FILE__ , __LINE__, -1, this );                                             
#define TRACE_OFF__ Trace tr_off_(__FUNCTION__ , __FILE__ , __LINE__, false);                                              
#define TRACE_ON__  Trace tr_on_(  __FUNCTION__ , __FILE__ , __LINE__, true);                                              

#else // DUMMY MACROS: if TRACE_ENABLED is not defined.                                                                      

#define TRACE__
#define TRACE_ON__
#define TRACE_OFF__                                                                                                        

#endif        

Now, to use the Trace class, you would do the following:

//SomeSrcFile.cpp      
//Sample  

#define TRACE_ENABLED // comment this out to disable tracing in this file.
#include "Trace.h"

ReturnType SomeClass::SomeFunc1(InputParameters)        
{
     TRACE__
     ......
}

ReturnType SomeClass::SomeFunc2(InputParameters) {
     TRACE__
     ......
}
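
One caveat: `TRACE__` passes `this`, so it compiles only inside non-static member functions. For free or static functions, a variant without `this` is needed. The sketch below assumes a hypothetical `TRACE_FREE__` macro (not part of this answer) and uses a reduced stand-in for the Trace class with the same constructor shape, so that it stays self-contained; `Counter` and `twice` are also made-up names.

```cpp
#include <iostream>

// Stand-in with the same constructor shape as the Trace class in this
// answer, reduced to one entry line and one exit line.
class Trace {
public:
    Trace(const char* name, const char* file, int line,
          int /*traceOn*/ = -1, const void* thisPtr = 0)
        : name_(name) {
        std::cout << "{ Entered " << file << ":" << line << " " << name_ << "()";
        if (thisPtr) std::cout << "  this = " << thisPtr;
        std::cout << std::endl;
    }
    ~Trace() { std::cout << "} Leaving " << name_ << "()" << std::endl; }

private:
    const char* name_;
};

// Same shape as the answer's macro: needs `this`, so member functions only.
#define TRACE__      Trace tr_(__FUNCTION__, __FILE__, __LINE__, -1, this);
// Hypothetical variant for free and static functions: no `this`.
#define TRACE_FREE__ Trace tr_(__FUNCTION__, __FILE__, __LINE__);

struct Counter {
    int value = 0;
    int bump() { TRACE__ return ++value; }
};

int twice(int x) {
    TRACE_FREE__
    return 2 * x;
}
```

A script that inserts the macro automatically would therefore need to tell member functions from free functions, or simply use the `this`-free variant everywhere.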

Now, a potential implementation of the Trace class:

// File: Trace.C

#include "Trace.h"

#include <ctime>    // clock(), CLOCKS_PER_SEC
#include <iostream>

bool Trace::active = true;
int  Trace::layer = 0;
std::list<clock_t> Trace::startTimeList;

Trace::Trace(const char *name, 
             const char *file, 
             const int line, 
             int traceOn,  // default: -1
             const void* thisPtr) // default:  0 
: theFunctionName(0),
  theFileName(0),
  theLineNumber(0),
  theThisPtr(thisPtr),
  previouslyActive(active)
{
   if (active)
   {
      theFunctionName = new std::string(name);
      theFileName     = new std::string(file);
      theLineNumber   = line;

      std::cout<<"---> Trace: ";
      for( int L = 0; L < layer; ++L )
      {
         std::cout<<"   ";
      }
      std::cout
         <<"{ Entered "
         << *theFileName << ":" 
         << theLineNumber << " " 
         << *theFunctionName << "()";

      if (thisPtr != 0) // ordered comparison of a pointer with 0 is rejected by newer compilers
      {
         std::cout<<"  this = "<< thisPtr;
      }
      std::cout<< std::endl;

      layer++;
      startTimeList.push_back(clock());
   }

   if(traceOn >= 0)
   {
      active = traceOn;
      if (active && !previouslyActive)
      {
         std::cout<<"+++] Trace: Enabled"<<std::endl;
      }
      else if (previouslyActive && !active)
      {
         std::cout<<"+++[ Trace: Disabled"<<std::endl;
      }
   }
}

Trace::~Trace()
{
   if (previouslyActive && !active)
   {
      std::cout<<"+++] Trace: Enabled"<<std::endl;
   }
   else if (active && !previouslyActive)
   {
      std::cout<<"+++[ Trace: Disabled"<<std::endl;
   }

   active = previouslyActive;

   if (active || theFunctionName)
   {
      layer--;
      double startTime = static_cast<double>(startTimeList.back());
      startTimeList.pop_back();
      double endTime = static_cast<double>(clock());
      double elapsedTime = (endTime-startTime)/CLOCKS_PER_SEC;

      std::cout<<"<--- Trace: ";
      for( int L = 0; L < layer; ++L )
      {
         std::cout<<"   ";
      }

      std::cout<<"} Leaving "
               << *theFileName << ":" 
               << theLineNumber <<" "
               << *theFunctionName <<"()";
      if (theThisPtr != 0)
      {
         std::cout<<"  this = "<< theThisPtr;
      }

      std::cout<<"  elapsedTime = "<<elapsedTime<<"s";
      std::cout<< std::endl;

      delete theFunctionName;
      delete theFileName;
   }
}
arr_sea
  • Thanks @EdMorton. I removed the leading `__`. I assume you're saying this rule applies to the C preprocessor, right? – arr_sea Dec 10 '14 at 19:43
  • Your answer doesn't address the issue of modifying the 1000 c++ source files in subdirectories. – martineau Dec 13 '14 at 16:12
  • @martineau: Yup, as stated in my "answer": "This will only simplify your shell script needs slightly." So no, it doesn't actually answer the question. I think it re-directs the question to seek a better end result. Normally, I suppose that sort of thing should be a comment instead of an answer, but I certainly couldn't fit all that code into a comment. – arr_sea Dec 15 '14 at 17:40
  • Your answer is mostly just a comment to the effect of "You wouldn't have the problem if your 1000 files of C++ code had been written this way". Like they say, hindsight is often 20/20... – martineau Dec 15 '14 at 17:59
  • @martineau: Actually, my answer is a comment saying: "you're about to try to do something with an awk-like script, and I think you should do something else with it instead." I'm saying that the person asking the question should add a single macro instead of a pair of print statements. This has nothing to do with the existing code. What it does do is make the awk task a little easier (only one word added per function body) and with a more useful end result. – arr_sea Dec 15 '14 at 20:05
  • I suspect you're underestimating the complexity of parsing C++ code in order to determine where to insert things. – martineau Dec 15 '14 at 20:18
  • @martineau Actually, I completely agree with you regarding the complexity of parsing C++, and I would be surprised/impressed if an awk script could actually do the proper trace insertions for all valid C++ code. That's one reason I prefer the valgrind-based tool approach, if possible. The only viable C++ parser I'm aware of is gcc-xml, but I don't know if that could help here. My "answer" is just suggesting that if someone is going to go to the great trouble of successfully editing every function in a large code base, they should do it in a way that yields a more flexible result. – arr_sea Dec 15 '14 at 21:22
  • @martineau: Regarding the challenge of determining where to insert things, hopefully you at least agree that using a trace object's destructor to print function exit is superior to having to determine all possible function exit locations and inserting the corresponding print statements. – arr_sea Dec 15 '14 at 21:23
  • To clarify: I'm not intending that this "answer" be accepted. I'm only trying to add something helpful to the discussion. (I put "answer" in quotes because it doesn't really answer the question, but rather re-forms it.) – arr_sea Dec 15 '14 at 21:24
  • @arr_sea though your answer is not what I wanted, it is very useful for improving code design. Thanks for that :_) – Wallace Dec 17 '14 at 11:00