I need to parse a log and I have a working regex, but now I need to set the regex from a config file, and that is where the problem is.

int logParser()
{
  std::string bd_regex; // this reads from config in other part of program
  boost::regex parsReg;
  //("(C:.tmp.bd.*?)+(([a-zA-Z0-9_]+\\.)+[a-zA-Z]{2,4})+(.+[a-zA-Z0-9_])");
  try
  {
    parsReg.assign(bd_regex, boost::regex_constants::icase);  
  }
  catch (boost::regex_error& e)
  {
    cout << bd_regex << " is not a valid regular expression: \""
         << e.what() << "\"" << endl;
  }

  cout << parsReg << endl;
  // here it looks exactly like:
  // "("(C:.tmp.bd.*?)+(([a-zA-Z0-9_]+\\.)+[a-zA-Z]{2,4})+(.+[a-zA-Z0-9_])");"

  int count=0;
  ifstream in;

  in.open(bd_log_path.c_str());

  while (in.getline(buf, BUFSIZE-1)) // loop ends as soon as a read fails
  {
    std::string s = buf;
    boost::smatch m;

    if (boost::regex_search(s, m, parsReg)) // it doesn't obey this "if"
    {
      std::string name, diagnosis;
      name.assign(m[2]);
      diagnosis.assign(m[4]);

      strcpy(bd_scan_results[count].file_name, name.c_str());
      strcpy(bd_scan_results[count].out,  diagnosis.c_str());
      strcat(bd_scan_results[count].out,  " ");

      count++;
    }
  }
  return count;
}

I really don't know why the same regex doesn't work when I set it from the config variable.

Any help will be appreciated (:

Nobody

2 Answers

On your direct question: try storing the regex in the config file without the C++ string-literal escapes:

(C:.tmp.bd.*?)+(([a-zA-Z0-9_]+\.)+[a-zA-Z]{2,4})+(.+[a-zA-Z0-9_])

Besides, it looks like you wanted to match backslashes here:

C:.tmp.bd.

In the config, write:

C:\\tmp\\bd\\

In a C++ string literal that would be

"C:\\\\tmp\\\\bd\\\\"

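As a rough sketch of the two spellings above: the text stored in the config file and the C++ string literal should end up as the same characters in memory. This uses std::regex instead of boost::regex only to keep the snippet self-contained; the names kConfigPattern, kLiteralPattern and pathMatches are made up for illustration.

```cpp
#include <regex>
#include <string>

// Text exactly as it would sit in the config file.
// A raw string literal is used so the compiler does no escape processing.
const std::string kConfigPattern = R"(C:\\tmp\\bd\\)";

// The same pattern written as an ordinary C++ string literal:
// every backslash has to be doubled once more for the compiler.
const std::string kLiteralPattern = "C:\\\\tmp\\\\bd\\\\";

// Hypothetical helper: case-insensitive search, as in the question.
bool pathMatches(const std::string& pattern, const std::string& line) {
    return std::regex_search(line, std::regex(pattern, std::regex::icase));
}
```

With these definitions, kConfigPattern == kLiteralPattern holds, and both match a line containing C:\tmp\bd\ but not C:xtmpxbdx, which the looser C:.tmp.bd. would also accept.
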
sehe
  • Yes, this works too, but I don't think the problem is there, because `parsReg.assign("(C:\\\\tmp\\\\bd\\\\*?)+(([a-zA-Z0-9_]+\.)+[a-zA-Z]{2,4})+(.+[a-zA-Z0-9_])", boost::regex_constants::icase);` works, but `parsReg.assign(bd_regex, boost::regex_constants::icase);` doesn't. – Nobody Nov 17 '11 at 23:49
  • You *did* notice that the first line of code was a different comment than the rest of my answer, did you? – sehe Nov 17 '11 at 23:57
  • It sounds stupid (and sorry for my English), but what exactly does 'escapes' mean? – Nobody Nov 18 '11 at 00:24
  • @Nobody: You are confusing a *string* as an abstract piece of textual data with a *string literal* in your program source code. Think about this for a while. And don't be ashamed, it took [another guy](http://stackoverflow.com/questions/8144886/unicodestring-w-string-literals-vs-hex-values) two days for the penny to drop. – Kerrek SB Nov 18 '11 at 00:30

@sehe gives the correct answer.

If this line of code were processed by the C++ compiler,
str = "(C:.tmp.bd.*?)+(([a-zA-Z0-9_]+\\.)+[a-zA-Z]{2,4})+(.+[a-zA-Z0-9_])";

it would turn the escape sequence \\ into a single backslash \, then
assign the result to the variable 'str'. The contents of 'str' now look like this:
(C:.tmp.bd.*?)+(([a-zA-Z0-9_]+\.)+[a-zA-Z]{2,4})+(.+[a-zA-Z0-9_])

But you are reading this text from a file, so no C++ escape processing takes place.
You are assigning a raw line of text to 'str', a line that has not been pre-processed by the C++ compiler, so every \\ in the file stays as two backslashes.
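
A minimal sketch of that difference, assuming the config file holds the same characters that appear between the quotes in the source code. It uses std::regex rather than boost::regex so it compiles on its own, and a shortened pattern; fromLiteral, fromConfigFile and matches are hypothetical names, and the raw string literal only stands in for a line read from the file.

```cpp
#include <regex>
#include <string>

// What the compiler stores for an ordinary literal: "\\." becomes
// the two characters backslash and dot, i.e. the regex \. (a literal dot).
std::string fromLiteral() { return "[a-z]+\\.txt"; }

// What a config reader would return if the file contained [a-z]+\\.txt.
// The raw literal keeps both backslashes, so the regex is \\. :
// a literal backslash followed by any character.
std::string fromConfigFile() { return R"([a-z]+\\.txt)"; }

// Hypothetical helper: case-insensitive search, as in the question.
bool matches(const std::string& pattern, const std::string& text) {
    return std::regex_search(text, std::regex(pattern, std::regex::icase));
}
```

Here matches(fromLiteral(), "report.txt") succeeds, while matches(fromConfigFile(), "report.txt") fails because that pattern requires a backslash in the input. Dropping one level of escaping in the config file makes the two strings identical.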