I am trying to parse the SQLite sources for error messages, and I think my current approach covers most cases.
My regex:
(?:sqlite3ErrorMsg|sqlite3MPrintf|sqlite3VdbeError)\([^;\"]+\"([^)]+)\"(?:,|\)|:)
Source snippet (not valid C, only for demonstration):
sqlite3ErrorMsg(pParse, variable);
sqlite3ErrorMsg(pParse, "row value misused");
){
sqlite3ErrorMsg(pParse, "no \"such\" function: %.*s", nId, zId);
pNC->nErr++;
}else if( wrong_num_args ){
sqlite3ErrorMsg(pParse,"wrong number of arguments to function %.*s()",
nId, zId);
pNC->nErr++;
}
if( pExpr->iTable<0 ){
sqlite3ErrorMsg(pParse,
"second argument to likelihood must be a "
"constant between 0.0 and 1.0");
pNC->nErr++;
}
}else if( wrong_num_args ){
sqlite3ErrorMsg(pParse,"factory must return a cursor, not \\w+",
nId);
pNC->nErr++;
This successfully outputs the following capture groups:
row value misused
no \"such\" function: %.*s
second argument to likelihood must be a "
"constant between 0.0 and 1.0
factory must return a cursor, not \\w+
However, it misses "wrong number of arguments to function %.*s()" because of the () inside the string.
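The miss can be reproduced in isolation. This is a minimal sketch that uses the regex above unchanged on two lines from the snippet; the names PATTERN, ok and bad are only for illustration. The capture group ([^)]+) cannot cross a closing parenthesis, so for a message containing () it stops before the closing quote and the match fails.

```python
import re

# The regex from above, unchanged.
PATTERN = re.compile(
    r'(?:sqlite3ErrorMsg|sqlite3MPrintf|sqlite3VdbeError)\([^;\"]+\"([^)]+)\"(?:,|\)|:)'
)

# A message without parentheses, and one that contains "()".
ok = 'sqlite3ErrorMsg(pParse, "row value misused");'
bad = ('sqlite3ErrorMsg(pParse,"wrong number of arguments to function %.*s()",\n'
       '  nId, zId);')

ok_match = PATTERN.search(ok)
bad_match = PATTERN.search(bad)

print(ok_match.group(1) if ok_match else None)
print(bad_match.group(1) if bad_match else None)
```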
I have also tried to capture from " to " with a negative look-behind to allow escaped \" (so as not to skip over no \"such\" function: %.*s), but I could not get it to work, because my regex-fu is not that strong and there are also the multiline strings to deal with.
I've also tried to combine the answers from "Regex for quoted string with escaping quotes" with my regex, but that did not work for me either.
The general idea is:
There's a function call with one of the three mentioned function names (sqlite3ErrorMsg|sqlite3MPrintf|sqlite3VdbeError), followed by a non-string parameter that I'm not interested in, followed by at least one parameter that may be either a variable (don't want that) or a string (that's what I'm looking for!), followed by an optional arbitrary number of parameters.
The string that I want may be a multiline string and may also contain escaped quotes, parentheses and whatever else is allowed in a C string.
I'm using Python 3.7.
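To make the general idea above concrete, here is a rough sketch of the kind of pattern I am after, not something I have tested against the real SQLite sources. It assumes the first argument contains no commas, parentheses or quotes, and that adjacent string literals are separated only by whitespace. The names C_STRING, CALL, LITERAL, extract_messages and SAMPLE are made up for this sketch. A string literal is matched as a quote, then any run of "backslash plus any character" or "anything except a quote or backslash", then a closing quote.

```python
import re

# One C string literal: escaped characters (\" \\ \n ...) are consumed as
# pairs, so an escaped quote never terminates the literal.
C_STRING = r'"(?:\\.|[^"\\])*"'

CALL = re.compile(
    r'(?:sqlite3ErrorMsg|sqlite3MPrintf|sqlite3VdbeError)'
    r'\(\s*[^,()"]+,\s*'            # first argument (assumed simple), then a comma
    r'((?:' + C_STRING + r'\s*)+)',  # one or more adjacent string literals
    re.DOTALL,
)

# Same literal pattern, but capturing only the text between the quotes.
LITERAL = re.compile(r'"((?:\\.|[^"\\])*)"', re.DOTALL)


def extract_messages(source):
    """Return the message of each matching call, with adjacent literals joined.

    Escape sequences are left exactly as written in the C source.
    """
    messages = []
    for call in CALL.finditer(source):
        parts = LITERAL.findall(call.group(1))
        messages.append("".join(parts))
    return messages


# The calls from the snippet above (surrounding control flow omitted).
SAMPLE = r'''
sqlite3ErrorMsg(pParse, variable);
sqlite3ErrorMsg(pParse, "row value misused");
sqlite3ErrorMsg(pParse, "no \"such\" function: %.*s", nId, zId);
sqlite3ErrorMsg(pParse,"wrong number of arguments to function %.*s()",
  nId, zId);
sqlite3ErrorMsg(pParse,
  "second argument to likelihood must be a "
  "constant between 0.0 and 1.0");
sqlite3ErrorMsg(pParse,"factory must return a cursor, not \\w+",
  nId);
'''

messages = extract_messages(SAMPLE)
for message in messages:
    print(message)
```

Calls whose second argument is a variable are skipped, because the pattern requires a quote right after the first comma. Joining the pieces means the multiline message comes out as one string rather than with the inner quotes and line break.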