I have a C file containing a series of C-style comments (/* */), each followed by the definition of one variable. The variable's name also appears in its comment. Some comments also mention variable names they do not belong to (see the 3rd, 4th, and 5th comments in the example below).
Here's an example of the format:
/* Object: function1: Does some really cool things and then it ends */
const function1 = someValue;
/* Object: function2: Does more really cool things and then it ends */
const function2 = someValue2;
/* Object: function3: Does even more really cool things
just like function2, does but continues over to the next line for a multiline comment */
const function3 = someValue3;
/* Object: function4: Does all kinds of cool things
and needs function1 in order to set a value correctly */
const function4 = someValue4;
/* Object: function5: Does some other cool things
and needs function2[with another variable] to do some things */
const function5 = someBValue5;
I only want to match the variable names with a result like this: function1 function2 function3 function4 function5
I've been playing around with this on https://regexr.com/ for hours and I cannot get it to work.
This is what I have tried: the post "regex to find a string, excluding comments". That post uses a negative lookbehind. I cannot use a negative lookbehind because this regex is being used in Perl 5.32.1 on a Windows 10 machine.
This is the best I could come up with:
(\bfunction[\w]+\b[^:,])
It excludes matches that are followed by : or , but it doesn't exclude the duplicates enclosed inside the /* */ comments. I haven't been able to figure out how to exclude those other than with a negative lookbehind, which I cannot use.
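To make the problem concrete, here is a minimal sketch that runs the pattern above against the example input. It is written in Python purely for illustration (the target is Perl), and the names SAMPLE, ATTEMPT and matches are placeholders I made up for the demo.

```python
import re

# Sample input copied from the example in this question.
SAMPLE = """\
/* Object: function1: Does some really cool things and then it ends */
const function1 = someValue;
/* Object: function2: Does more really cool things and then it ends */
const function2 = someValue2;
/* Object: function3: Does even more really cool things
just like function2, does but continues over to the next line for a multiline comment */
const function3 = someValue3;
/* Object: function4: Does all kinds of cool things
and needs function1 in order to set a value correctly */
const function4 = someValue4;
/* Object: function5: Does some other cool things
and needs function2[with another variable] to do some things */
const function5 = someBValue5;
"""

# The pattern from my attempt, unchanged.
ATTEMPT = re.compile(r"(\bfunction[\w]+\b[^:,])")

# findall returns the contents of the single capture group for each match.
matches = ATTEMPT.findall(SAMPLE)
print(matches)
```

The result has seven entries instead of five: each capture carries one trailing character (a space or a [), and the mentions of function1 in the 4th comment and function2 in the 5th comment are still matched.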
Ultimately, I think the best solution would be to exclude everything between /* and */ and only search for names that are not contained within the comments. But it would need to support excluding multiline comment content and must not use a negative lookbehind.
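One way this idea might be expressed without any lookbehind is an alternation: the first branch consumes a whole comment and captures nothing, and the second branch captures a name. The sketch below is in Python for illustration only. SKIP_COMMENTS and names_outside_comments are hypothetical names, and it assumes the file contains no string literals with /* inside them.

```python
import re

# Sample input copied from the example in this question.
SAMPLE = """\
/* Object: function1: Does some really cool things and then it ends */
const function1 = someValue;
/* Object: function2: Does more really cool things and then it ends */
const function2 = someValue2;
/* Object: function3: Does even more really cool things
just like function2, does but continues over to the next line for a multiline comment */
const function3 = someValue3;
/* Object: function4: Does all kinds of cool things
and needs function1 in order to set a value correctly */
const function4 = someValue4;
/* Object: function5: Does some other cool things
and needs function2[with another variable] to do some things */
const function5 = someBValue5;
"""

# Branch 1 swallows an entire /* ... */ comment; re.DOTALL lets "." cross
# newlines so multiline comments are covered. Branch 2 captures a name.
SKIP_COMMENTS = re.compile(r"/\*.*?\*/|\b(function\w+)\b", re.DOTALL)


def names_outside_comments(source):
    """Return names matched by branch 2, i.e. those not inside a comment."""
    return [
        m.group(1)
        for m in SKIP_COMMENTS.finditer(source)
        if m.group(1) is not None  # group 1 is unset when a comment matched
    ]


print(names_outside_comments(SAMPLE))
```

Because the comment branch is tried first and consumes the text up to the closing */, names inside comments are never offered to the capturing branch. The same pattern should carry over to Perl with the /s modifier, keeping $1 only when it is defined, though I have not verified that on 5.32.1.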
This is not a complete answer to my problem because it doesn't omit the const and the space in front of the function name. function1, function2, etc. are just generic function names. The real names would be alphanumeric, so I believe function[\w]+ still provides the best capture for the function names.
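For the const prefix, a two-pass variant may be simpler: remove the comments first, then capture only the word that follows const. This is a rough Python sketch, not Perl. COMMENT, DECLARATION and declared_names are made-up names, and it assumes every definition has the exact shape "const NAME = value;" shown in the example, with no type between const and the name.

```python
import re

# Sample input copied from the example in this question.
SAMPLE = """\
/* Object: function1: Does some really cool things and then it ends */
const function1 = someValue;
/* Object: function2: Does more really cool things and then it ends */
const function2 = someValue2;
/* Object: function3: Does even more really cool things
just like function2, does but continues over to the next line for a multiline comment */
const function3 = someValue3;
/* Object: function4: Does all kinds of cool things
and needs function1 in order to set a value correctly */
const function4 = someValue4;
/* Object: function5: Does some other cool things
and needs function2[with another variable] to do some things */
const function5 = someBValue5;
"""

# Pass 1: a non-greedy match from /* to the nearest */, across newlines.
COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)

# Pass 2: "const", whitespace, then the name; only the name is captured,
# so the keyword and the space before it are left out of the result.
# ASSUMPTION: no type keyword sits between "const" and the name.
DECLARATION = re.compile(r"\bconst\s+(\w+)\s*=")


def declared_names(source):
    """Strip comments, then return the name in each const definition."""
    without_comments = COMMENT.sub(" ", source)
    return DECLARATION.findall(without_comments)


print(declared_names(SAMPLE))
```

Since the capture is anchored on const rather than on a function prefix, this version does not depend on how the real names start.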