
I am trying to scan C source code for statements of the following form and extract expressionA and expressionB from each match:

(*((_type*)expressionA)) = expressionB;

with the regex:

[*][(][\s]*[(\s]*.*[*][)\s]*(.*)\s*[)]\s*=(.*)[\s]*;

Here is the link with some test cases.

But this regex does not handle cases like these:

(*((volatile unsigned short*)(type01_01_06_base + (type01_01_06_offset * 1))) = (unsigned short)(unsigned long)0x01010101);

(*((volatile unsigned char*)add)) = (unsigned char)data;

Is there a regex that covers all of these cases?

Thuy Nguyen
  • 5
    Regular expressions are not capable of parsing arbitrary programming language constructs. Depending on what else you want to do, looking into a real parser might be the better long-term solution. – GhostCat May 15 '18 at 05:36
  • You may want to read (SO question) [What is a regular language?](https://stackoverflow.com/questions/6718202/what-is-a-regular-language) – Erwin Bolwidt May 15 '18 at 06:02
  • 1
    In other words, C is not a regular language, so you can't parse it using regular expressions. In addition, it has a preprocessor, which is widely used and can change the meaning of the code completely. To parse C code you also need to run the preprocessor first. The C preprocessor is a programming language in its own right, and you won't be able to interpret it with regular expressions. – Erwin Bolwidt May 15 '18 at 06:16
  • 1
    Well, probably [lazy quantifiers will help](https://regex101.com/r/eJ3bK3/6) to some extent, but this approach is all flawed. You will be safer with a dedicated parser. – Wiktor Stribiżew May 15 '18 at 06:50
  • The question is, does your program need to work for all valid C code fitting that template? Or is it enough if it passes some predefined test set, and who cares about other inputs? Also, are those guaranteed to be single lines, or can they be split? – hyde May 15 '18 at 07:24
  • Also note that in C, expressions may contain assignments `=`. Is this allowed in your expressions? – hyde May 15 '18 at 07:28
  • Also note that C has comments. Are comments allowed in your expression? Just put `/* ; */` after any of your expressions to confuse your regex badly. – Erwin Bolwidt May 15 '18 at 07:45
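
The comments point toward matching parentheses by counting rather than with a regex. Below is a minimal sketch of that idea in Python, not a real C parser. It assumes each statement is already preprocessed, sits on one line, contains no comments or string literals, and uses plain `=` (not `+=` and friends). The names `matching_paren`, `strip_outer_parens`, `top_level_assignment` and `parse_store` are hypothetical helpers made up for this illustration.

```python
def matching_paren(s, start):
    """Return the index of the ')' matching the '(' at s[start], or -1."""
    depth = 0
    for i in range(start, len(s)):
        if s[i] == "(":
            depth += 1
        elif s[i] == ")":
            depth -= 1
            if depth == 0:
                return i
    return -1


def strip_outer_parens(s):
    """Remove parenthesis pairs that enclose the whole string."""
    s = s.strip()
    while s.startswith("(") and matching_paren(s, 0) == len(s) - 1:
        s = s[1:-1].strip()
    return s


def top_level_assignment(s):
    """Index of the first '=' outside all parentheses that is not
    part of ==, !=, <= or >=; -1 if there is none."""
    depth = 0
    for i, ch in enumerate(s):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "=" and depth == 0:
            prev = s[i - 1] if i > 0 else ""
            nxt = s[i + 1] if i + 1 < len(s) else ""
            if nxt != "=" and prev not in ("=", "!", "<", ">"):
                return i
    return -1


def parse_store(stmt):
    """Return (expressionA, expressionB) for a statement shaped like
    (*((type*)expressionA)) = expressionB;  or None if it does not fit."""
    s = stmt.strip()
    if not s.endswith(";"):
        return None
    # Drop the ';' and any parentheses wrapping the whole assignment.
    s = strip_outer_parens(s[:-1])
    eq = top_level_assignment(s)
    if eq < 0:
        return None
    lhs = strip_outer_parens(s[:eq])
    rhs = s[eq + 1:].strip()
    if not lhs.startswith("*"):
        return None
    # What follows the dereference should be "(type*)expressionA".
    cast = strip_outer_parens(lhs[1:])
    if not cast.startswith("("):
        return None
    close = matching_paren(cast, 0)
    if close < 0 or not cast[:close].rstrip().endswith("*"):
        return None
    return strip_outer_parens(cast[close + 1:]), rhs


if __name__ == "__main__":
    print(parse_store("(*((volatile unsigned char*)add)) = (unsigned char)data;"))
```

Because the helpers count nesting depth, the same function accepts both the `(*(...)) = ...;` and the `(*(...) = ...);` layouts from the question. Comments, string literals, macros and multi-line statements would still need a real C parser, as the commenters say.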
