
I would like to verify that a given string is valid C code in the context of

#include <stdlib.h>

int main() {
    double x[3];
    <insert code here>;
    return EXIT_SUCCESS;
}

Effectively, I would like the verification to behave like this:

verifyC('x[0]*x[0] + x[1] + 1') // pass
verifyC('x[0]*x[0] + x[1] +') // fail, syntax error
verifyC('x[0]*x[0] + a') // fail, `a` undefined

What would be a good way to do this verification?

Nico Schlömer
  • Your question seems unclear to me. Should the function check the given string semantically, or only syntactically? According to your code, the given string will be "pasted" into C code. However, in that case both of them are not correct, because there is no semicolon at the end of each line. So please be more specific: try to explain the goal, not the way of achieving it. – awesoon May 16 '15 at 10:00
  • write a parser. period. if you face issues in that, we can help. currently, as it is written, it's too broad to answer. – Sourav Ghosh May 16 '15 at 10:02
  • Use a parser like [pycparser](https://github.com/eliben/pycparser) – myaut May 16 '15 at 10:08
  • do you know what `^` means in C? :) – Karoly Horvath May 16 '15 at 10:08
  • do you expect a statement or a block of code? what is considered "valid"? – Karoly Horvath May 16 '15 at 10:10
  • You will have to write a C language parser. Or a C language expression parser, at least... But why do you declare a `double x[3]` array? Do you really want just to *verify* a string is a valid expression, or also to *execute* the calculations defined by the expression...? – CiaPan May 16 '15 at 10:11
  • Should `x[3] + 5` be reported valid? Should `double a=1; x[0] += a` be? Should `x[0] = system("rm *")` be? They are all "valid", compilable C! If only simple expressions ought to be checked, look into an expression parser instead of handing the problem to a full language parser. – Jongware May 16 '15 at 10:56

2 Answers


The simplest way would be to just try to compile a small sample program containing the string you want to check.

This way you get your snippet checked by the real C compiler. This will be far easier and much more reliable than trying to implement all the C parsing and checking in the Python program.
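A minimal sketch of this idea in Python, assuming `gcc` is installed and on the `PATH`. The names `TEMPLATE`, `build_source` and `verify_c` and the `@SNIPPET@` placeholder are made up for this illustration. The `-fsyntax-only` option asks gcc to check the code without generating output, and `-x c -` makes it read C source from standard input.

```python
import shutil
import subprocess

# Hypothetical template; "@SNIPPET@" is a placeholder chosen for this sketch.
TEMPLATE = """#include <stdlib.h>

int main(void) {
    double x[3];
    @SNIPPET@;
    return EXIT_SUCCESS;
}
"""


def build_source(snippet):
    """Paste the snippet into the fixed C context from the question."""
    return TEMPLATE.replace("@SNIPPET@", snippet)


def verify_c(snippet, compiler="gcc"):
    """Return True if the compiler accepts the snippet in that context.

    Raises FileNotFoundError if the compiler executable cannot be found.
    """
    result = subprocess.run(
        [compiler, "-fsyntax-only", "-x", "c", "-"],
        input=build_source(snippet),  # source is fed through stdin
        capture_output=True,          # keep diagnostics off the console
        text=True,
    )
    # Warnings (such as an unused expression value) still exit with 0.
    return result.returncode == 0


if __name__ == "__main__":
    if shutil.which("gcc") is None:
        print("gcc not found on PATH")
    else:
        for s in ("x[0]*x[0] + x[1] + 1",
                  "x[0]*x[0] + x[1] +",
                  "x[0]*x[0] + a"):
            print(repr(s), verify_c(s))
```

Note that this accepts anything that compiles in that position, including assignments and function calls, so it is only appropriate when the input strings are trusted or when "valid" really means "compilable".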

sth
  • Is there a way to get this done programmatically from the python script? – Nico Schlömer May 16 '15 at 10:36
  • @NicoSchlömer: Sure, use `import subprocess` and call `gcc`. – pts May 16 '15 at 10:44
  • @NicoSchlömer: The [`subprocess`](https://docs.python.org/2/library/subprocess.html) module lets you start external programs like the C compiler, and check their exit status. – sth May 16 '15 at 10:46

Replace all occurrences of your known variables with a numeric constant. In your code, that would be `x[0]`, `x[1]`, and `x[2]`. Note that C allows plenty of intermediate whitespace, even inside a subscript: `x [ 1 ]` is valid. (Also: `x[01]` is valid. `x[0x01]` is valid. If the array has more than 8 elements, `x[010]` is valid and is actually `x[8]`. `1[x]` is valid and is equal to `x[1]`.)

The numerical constant must in itself be valid, and preferably not equal to 0. (Just to prevent a parser stating `1/x[0]` is invalid!)

When replacing, insert a single space before and after your constant. This prevents an invalid input such as `x[1]2` from turning into the valid-looking `12`. Do not use parentheses! With those, `sin x[1]` is invalid but its replacement, `sin(1)`, would be valid.

With this, an input string

x[0]*x[0] + x[1] + 1

is translated into

1 * 1 + 1 + 1

which can be validated with standard techniques; see for example "Safe expression parser in Python". Alternatively, since you only need to validate and not to calculate, write your own.
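The substitution and validation steps could be sketched in Python as follows. This is an approximation under stated assumptions: the names `KNOWN_VARIABLE`, `substitute` and `verify_expression` are hypothetical, only plain decimal indices 0 to 2 are recognised (not `x[01]`, `x[0x01]` or `1[x]`), and Python's own expression grammar, restricted to `+ - * /`, parentheses and plain numbers, stands in for a C arithmetic-expression parser. The two grammars differ in corner cases (numeric literal syntax, for one), so a hand-written parser would be more precise.

```python
import ast
import re

# Matches x[0], x[1], x[2], allowing whitespace as in "x [ 1 ]".
KNOWN_VARIABLE = re.compile(r"\bx\s*\[\s*[0-2]\s*\]")

# Only these syntax-tree node types are accepted (an assumption of this
# sketch: arithmetic on numbers, no names, no function calls).
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.UAdd, ast.USub,
)


def substitute(expr):
    """Replace every known variable by the constant 1, padded with spaces."""
    return KNOWN_VARIABLE.sub(" 1 ", expr)


def verify_expression(expr):
    """Return True if the substituted string is a plain arithmetic expression."""
    try:
        # strip() is needed because leading whitespace is an indentation error
        tree = ast.parse(substitute(expr).strip(), mode="eval")
    except (SyntaxError, ValueError):
        return False
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            return False  # e.g. a leftover name such as `a`, or a call
        if isinstance(node, ast.Constant) and type(node.value) not in (int, float):
            return False  # strings, complex numbers, booleans, ...
    return True


if __name__ == "__main__":
    for s in ("x[0]*x[0] + x[1] + 1",
              "x[0]*x[0] + x[1] +",
              "x[0]*x[0] + a",
              "x[1]2"):
        print(repr(s), verify_expression(s))
```

Because names are not on the allow-list, any identifier left over after the substitution (such as `a`) is rejected, and so are function calls like `sin(x[1])`; extending the allow-list would be needed to permit them.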

Jongware