
Let's say I put a very long equation onto a single line of C code (in either a .c or a .h file) that is thousands (perhaps tens of thousands) of characters long; for example

y = (2*(36*pow(x,2)*pow(A[n][j],5)*B[n][j]
  + (several thousand more such expressions) ) ;

(here, take x to be a variable and A, B to be double pointers, etc.). Is there a limit on how long a line of code can be in a .c or .h file before, say, the gcc compiler is unable to compile it correctly? I've read several related discussions about this issue for C#, but not for plain C. I have never received any errors from gcc about lines being too long, but I'd like to be extra sure about this point.

EDIT: In response to some of the below comments, I now realize that I was asking two (I think closely related) questions:

(1) Is there any limit on how long a line can be in C before gcc could miscompile it or raise an error?

(2) Is there any limit on how complex an expression can be before gcc could miscompile it or raise an error? (For example, we could break a very long line up into several lines, but it would all still be part of the same expression.)

physics_researcher
  • You could try it. But I'm not sure if having such long expressions is something you want to do. – Jabberwocky Oct 09 '17 at 17:25
  • Try it and C. I mean see. Write a long equation. Then copy it and add it to itself. Then copy that and add it to itself. Then copy all that and add it to itself... Of course, it doesn't matter if gcc can handle it, the result will be utterly unmaintainable and for that reason alone should be broken up into its component parts. – Schwern Oct 09 '17 at 17:26
  • You could use 4095 characters in one line and 127 arguments in one function call. For more information check this answer. https://stackoverflow.com/questions/11614687 – fotisgpap Oct 09 '17 at 17:27
  • Don't forget that a complicated expression may need a lot of stack space at run time to store intermediate values. – Weather Vane Oct 09 '17 at 17:31
  • You referred to "a single line of c code", but your example is split across two lines. Are you asking about limits on the length of a line of code, or on the complexity of an expression? – Keith Thompson Oct 09 '17 at 17:32
  • Given most of the entries in [this contest](http://www.ioccc.org/years.html), it appears the C compiler is able to handle quite a lot of abuse. ;-) – Charles Srstka Oct 09 '17 at 17:54
  • The discussion in [this old question](https://stackoverflow.com/questions/6296837/gcc-compile-error-with-2-gb-of-code) suggests that the first limit you're likely to hit is a limit on how much machine code can be packed into a single object file! – zwol Oct 09 '17 at 18:06
  • The first limit you run into is likely that humans fail to read and understand the code. Then it is not very useful, even if the compiler might understand it. – Bo Persson Oct 09 '17 at 23:34

2 Answers


The actual upper limit on "how long a line of code can be in a .c or .h file" is highly implementation dependent, but the standard specifies a minimum that a conforming implementation must be able to handle. According to C11, §5.2.4.1:

The implementation shall be able to translate and execute at least one program that contains at least one instance of every one of the following limits:

  • 4095 characters in a logical source line

That said, as Keith mentioned in the other answer, the length of a logical line and the complexity of a statement or expression (the number of operations and operands involved, the types of operations, nested expressions, etc.) are not the same thing. The same section lists separate minimum limits for those too, such as

  • 63 nesting levels of parenthesized expressions within a full expression

  • 511 identifiers with block scope declared in one block

etc.

While a complex expression is being computed, multiple intermediate results must be stored temporarily, and in theory this could use up all the available stack space on your system. In practice, that is far-fetched unless the expression is so complex that it cannot be accommodated on today's multi-gigabyte systems.
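One way to keep the number of simultaneously live intermediate values small is to accumulate the terms one at a time instead of writing them as a single giant expression. The following is only a sketch: the term shape is copied from the first term in the question, while the summation over j, the loop bounds, and the helper names `ipow` and `sum_terms` are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: integer power by repeated multiplication. */
static double ipow(double base, unsigned exponent)
{
    double result = 1.0;
    while (exponent-- > 0)
        result *= base;
    return result;
}

/* Hypothetical helper: computes 2 * sum over j of
 * 36 * x^2 * A[n][j]^5 * B[n][j], for j = 0 .. count-1.
 * Summing over j is an assumption; the question does not say
 * what the remaining terms look like. */
double sum_terms(double x, double **A, double **B, size_t n, size_t count)
{
    double y = 0.0;
    for (size_t j = 0; j < count; ++j) {
        /* Each iteration evaluates one small expression and adds it in. */
        y += 36.0 * ipow(x, 2) * ipow(A[n][j], 5) * B[n][j];
    }
    return 2.0 * y;
}
```

Written this way, each statement the compiler sees stays small no matter how many terms there are.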


With all that said, you should probably write such code only once, that is, never. To quote Martin Fowler:

Any fool can write code that a computer can understand. Good programmers write code that humans can understand.

Sourav Ghosh
  • The limit on the number of characters in a logical source line is not relevant to the question, since expressions can be split across as many lines as you like. EDIT: On more careful reading, I see the question does mention line length, but the example splits an expression across two lines. I've asked the OP for clarification. – Keith Thompson Oct 09 '17 at 17:31
  • @KeithThompson Absolutely!! I added a note, but you already covered in great detail. – Sourav Ghosh Oct 09 '17 at 17:52

You've asked about two separate things: the maximum length of a line, and the complexity of an expression. An arbitrarily complex expression can easily be split across multiple lines -- as you did in your example.
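To make the distinction concrete, here is a small sketch of the same expression written three ways. The expression itself and the function names are made up for illustration. A physical line ended with a backslash is spliced onto the next one early in translation, so the third version forms one logical source line, while the second version is several logical lines.

```c
/* 1. One physical line, which is also one logical source line. */
double on_one_line(double x, double a, double b)
{
    return 2.0 * (36.0 * x * x * a + 12.0 * x * b + 7.0);
}

/* 2. Several physical lines without backslashes: each one is its own
 *    logical source line, so a per-line limit applies to each separately. */
double on_several_lines(double x, double a, double b)
{
    return 2.0 * (36.0 * x * x * a
                  + 12.0 * x * b
                  + 7.0);
}

/* 3. Backslash-newline splicing: the three physical lines are joined
 *    into a single logical source line before tokenization. */
double spliced(double x, double a, double b)
{
    return 2.0 * (36.0 * x * x * a \
                  + 12.0 * x * b \
                  + 7.0);
}
```

All three functions compute the same value. Only the layout differs, and only the line-length limit (not expression complexity) is affected by it.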

The C standard requires implementations to support at least 4095 characters in a logical source line. The way it expresses that requirement is rather indirect: a compiler must be able to process at least one program that hits all the specified limits. The rationale is that this states the requirement in a precise and testable way, while the easiest way to meet it is to avoid imposing any fixed limits at all.

The details are in N1570 5.2.4.1, "Translation limits". The relevant limits in that section are 63 nesting levels of parentheses and 127 arguments in a function call -- but you can create an arbitrarily complex expression without hitting either of those limits.

The standard imposes no specific limits on the complexity of an expression. Most compilers, including gcc, will allocate resources (particularly memory) dynamically as they're processing source code. The internal representation of an expression is likely to be a dynamically allocated tree structure, not a fixed-size array.
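As a rough illustration of what "a dynamically allocated tree structure" means, here is a minimal expression tree in C. This is not gcc's actual internal representation; the type and function names (`Expr`, `expr_num`, `expr_binary`, `expr_eval`, `expr_free`) are invented for this sketch. The point is only that each node is allocated on demand, so the size of the tree is bounded by available memory rather than by a fixed array size.

```c
#include <stdlib.h>

typedef enum { EXPR_NUM, EXPR_ADD, EXPR_MUL } ExprKind;

typedef struct Expr {
    ExprKind kind;
    double value;            /* used when kind == EXPR_NUM */
    struct Expr *lhs, *rhs;  /* used for EXPR_ADD and EXPR_MUL */
} Expr;

/* Frees a whole tree; accepts NULL. */
void expr_free(Expr *e)
{
    if (e == NULL)
        return;
    expr_free(e->lhs);
    expr_free(e->rhs);
    free(e);
}

/* Allocates a leaf node; returns NULL if memory runs out. */
Expr *expr_num(double v)
{
    Expr *e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->kind = EXPR_NUM;
    e->value = v;
    e->lhs = e->rhs = NULL;
    return e;
}

/* Allocates an operator node that takes ownership of both children.
 * On any failure the children are freed and NULL is returned. */
Expr *expr_binary(ExprKind kind, Expr *lhs, Expr *rhs)
{
    Expr *e = (lhs && rhs) ? malloc(sizeof *e) : NULL;
    if (e == NULL) {
        expr_free(lhs);
        expr_free(rhs);
        return NULL;
    }
    e->kind = kind;
    e->value = 0.0;
    e->lhs = lhs;
    e->rhs = rhs;
    return e;
}

/* Recursively evaluates a non-NULL tree. */
double expr_eval(const Expr *e)
{
    switch (e->kind) {
    case EXPR_ADD: return expr_eval(e->lhs) + expr_eval(e->rhs);
    case EXPR_MUL: return expr_eval(e->lhs) * expr_eval(e->rhs);
    default:       return e->value;
    }
}
```

For example, `expr_binary(EXPR_MUL, expr_binary(EXPR_ADD, expr_num(2), expr_num(3)), expr_num(4))` builds the tree for (2 + 3) * 4. In a design like this, an expression becomes "too complex" when `malloc` fails or when a recursive traversal runs out of stack for a very deep tree.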

You can probably construct an expression that's too complex for gcc to handle, and it will probably respond either by printing a fatal error message when it's unable to allocate memory, or just by choking with a segmentation fault or something similar. On a modern computer with gigabytes of memory, you'd need a very large expression to trigger such a failure.

You're not going to run into this issue unless you're generating C code automatically, and your generator gets out of hand.
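If you want to probe your own compiler, a tiny generator is enough. The sketch below is an assumption-laden example, not a benchmark: the function name `write_long_expression`, the generated function `f`, and the repeated term are all made up. It writes a C function whose return expression has a chosen number of terms on one physical line, and reports how long that line is.

```c
#include <stdio.h>

/* Writes a C function f() to `out` whose return statement is a sum of
 * `terms` identical product terms, all on one physical line.
 * Returns the number of characters on that long line (excluding the
 * newline), or -1 if writing fails. */
long write_long_expression(FILE *out, long terms)
{
    long line_len = 0;
    int n;

    if (fputs("double f(double x, double a, double b)\n{\n", out) == EOF)
        return -1;

    n = fprintf(out, "    return 0.0");
    if (n < 0)
        return -1;
    line_len += n;

    for (long i = 0; i < terms; ++i) {
        n = fprintf(out, " + 36.0*x*x*a*b");
        if (n < 0)
            return -1;
        line_len += n;
    }

    n = fprintf(out, ";");
    if (n < 0)
        return -1;
    line_len += n;

    if (fputs("\n}\n", out) == EOF)
        return -1;
    return line_len;
}
```

Calling this from a `main` that writes to a file (say `long.c`, a hypothetical name) and then running `gcc -c long.c` with increasing values of `terms` is one way to see how your particular gcc version and machine behave.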

Keith Thompson