19

Will the strlen() function below get called just once (with the value stored for further comparisons); or is it going to be called every time the comparison is performed?

for (i = 0; i < strlen(word); i++)
{ /* do stuff */ }
Noldorin
between

6 Answers

33

That's implementation-dependent. Usually, it gets called every time, but, if the compiler can see that word never changes, and that strlen is a pure function (no side effects), it can lift the call.

See: http://underhanded.xcott.com/?page_id=15 for a well-known example of this being exploited. :-)

C. K. Young
  • This is the right answer... indeed, it depends how clever the compiler is being. – Noldorin Jan 12 '10 at 14:15
  • In the example provided this was done on a `char *` which means that neither the pointer nor the data pointed to was constant. Is gcc really doing this? It seems incredibly dangerous. – PP. Jan 12 '10 at 14:40
  • @PP: Suppose your `word` is not passed anywhere else inside the loop (or only passed to a function taking `char const *`), and your code is presumed single-threaded, and that there is no aliasing involved (either because the function is unary, or because the pointer is declared `restrict`). In that case, I'd say that it's a pretty safe assumption that the data won't change. – C. K. Young Jan 12 '10 at 14:42
  • The point I'm trying to make is that a good compiler actually looks at the code to see what it does, not simply what types variables have. – C. K. Young Jan 12 '10 at 14:46
  • 3
    @Chris: this function being unary doesn't necessarily help with aliasing, and neither does only passing `word` to functions taking `char const *`. If `word` is non-restrict, then there might be a global `char*` somewhere, that the callee modifies, and that happens to be an alias of `word`. If the callee is in a different translation unit, and isn't somehow marked pure, the compiler almost certainly can't rule this out, although whole-program optimisation might get another shot at it. Basically, if the loop body isn't wholly inlined, then don't hold your breath. – Steve Jessop Jan 12 '10 at 14:51
  • 1
    @Steve: +1 for excellent points. I mentioned unary because I thought that `restrict` arguments are only non-aliased with respect to other arguments, not that no aliases ever exist for said argument---in which case, like you say, you're still stuffed without whole-program analysis if a global pointer is held somewhere. – C. K. Young Jan 12 '10 at 14:59
  • 2
    To be honest, I get a headache every time I try to read the definition of restrict in the C99 standard, and have to watch cartoons for a while to cool my brain down ;-). But yes, I'm pretty sure a restrict pointer must not be modified during its scope "via" any other reference, whether that reference is another parameter or not. So I think restrict allows optimisations even in functions which (for instance) call back to user-defined code through a function pointer, when no amount of static analysis could help. – Steve Jessop Jan 12 '10 at 17:13
  • -1 Unless `word` is declared constant, it is allowed to change inside the loop and thus the `strlen(word)` will have to be executed each time. I prefer to pull this outside of the loop if `word` is constant: `const size_t word_length = strlen(word);` – Thomas Matthews Jan 12 '10 at 18:28
  • 1
    @Thomas Matthews: It *will have* to be executed each time? Please cite your source in the standard that requires this behavior and precludes compilers from performing analysis to optimize out what they determine to be constant expressions. Also, declaring `word` as `const` doesn't actually mean it can't change inside the loop. – jamesdlin Jan 12 '10 at 19:05
  • 1
    @Thomas: As I said in an earlier comment: "a good compiler actually looks at the code to see what it does, not simply what types variables have". As jamesdlin says, compilers have the discretion to do that, and if you see the link in my post, some compilers do exercise it. – C. K. Young Jan 12 '10 at 19:18
  • In the glibc `string.h`, the `strlen` function is marked with `__attribute_pure__`, which tells `gcc` that it is a pure function. – caf Jan 13 '10 at 00:50
11

I'll sometimes code that as ...

for (int i = 0, n = strlen(word); i < n; ++i) { /* do stuff */ }

... so that strlen is only called once (to improve performance).

ChrisW
9

It will be evaluated for every iteration of the loop (edit: if necessary).

Like Tatu said, if word isn't going to change in length, you could do the strlen call before the for loop. But as Chris said, the compiler may be good enough to realize that word can't change, and eliminate the duplicate calls itself.

But if word can change in length during the loop, then of course you'll need to keep the strlen call in the loop condition.
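
To see what "evaluated for every iteration" means in practice, here is a minimal sketch that counts the calls. The names `strlen_calls`, `counted_strlen`, `calls_with_length_in_condition` and `calls_with_length_hoisted` are hypothetical helpers made up for this illustration, not part of any library. Because the wrapper has a visible side effect (it bumps a counter), the compiler has to preserve every call, so the count reflects what the language semantics ask for before any optimisation of plain `strlen`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counter, incremented on every call to counted_strlen(). */
static unsigned long strlen_calls = 0;

/* Hypothetical wrapper around strlen() that records each call. */
static size_t counted_strlen(const char *s)
{
    ++strlen_calls;
    return strlen(s);
}

/* Length call sits in the loop condition; returns how many calls were made. */
unsigned long calls_with_length_in_condition(const char *word)
{
    size_t i;
    strlen_calls = 0;
    for (i = 0; i < counted_strlen(word); i++) {
        /* do stuff */
    }
    return strlen_calls;
}

/* Length call is hoisted into the initialiser; returns how many calls were made. */
unsigned long calls_with_length_hoisted(const char *word)
{
    size_t i, n;
    strlen_calls = 0;
    for (i = 0, n = counted_strlen(word); i < n; i++) {
        /* do stuff */
    }
    return strlen_calls;
}
```

For a five-character string such as "hello", the first function reports 6 calls (one per condition check, including the final failing one) and the second reports 1.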

Kaleb Brasee
  • 1
    actually, most compilers should optimize this as long as `word` is not changed within the loop body or declared as `volatile`; as always, you can check the (dis-)assembly to see what happens... – Christoph Jan 12 '10 at 14:15
  • Bah, this is far from the full answer. If the compiler is halfway decent, it really should optimise out the call so it's only evaluated once. – Noldorin Jan 12 '10 at 14:16
  • That's true, a compiler may be smart enough to optimize and avoid a call each time. – Kaleb Brasee Jan 12 '10 at 14:17
  • 2
    @Noldorin: it requires more than a "halfway decent compiler", because of aliasing. If the loop modifies *any* memory in the function via a pointer, or calls *any* code which does so or might do so, then the compiler probably cannot assume that the pointer in question isn't pointing to the same memory as `word`. If `word` is marked with keyword `restrict` in C99, then it can make that assumption, and in some rare cases it might be able to prove that there's no alias. But without seeing the body of the loop, you can't say it's an easy optimisation. – Steve Jessop Jan 12 '10 at 14:35
  • @Noldorin if the compiler is halfway decent then it should NOT cache the result of a call to `strlen()`. It is trivial for a human to do this if necessary; making the compiler super smart only encourages super stupid programmers. There's only pain to be gained by such optimisations. – PP. Jan 12 '10 at 14:36
  • 1
    @PP: Wrong - a compiler should always (and usually does) optimise whenever possible. @Steve: I suppose that's true. I'm pretty sure C#, with which I'm more familiar, makes the optimisation, but C would indeed differ in some ways. – Noldorin Jan 12 '10 at 14:41
  • @Noldorin: C# has immutable strings, which helps with lifting things like this out. – ephemient Jan 12 '10 at 15:11
  • 1
    @Steve: That's what I thought -- there's a lot of things to consider in order for the compiler to eliminate the per-iteration evaluation without screwing something up. – Kaleb Brasee Jan 12 '10 at 15:18
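
The aliasing point in these comments can be sketched as follows. This is only an illustration: `copy_upper` is a hypothetical function, and whether a given compiler actually hoists the `strlen` call here is not guaranteed. The loop writes through `dst`, so without `restrict` the compiler would have to allow for `dst` overlapping `word`. With both parameters `restrict`-qualified (C99), the caller promises they do not overlap, which permits, but does not require, computing the length once.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example: copies word into dst in upper case.
   dst must have room for strlen(word) + 1 bytes and must not overlap word. */
size_t copy_upper(char *restrict dst, const char *restrict word)
{
    size_t i;
    for (i = 0; i < strlen(word); i++) {
        /* A write through dst; restrict says it cannot change word's data. */
        dst[i] = (char)toupper((unsigned char)word[i]);
    }
    dst[i] = '\0';
    return i; /* number of characters copied */
}
```

As suggested in the comments, checking the generated assembly is the only way to know what a particular compiler did with the call.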
2

The number of times strlen(word) is executed depends on whether:

  1. word is declared as constant (the data is constant), or
  2. the compiler can detect that word is not changed.

Take the following example:

char word[256] = "Grow";

for (i = 0; i < strlen(word); ++i)
{
  strcat(word, "*");
}

In this example, the variable word is modified within the loop:
0) "Grow" -- length == 4
1) "Grow*" -- length == 5
2) "Grow**" -- length == 6
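
Note that in the loop above the length grows as fast as i does, so the condition never becomes false and strcat eventually runs past the 256-byte buffer. A bounded variant, as a sketch only (`grow_until_full` and its `capacity` parameter are hypothetical names introduced here), keeps the same per-iteration strlen(word) in the condition but stops once the buffer is full:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends '*' once per iteration while strlen(word)
   is re-evaluated in the loop condition. Stops when there is no room left
   for another character plus the terminator. Returns the iteration count. */
size_t grow_until_full(char *word, size_t capacity)
{
    size_t i;
    for (i = 0; i < strlen(word); ++i) {
        if (strlen(word) + 1 >= capacity)
            break; /* buffer full: stop before strcat would overflow */
        strcat(word, "*");
    }
    return i;
}
```

With an 8-byte buffer holding "Grow", this appends three asterisks and then stops, leaving "Grow***".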

However, the compiler can factor out the strlen call, so it is called once, if the variable word is declared as constant:

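#include <stdio.h>
#include <string.h>

void my_function(const char * word)
{
  size_t i; /* declared here; the original snippet left i undeclared */
  for (i = 0; i < strlen(word); ++i)
  {
     printf("%zu) %s\n", i, word);
  }
  return;
}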

The function has declared that the variable word is constant data (actually, a pointer to constant data). Thus the length won't change, so the compiler can call strlen only once.

When in doubt, you can always perform the optimization yourself, which may also make the code more readable in this case.

Thomas Matthews
  • 4
    A `const`-qualified pointer is only a promise that the pointee won't be changed via that specific variable (without casting), not that the data is itself immutable. – jamesdlin Jan 12 '10 at 19:08
  • indeed, what jamesdlin says. Usually the case is that if there is any function call there cannot be much of a proof that the data pointed to by a `char *` isn't changed! Or if any memory is modified, because the char pointer could point to that... – Antti Haapala -- Слава Україні Aug 06 '19 at 06:38
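
The point made in these comments can be shown with a small sketch. The function name `iterations_while_truncating` and its `alias` parameter are hypothetical, chosen only for this illustration. The `const` on `word` stops writes through `word` itself, but a second, non-const pointer to the same buffer can still shorten the string, so `strlen(word)` has to be re-evaluated for the loop to behave correctly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example: word is a pointer to const, yet the string it points
   to may still change through alias if both refer to the same buffer.
   Each iteration removes the last character of alias (if any).
   Returns the number of loop iterations executed. */
size_t iterations_while_truncating(const char *word, char *alias)
{
    size_t i, iterations = 0;
    for (i = 0; i < strlen(word); i++) {
        size_t len = strlen(alias);
        if (len > 0)
            alias[len - 1] = '\0'; /* also shortens word when alias == word */
        iterations++;
    }
    return iterations;
}
```

Called with the same six-character buffer for both arguments, the loop runs 3 times rather than 6, because the length shrinks while i grows. Called with two separate buffers, it runs once per character of `word`.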
0

strlen returns the length of the provided string. If the length is 10, the loop keeps iterating as long as i is below 10, which in that case means 10 iterations.

Read more about loops

Filip Ekberg
  • -1 because the string may be modified within the loop and thus `strlen` could be called an infinite number of times (assuming the compiler isn't caching the result of `strlen()`). – PP. Jan 12 '10 at 14:38
  • Additionally there were no guarantees in the example that `i` wasn't being modified either. – PP. Jan 12 '10 at 14:40
  • That's a stupid reason to down-vote, assuming that he does not tamper with the string. It was more a very basic explanation of how it would seem at first look, and then a reference to how loops work. – Filip Ekberg Jan 12 '10 at 18:40
  • Hence the reference to loop-instructions. – Filip Ekberg Jan 13 '10 at 07:28
0

It will be called for each iteration. The following code only calls strlen function once.

for (i = 0, j = strlen(word); i < j; i++)
{ /* do stuff */ }
Camsoft