113

I'm not sure whether the following code causes redundant calculations, or whether that is compiler-specific.

for (int i = 0; i < strlen(ss); ++i)
{
    // blabla
}

Will strlen() be calculated every time i increases?

Bernhard Barker
daisy
  • 14
    I'm going to guess that without a sophisticated optimization that can detect that 'ss' never changes in the loop, then yes. Best to compile, and look at the assembly to see. – MerickOWA Jul 06 '12 at 15:21
  • 6
    It depends on the compiler, on the optimisation level and on what you (might) do to `ss` inside the loop. – Hristo Iliev Jul 06 '12 at 15:21
  • 5
    If the compiler can prove that `ss` is never modified, it can hoist the computation out of the loop. – Daniel Fischer Jul 06 '12 at 15:22
  • @DanielFischer: True; but that would require compile-time analysis of exactly what `strlen` does, and would only work if it could prove that the pointer couldn't be aliased. In practice, I'd be rather surprised to see that optimisation. – Mike Seymour Jul 06 '12 at 15:30
  • 10
    @Mike: "require compile-time analysis of exactly what strlen does" - strlen is probably an intrinsic, in which case the optimizer knows what it does. – Steve Jessop Jul 06 '12 at 15:33
  • @SteveJessop: Maybe, maybe not. Personally, if speed were an issue, I'd see what the optimiser actually did rather than speculating about what it might do. – Mike Seymour Jul 06 '12 at 15:34
  • 1
    @SteveJessop, Mike: This depends on the implementation. In gcc, the `strlen` is tagged as `__attribute_pure__` identifying that whatever the implementation is, the side effects of the function are only the returned value and that value depends only on the arguments (and possibly globals). By analyzing the loop the compiler can then infer that the function does not need to be called multiple times. – David Rodríguez - dribeas Jul 06 '12 at 15:39
  • @DavidRodríguez-dribeas: shouldn't "and possibly globals" be "and possibly const globals"? Obviously if the return value of `strlen` can depend on mutable globals then the optimizer is stuck. – Steve Jessop Jul 06 '12 at 15:41
  • 3
    @MikeSeymour: There is no maybe, maybe not. strlen is defined by the C language standard, and its name is reserved for the use defined by the language, so a program is not free to supply a different definition. The compiler and optimizer are entitled to assume strlen depends solely on its input and does not modify it or any global state. The challenge to optimization here is determining that the memory pointed to by ss is not altered by any code inside the loop. That is entirely feasible with current compilers, depending on the specific code. – Eric Postpischil Jul 06 '12 at 15:41
  • @SteveJessop: That was interpreted from the GCC documentation. I have rechecked and no, it does not say *constant globals*, but just *globals*. [GCC Attributes](http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html). There is a stricter attribute `const` that is `pure` with no dependency on globals, but the `strlen` is just tagged as `pure` --no idea why, as it seems that `const` would be applicable here (i.e. I don't see any need for `strlen` to check globals anywhere!) – David Rodríguez - dribeas Jul 06 '12 at 15:45
  • 1
    @EricPostpischil: My "maybe" referred to whether the optimiser knows what it does, which is certainly not guaranteed. But this argument is rather pointless: if speed is important, then measure and observe what the optimiser *actually* does; if it's not, then don't worry about it. – Mike Seymour Jul 06 '12 at 15:49
  • 1
    @dribeas: The reason `strlen` is not `const` is because the contents of the string is considered a global. If it were `const`, then it would only be able to examine the string pointer, and not the memory it points to. The `const` attribute is for functions like `sqrt`. – Dietrich Epp Jul 06 '12 at 17:39
  • I think it's unlikely any optimising logic is going to know exactly what *strlen(ss)* actually does, so it's probably irrelevant whether the optimiser can see that "ss" can't be changed by code within the loop. For all the optimiser knows, strlen(ss) might return either 0 or 99, depending on whether "ss" happens to be a valid string representation of the current system time. – FumbleFingers Jul 06 '12 at 22:09
  • 1
    ...for what it's worth, back in the days when processors were puny, I always used to code such loops as **for (int i=strlen(ss); i--;)** – FumbleFingers Jul 06 '12 at 22:13
  • @FumbleFingers: That's demonstrably false. For example, if you look at the assembly output for code which calls `strlen("Hello")`, you'll often see the number 5 hard coded, with no call to `strlen`. This goes beyond simple inlining. – Dietrich Epp Jul 14 '12 at 20:03
  • @Dietrich Epp: Since "strlen" is not a reserved word, it's not obvious to me that it would be illegal to replace the standard library function with your own. I actually did have my own "sprintf" function that accepted a null pointer for the o/p buffer. It wrote nothing - just returned the number of bytes it *would* have written if a non-null buffer had been passed. That was back in the days when memory was so limited I needed to know exactly how many bytes to allocate for the buffer. – FumbleFingers Jul 15 '12 at 20:27
  • 1
    @FumbleFingers: You can define your own `strlen`. However, what I said is still true: there do exist compilers (such as GCC and Clang) which optimize out calls to `strlen` based on knowledge of exactly what it does. If your function does something different, you have to pass extra flags to the compiler. See http://gcc.gnu.org/onlinedocs/gcc-4.7.1/gcc/Other-Builtins.html – Dietrich Epp Jul 16 '12 at 04:10
  • strlen() returns a size_t, and the value may be larger than INT_MAX; in that case the for loop can become an endless loop. Maybe change i to a size_t? – 12431234123412341234123 Oct 09 '16 at 13:32

18 Answers

144

Yes, strlen() will be evaluated on each iteration. It's possible that, under ideal circumstances, the optimiser might be able to deduce that the value won't change, but I personally wouldn't rely on that.

I'd do something like

for (int i = 0, n = strlen(ss); i < n; ++i)

or possibly

for (int i = 0; ss[i]; ++i)

as long as the string isn't going to change length during the iteration. If it might, then you'll need to either call strlen() each time, or handle it through more complicated logic.
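
A minimal sketch of what the language semantics require, before any optimisation. It assumes a hypothetical wrapper, `counted_strlen`, which is not a library function; because the wrapper has a visible side effect (it bumps a counter), the compiler cannot merge its calls, so the counts reflect how often the loop condition asks for the length. It says nothing about what an optimiser does with the real strlen.

```c
#include <stddef.h>
#include <string.h>

static size_t calls;  /* number of times counted_strlen has run */

/* Hypothetical wrapper around strlen that records each call. */
static size_t counted_strlen(const char *s)
{
    ++calls;
    return strlen(s);
}

/* Length requested in the loop condition, as in the question. */
size_t calls_when_in_condition(const char *ss)
{
    calls = 0;
    for (size_t i = 0; i < counted_strlen(ss); ++i) {
        /* loop body elided */
    }
    return calls;
}

/* Length requested once, in the init clause of the for statement. */
size_t calls_when_hoisted(const char *ss)
{
    calls = 0;
    for (size_t i = 0, n = counted_strlen(ss); i < n; ++i) {
        /* loop body elided */
    }
    return calls;
}
```

For the five-character string "hello", the first function reports 6 calls (five passing tests plus the final failing one) and the second reports 1.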

Mike Seymour
  • I like the first, the latter looks a bit dangerous to me, in case the 'string' gets shortened by more than one byte during iteration. – alk Jul 06 '12 at 15:34
  • 15
    If you know you're not manipulating the string, the second is far more preferable since that's essentially the loop that will be performed by `strlen` anyway. – mlibby Jul 06 '12 at 15:35
  • 26
    @alk: If the string might get shortened, then both of these are wrong. – Mike Seymour Jul 06 '12 at 15:36
  • Ooha yes @MikeSeymour .. - my mind shuffled the commas and semicolons around as I wished them to be set. – alk Jul 06 '12 at 15:38
  • 4
    @alk: if you're changing the string, a for loop is probably not the best way to iterate over each character. I'd think a while loop is more direct and easier to manage the index counter. – mlibby Jul 06 '12 at 15:40
  • where does n come from?? (ok, got it...) – Gui13 Jul 06 '12 at 15:42
  • 3
    *ideal circumstances* include compiling with GCC under linux, where `strlen` is marked as `__attribute__((pure))` allowing the compiler to elide multiple calls. [GCC Attributes](http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html) – David Rodríguez - dribeas Jul 06 '12 at 15:42
  • @DavidRodríguez-dribeas: OK, maybe I was a bit pessimistic in saying "very unlikely". I still wouldn't rely on it myself. – Mike Seymour Jul 06 '12 at 15:45
  • Is there any reason why the second example here would perform more slowly than the first? I should think any call to strlen is just iterating over the length of the string for no reason when we can perform the length check more directly in the for loop. – mlibby Jul 06 '12 at 15:47
  • @MikeSeymour: I completely agree, where the cost of doing it manually is minimal, there is no need to depend on optimizations. Also VS does not have such marking, so it would be much harder to get that optimization there. – David Rodríguez - dribeas Jul 06 '12 at 15:48
  • @mcl: I'd expect both to have more or less similar performance, probably. The first one might be faster due to optimisations in `strlen` (e.g. reading several bytes at once); the second might play more nicely with the cache since it isn't reading the whole string right away. You'd have to measure if it were important. – Mike Seymour Jul 06 '12 at 15:52
  • 6
    The second version is the ideal and most idiomatic form. It allows you to pass over the string only once rather than twice, which will have much better performance (especially cache coherency) for long strings. – R.. GitHub STOP HELPING ICE Jul 06 '12 at 17:26
  • @DavidRodríguez-dribeas, if you assign to `ss` or pass it to a function it won't optimize the call, even if you don't change the length. Some kinds of aliasing may break it as well. – edA-qa mort-ora-y Jul 06 '12 at 18:43
  • 1
    Note that any competent compiler treats `strlen` as an intrinsic nowadays and will optimize away multiple calls. Regardless, the latter form will perform better for reasons mentioned by @R (except for the word "coherency"). – avakar Jul 06 '12 at 21:18
  • 1
    @David Wrong. `strlen` is simply not a pure function, and it is **not** marked with `__attribute__((pure))` in glibc (look it up). However, it’s marked as a compiler intrinsic and will get special treatment. This special treatment needs to be emphasised here. All the answers just saying “yes” are *wrong*, and even this answer misstates the probability of `strlen` being hoisted out of the loop. – Konrad Rudolph Jul 07 '12 at 14:00
  • @KonradRudolph: What are you talking about? `strlen()` **is** marked `__attribute__((pure))` in string.h in the latest Ubuntu libc6-dev (quantal). And http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html says "Some of common examples of pure functions are strlen or memcmp". They shouldn't be (unless gcc means something else by "pure"), but they are. – j_random_hacker Sep 22 '12 at 11:43
  • @j_random_hacker Ah. I had looked at the implementation instead of the declaration. It’s indeed marked as pure in the former. Now I am still waiting for an explanation why it’s pure. It shouldn’t be, as far as I understand. The `pure` is a lie. – Konrad Rudolph Sep 24 '12 at 10:52
  • Why not `for( const char *p = ss ; p ; ++p )`? – bobobobo Aug 29 '13 at 12:02
  • 1
    @bobobobo: If you don't need to know the index in the loop, then why not indeed? (Although the condition should be `*p`, not `p`). – Mike Seymour Aug 29 '13 at 12:13
  • from a readability point of view I think the second solution - although it may work - is not a good thing to do. I prefer to be correct and readable first and to be fast, second. So I'd go with the first solution wherever it's possible and take the second one only where it's very crucial to be super fast in this particular piece of code. Also the comments would have to reflect my decision to write code that's hard to read in favor of a speed gain... – xmoex Jan 27 '15 at 10:53
14

Yes. Every time the loop condition is tested, the length of the string is calculated again. So write it like this:

char str[30] = "some text"; // must hold a NUL-terminated string before the loop runs
for ( int i = 0; str[i] != '\0'; i++)
{
//Something;
}

In the code above, str[i] only examines the single character at index i each time the loop starts a cycle, so it takes less memory and is more efficient.

See this Link for more information.

In the code below, every time the loop runs, strlen counts the length of the whole string, which is less efficient, takes more time and takes more memory.

char str[] = "some text"; // an initializer is required for an array declared without a size
for ( int i = 0; i < strlen(str); i++)
{
//Something;
}
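
The time cost of the two loops above can be sketched by counting how many characters each one examines. This assumes a hypothetical stand-in, `counting_strlen`, written as a plain byte-by-byte scan; real library implementations are usually faster per call, and an optimiser may remove repeated calls to the real strlen altogether.

```c
#include <stddef.h>

static size_t reads;  /* characters examined so far */

/* Hypothetical strlen stand-in that counts every character it examines,
   including the terminating NUL. */
static size_t counting_strlen(const char *s)
{
    size_t len = 0;
    for (;;) {
        ++reads;
        if (s[len] == '\0')
            break;
        ++len;
    }
    return len;
}

/* Loop of the form: for (i = 0; i < strlen(str); i++) */
size_t reads_with_strlen_in_condition(const char *s)
{
    reads = 0;
    for (size_t i = 0; i < counting_strlen(s); ++i) {
        /* loop body elided */
    }
    return reads;
}

/* Loop of the form: for (i = 0; str[i] != '\0'; i++) */
size_t reads_with_sentinel_test(const char *s)
{
    reads = 0;
    for (size_t i = 0; ; ++i) {
        ++reads;
        if (s[i] == '\0')
            break;
    }
    return reads;
}
```

For a string of length n, the first loop examines (n + 1) * (n + 1) characters and the second examines n + 1; for "hello" that is 36 versus 6.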
codeDEXTER
  • 3
    I can agree with "[it] is more efficient", but use less memory? The only memory usage difference I can think of would be in the call stack during the `strlen` call, and if you're running that tight, you probably should be thinking about eliding a few other function calls as well... – user Jul 06 '12 at 20:17
  • @MichaelKjörling Well if you use "strlen" , then in a loop it has to scan the whole string every time the loop runs, whereas in the above code the "str[ix]", it only scans one element during each cycle of the loop whose location is represented by "ix". Thus it takes less memory than "strlen". – codeDEXTER Jul 06 '12 at 20:42
  • 1
    I'm not sure that makes a lot of sense, actually. A very naïve implementation of strlen would be something like `int strlen(char *s) { int len = 0; while(s[len] != '\0') len++; return len; }` which is pretty much exactly what you are doing in the code in your answer. I'm not arguing that iterating over the string once rather than twice is more *time*-efficient, but I don't see one or the other using more or less memory. Or are you referring to the variable used to hold the string length? – user Jul 06 '12 at 20:49
  • @MichaelKjörling Please see the above edited code and the link. And as for the memory- every time the loop runs each and every value that iterates is stored in memory and in case of 'strlen' as it counts the whole string again and again it requires more memory to store. and also because unlike Java, C++ has no "Garbage Collector". Then i can be wrong also. see [link](http://stackoverflow.com/questions/147130/why-doesnt-c-have-a-garbage-collector) regarding absence of "Garbage Collector" in C++. – codeDEXTER Jul 06 '12 at 21:55
  • 1
    @aashis2s The lack of a garbage collector only plays a role when creating objects on the heap. Objects on the stack get destroyed as soon as the scope ends. – Ikke Jul 10 '12 at 19:32
9

A good compiler may not calculate it every time, but I don't think you can be sure that every compiler does so.

In addition, the compiler has to know that strlen(ss) does not change. This is only true if ss is not changed in the for loop.

For example, if you call a read-only function on ss in the for loop but don't declare its ss parameter as const, the compiler cannot know that ss is not changed in the loop and has to calculate strlen(ss) in every iteration.
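
A small sketch of why the loop body matters, assuming a hypothetical function `iterations_with_alias` that reads the string through one pointer and shortens a string through another. When both pointers refer to the same buffer, re-evaluating strlen in the condition and evaluating it once give different iteration counts, so a compiler may only hoist the call if it can rule this case out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: loops over `view` while the body removes the last
   character of `alias`. The const on `view` does not stop `alias` from
   pointing at the same memory. */
size_t iterations_with_alias(const char *view, char *alias)
{
    size_t iterations = 0;
    for (size_t i = 0; i < strlen(view); ++i) {
        size_t len = strlen(alias);
        if (len > 0)
            alias[len - 1] = '\0';  /* may shorten the string `view` points to */
        ++iterations;
    }
    return iterations;
}
```

With a ten-character buffer passed as both arguments, the loop runs 5 times, because the string shrinks by one character per iteration. With two unrelated buffers, the loop runs once per character of the first string.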

Azeem
Misch
  • 3
    +1: Not only must `ss` not be changed in the `for` loop; it must not be accessible from and changed by any function called in the loop (either because it is passed as an argument, or because it a global variable or a file-scope variable). Const-qualification may also be a factor, too. – Jonathan Leffler Jul 06 '12 at 15:25
  • 5
    I think it highly unlikely that the compiler could know that 'ss' doesn't change. There could be stray pointers that point to memory inside 'ss' which the compiler has no idea of that could change 'ss' – MerickOWA Jul 06 '12 at 15:25
  • Jonathan is right, a local const string might be the only way for the compiler to be assured there's no way for 'ss' to change. – MerickOWA Jul 06 '12 at 15:27
  • 2
    @MerickOWA: indeed, that's one of the things that `restrict` is for in C99. – Steve Jessop Jul 06 '12 at 15:27
  • @MerickOWA: The compiler can figure that out in some circumstances. e.g. the compiler may assume a freshly allocated block of memory will have no other pointers aliasing it. Also, if the pointer was passed into the function with the `restrict` qualifier. –  Jul 06 '12 at 15:27
  • 4
    Regarding your last para: if you call a read-only function on `ss` in the for-loop, then even if its parameter is declared `const char*`, the compiler *still* needs to recalculate the length unless either (a) it knows that `ss` points to a const object, as opposed to just being a pointer-to-const, or (b) it can inline the function or otherwise see that it is read-only. Taking a `const char*` parameter is *not* a promise not to modify the data pointed to, because it is valid to cast to `char*` and modify provided that the object modified isn't const and isn't a string literal. – Steve Jessop Jul 06 '12 at 15:31
4

If ss is of type const char * and you're not casting away the constness within the loop the compiler might only call strlen once, if optimizations are turned on. But this is certainly not behavior that can be counted upon.

You should save the strlen result in a variable and use this variable in the loop. If you don't want to create an additional variable, depending on what you're doing, you may be able to get away with reversing the loop to iterate backwards.

for( auto i = strlen(s); i > 0; --i ) {
  // do whatever
  // remember that s[strlen(s)] is the terminating NUL character, so use s[i - 1] here
}
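
A hedged sketch of the backwards loop in plain C, using a hypothetical helper `reverse_into`. It keeps the index unsigned (size_t) and tests it before decrementing, which avoids both the i - 1 indexing and the underflow problem with zero-length strings discussed in the comments.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the characters of s into out in reverse order.
   out must have room for strlen(s) + 1 bytes. */
void reverse_into(const char *s, char *out)
{
    size_t k = 0;
    /* i-- > 0 compares first and decrements afterwards, so inside the body
       i runs from strlen(s) - 1 down to 0, and an empty string skips the
       body entirely. */
    for (size_t i = strlen(s); i-- > 0; )
        out[k++] = s[i];
    out[k] = '\0';
}
```

strlen is called exactly once here, in the init clause, no matter how long the string is.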
Praetorian
  • 1
    It's a mistake to call `strlen` at all. Just loop until you hit the end. – R.. GitHub STOP HELPING ICE Jul 06 '12 at 17:28
  • `i > 0`? Should that not be `i >= 0` here? Personally, I would also start at `strlen(s) - 1` if iterating over the string backwards, then the terminating `\0` needs no special consideration. – user Jul 06 '12 at 20:19
  • 2
    @MichaelKjörling `i >= 0` works only if you initialize to `strlen(s) - 1`, but then if you have a string on zero length the initial value underflows – Praetorian Jul 06 '12 at 20:27
  • @Prætorian, good point on the zero length string. I didn't consider that case when I wrote my comment. Does C++ evaluate the `i > 0` expression on initial loop entry? If it doesn't, then you're right, the zero length case will definitely break the loop. If it does, you "simply" get a signed `i` == -1 < 0 so no loop entry if the conditional is `i >= 0`. – user Jul 06 '12 at 20:31
  • @MichaelKjörling Yes, the exit condition is evaluated prior to executing the loop for the first time. [`strlen`](http://linux.die.net/man/3/strlen)'s return type is unsigned, so `(strlen(s)-1) >= 0` evaluates to true for zero length strings. – Praetorian Jul 06 '12 at 20:34
  • @Prætorian, good point. Is it any excuse that I rarely see a need in the code I write to iterate character by character over a string? :) – user Jul 06 '12 at 20:37
3

Formally yes, strlen() is expected to be called for every iteration.

Anyway, I do not want to rule out the existence of some clever compiler optimisation that elides every call to strlen() after the first one.

alk
3

The predicate code in its entirety will be executed on every iteration of the for loop. In order to memoize the result of the strlen(ss) call, the compiler would need to know at least that

  1. The function strlen is side-effect free
  2. The memory pointed to by ss doesn't change for the duration of the loop

The compiler doesn't know either of these things and hence can't safely memoize the result of the first call.

JaredPar
  • Well it *could* know those things with static analysis, but I think your point is that such analysis is currently not implemented in any C++ compilers, yes? – GManNickG Jul 06 '12 at 15:26
  • @GManNickG it could definitely prove #1 but #2 is harder. For a single thread yes it could definitely prove it but not for a multi-threaded environment. – JaredPar Jul 06 '12 at 15:28
  • 1
    Maybe I'm being stubborn but I think number two is possible in multithreaded environments too, but definitely not without a wildly strong inference system. Just musing here though; definitely beyond the scope of any current C++ compiler. – GManNickG Jul 06 '12 at 15:30
  • @GManNickG i don't think it's possible though in C / C++. I could very easily stash the address of `ss` into a `size_t` or divide it up amongst several `byte` values. My devious thread could then just write bytes into that address and the compiler would have no way of understanding that it related to `ss`. – JaredPar Jul 06 '12 at 15:32
  • The compiler needs a stronger condition than (1): `strlen` not only has to be side-effect free, it also has to be pure (returns the same value every time for the same input). But the points stands -- in general the compiler doesn't know either of those things, in specific cases it might know either or both. For example if it sees: `const char ss[] = "hi";`, then although in principle some devious other thread might alter the length of the string contained in `ss`, that would be UB and so the optimizer can assume it doesn't happen. – Steve Jessop Jul 06 '12 at 15:35
  • @JaredPar: If we can understand the relationship, so can the compiler. :) It would see that you've done that, after all. – GManNickG Jul 06 '12 at 15:47
  • @GManNickG what exactly would it see though? I could have an `int` I failed to initialize that is pointing to random memory that just so happens to be the address of `ss`. The compiler can't infer the relationship between the write to that random memory and `ss`. It could only do so if it marked all writes as potentially modifying `ss` which would render the optimization moot. – JaredPar Jul 06 '12 at 15:58
  • No optimizer is going to determine whether the data *actually is* modified, all it can do is identify certain classes of cases for which it certainly isn't. The next step after detecting that `ss` points to (or is) an array of const objects, is escape analysis. If `ss` is either an automatic variable or is declared `restrict` in C, and if no reference is taken to it that leaves code the optimizer can see, then it "hasn't escaped", and so could potentially be proven unmodified even if non-const. This works in multi-threaded environments as well as single-. – Steve Jessop Jul 06 '12 at 15:59
  • @JaredPar: but reading that uninitialized `int` is UB (in C++ anyway: C might require a more sophisticated argument), so the optimizer is not required to account for the possibility. – Steve Jessop Jul 06 '12 at 16:00
  • @JaredPar not sure the uninitialized argument works. It's undefined behavior, which means that with or without the optimization, there's no guarantee the code does anything correct. – MerickOWA Jul 06 '12 at 16:07
  • 1
    @JaredPar: Sorry to bang on, you could claim that `int a = 0; do_something(); printf("%d",a);` cannot be optimized, on the basis that `do_something()` could do your uninitialized int thing, or could crawl back up the stack and modify `a` deliberately. In point of fact, gcc 4.5 does optimize it to `do_something(); printf("%d",0);` with -O3 – Steve Jessop Jul 06 '12 at 16:09
  • @SteveJessop to be clear I definitely understand that a compiler doesn't go the route GManNickG and I were discussing. I was trying to have a hypothetical argument about determining what would or wouldn't be 100% safe in a multi-threaded scenario. I do realize that compilers approach the problem from the other direction (escape analysis + const detection) as it's part of the work I do on a daily basis. – JaredPar Jul 06 '12 at 16:31
  • @JaredPar: in that case I have nothing to prove to you, I merely don't understand why you say that escape analysis and const detection aren't "100% safe in a multi-threaded scenario" :-) – Steve Jessop Jul 06 '12 at 16:33
  • @SteveJessop as for the last example you posted (the elimination of local `a`). While compilers do this optimization it's not 100% safe. It's a bit of a pathological case but `do_something` could modify the local `a` via stack walking (and disturbingly I've encountered situations where developers intentionally do this and even more disturbing is seeing it in managed code). Yes that behavior of the developer is completely unsafe, insane and yet they still do it. So while a sane optimization it's not a 100% safe one (which is the point I was trying to get at) – JaredPar Jul 06 '12 at 16:34
  • @SteveJessop i wouldn't argue that escape analysis isn't 100% safe (it is). I was arguing the more general case of determining that an arbitrary variable which wasn't const was essentially immutable (it can't be considered so in C). – JaredPar Jul 06 '12 at 16:35
  • The sane optimization is 100% safe provided that the compiler doesn't document that stack walking (or "crawling", as I put it) is guaranteed to work under a particular level of optimization. If `do_something` crawls up the stack into code optimized with O3 (or whatever level introduces this optimization), then it's `do_something` which is <100% safe, not the compiler. That's why we all laughed like drains over that elided null pointer check in the linux kernel: some l33t kernel hacker tripped over his feet by allowing his UB code to be compiled with flags that broke it. – Steve Jessop Jul 06 '12 at 16:36
  • @SteveJessop sure if you constrain the bad cases to "not supported" then yeah you can make more aggressive "safe" operations. – JaredPar Jul 06 '12 at 16:39
  • @SteveJessop: Under the rules of C11, if nothing ever takes the address of `a`, a compiler would be entitled to expect that it won't magically become initialized. The ARM version of GCC on Godbolt fails to properly handle some cases where a local variable has its address taken, however. Using `memcpy` to copy the contents of an uninitialized `uint16_t` variable to another should result in the latter one holding some number in the range 0-65535, but gcc converts the memcpy into an assignment and fails to regard that it must store a value in the 0-65535 range. – supercat Jun 11 '17 at 17:45
2

Yes. strlen will be calculated every time i increases.

If you don't change ss within the loop, this doesn't affect the logic; otherwise it does.

It is safer to use the following code.

int length = strlen(ss);

for ( int i = 0; i < length ; ++ i )
{
 // blabla
}
Kalai Selvan Ravi
2

Yes, strlen(ss) will be calculated every time the code runs.

Azeem
Hisham Muneer
2

Yes, strlen(ss) will calculate the length on each iteration. If the loop body somehow lengthens ss while i is also increasing, there would be an infinite loop.

leemes
Kumar Shorav
2

Yes, the strlen() function is called every time the loop condition is evaluated.

To improve efficiency, remember to save such values in local variables. It takes a little extra effort, but it is very useful.

You can use code like the following:

const char *str = "ss";
int l = strlen(str);

for ( int i = 0; i < l ; i++ )
{
    // blablabla
}
Marc Mutz - mmutz
Rajan
2

This is not common nowadays, but 20 years ago on 16-bit platforms I'd have recommended this:

for ( char* p = str; *p; p++ ) { /* ... */ }

Even if your compiler isn't very smart at optimization, the code above can still produce good assembly code.
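
A minimal, self-contained use of the pointer-walking loop, assuming a hypothetical helper `count_char`. The pointer itself is the loop variable, so the loop needs neither an index nor a call to strlen, and it passes over the string only once.

```c
#include <stddef.h>

/* Hypothetical helper: counts how many times target occurs in str. */
size_t count_char(const char *str, char target)
{
    size_t count = 0;
    /* *p is the loop test: it becomes false at the terminating NUL. */
    for (const char *p = str; *p; ++p) {
        if (*p == target)
            ++count;
    }
    return count;
}
```

For example, count_char("banana", 'a') returns 3.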

Luciano
1

Yes. The loop test has no way of knowing that ss doesn't get changed inside the loop. If you know that it won't change, then I would write:

int stringLength = strlen (ss); 
for ( int i = 0; i < stringLength; ++ i ) 
{
  // blabla 
} 
DanS
1

Arrgh, it will, even under ideal circumstances, dammit!

As of today (January 2018), with gcc 7.3 and clang 5.0, suppose you compile:

#include <string.h>

void bar(char c);

void foo(const char* __restrict__ ss) 
{
    for (int i = 0; i < strlen(ss); ++i) 
    {
        bar(*ss);
    }
}    

So, we have:

  • ss is a pointer to const char.
  • ss is marked __restrict__
  • The loop body cannot in any way touch the memory pointed to by ss (well, unless it violates the __restrict__).

and still, both compilers execute strlen() every single iteration of that loop. Amazing.

This also means the allusions/wishful thinking of @Praetorian and @JaredPar don't pan out.

einpoklum
0

YES, in simple words. There is a small NO in the rare case where the compiler chooses, as an optimization step, to hoist the call because it finds that ss is not changed at all. To be safe, you should treat it as a YES. In some situations, such as multithreaded and event-driven programs, assuming NO can introduce bugs. Play it safe, since doing so does not add much complexity to the program.

Pervez Alam
0

Yes.

strlen() is calculated every time i increases and is not optimized away.

The code below shows why the compiler should not optimize strlen() away.

for ( int i = 0; i < strlen(ss); ++i )
{
   // Change ss string.
   ss[i] = 'a'; // Compiler should not optimize strlen().
}
Amir Saniyan
  • I think doing that particular modification never alters the length of ss, just its contents, so (a really, really clever) compiler could still optimize `strlen`. – Darren Cook Jul 12 '12 at 00:25
0

We can easily test it:

char nums[] = "0123456789";
size_t end;
int i;
for( i=0, end=strlen(nums); i<strlen(nums); i++ ) {
    putchar( nums[i] );
    nums[--end] = 0;
}

The loop condition is evaluated after each repetition, before the loop body runs again.

Also be careful about the type you use to hold string lengths. It should be size_t, an unsigned integer type defined in <stddef.h> (and also made available by <string.h> and <stdio.h>); its exact width is implementation-defined. Comparing it with int, or casting it to int, can cause serious vulnerabilities.

Rsh
0

Well, I noticed that someone said it is optimized by default by any "clever" modern compiler. Here are the results without optimization. I tried this minimal C code:

#include <stdio.h>
#include <string.h>

int main()
{
 char *s="aaaa";

 for (int i=0; i<strlen(s);i++)
  printf ("a");
 return 0;
}

My compiler: g++ (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
Command for generation of assembly code: g++ -S -masm=intel test.cpp

Assembly code produced:
    ...
.L3:
    mov     DWORD PTR [esp], 97
    call    putchar
    add     DWORD PTR [esp+40], 1
.L2:
    # the strlen computation starts here
    mov     ebx, DWORD PTR [esp+40]
    mov     eax, DWORD PTR [esp+44]
    mov     DWORD PTR [esp+28], -1
    mov     edx, eax
    mov     eax, 0
    mov     ecx, DWORD PTR [esp+28]
    mov     edi, edx
    repnz scasb
    # as you can see, it is done on every iteration
    mov     eax, ecx
    not     eax
    sub     eax, 1
    cmp     ebx, eax
    setb    al
    test    al, al
    jne     .L3
    mov     eax, 0
    ...
Tebe
  • I would be loath to trust any compiler that tried to optimize it unless the address of the string was `restrict`-qualified. While there are some cases where such optimization would be legitimate, the effort required to reliably identify such cases in the absence of `restrict` would, by any reasonable measure, almost certainly exceed the benefit. If the string's address had a `const restrict` qualifier, however, that would be sufficient in and of itself to justify the optimization without having to look at anything else. – supercat Jun 11 '17 at 17:50
0

Elaborating on Prætorian's answer I recommend the following:

for( auto i = strlen(s); i > 0; --i ) { foo(s[i-1]); }
  • auto because you don't want to care about which type strlen returns. A C++11 compiler (e.g. gcc -std=c++0x, not completely C++11 but auto types work) will do that for you.
  • i = strlen(s) because you want to compare to 0 (see below)
  • i > 0 because comparison to 0 is (slightly) faster than comparison to any other number.

The disadvantage is that you have to use i-1 in order to access the string characters.

steffen