6

In scripting languages like PHP having a for loop like this would be a very bad idea:

string s("ABCDEFG");
int i;
for( i = 0; i < s.length(); i ++ )
{
   cout << s[ i ];
}

This is an example, i'm not building a program like this. (For the guys that feel like they have to tell me why this piece of code <insert bad thing about it here>)

If this C++ example was translated to a similar PHP script the lenght of the string would be calculated every loop cycle. That would cause an enormous perfomance loss in realistic scripts.

I thought the same would apply to C++ programs but when I take a look at tutorials, several open-source libraries and other pieces of code I see that the limiter for the loop isn't precalculated.

  • Should I precalculate the lenght of the string s?
  • Why isn't the limiter always precalculated? (seen this in tutorials and examples)
  • Is there some sort of optimization done by the compiler?
  • Tutorials are generally written for readability rather than speed. It's not wrong to write clearer, slower code in an example. – Martin Beckett Mar 01 '10 at 18:21

16 Answers16

11

It's all relative.

PHP is interpreted, but if s.length drops into a compiled part of the PHP interpreter, it will not be slow. But even if it is slow, what about the time spent in s[i], and what about the time spent in cout <<?

It's really easy to focus on loop overhead while getting swamped with other stuff.

Like if you wrote this in C++, and cout were writing to the console, do you know what would dominate? cout would, far and away, because that innocent-looking << operator invokes a huge pile of library code and system routines.

Mike Dunlavey
  • 40,059
  • 14
  • 91
  • 135
4

You should learn to justify simpler code. Try to convince yourself that sooner or later you will replace string::length implementation to more optimized one. (Even though your project will most likely miss all deadlines, and optimizing string::length will be the least of your problems.) This kind of thinking will help you focus on things that really matter, even though it's not always easy...

AareP
  • 2,355
  • 3
  • 21
  • 28
3

It depends on how the string is implemented.

On null terminated strings you have to calculate the size on every iteration.

std::string is a container and the size should be returned in O(1) time,
it depends (again) on the implementation.

Nick Dandoulakis
  • 42,588
  • 16
  • 104
  • 136
2

The optimizer may indeed be able to optimize the call to length away if he's able to determine that its value won't change - nevertheless, you're on the safe side if you precalculate it (in many cases, however, optimization won't be possible because it's not clear to the compiler whether the condition variable could possible be changed during the loop).

In many cases, it just doesn't matter because the loop in question is not performance-relevant. Using the classic for(int i=0; i < somewhat(); ++i) is both less work to type and easier to read than for(int i=0,end=somewhat(); i < end; ++i.

Note that the C++ compiler will usually inline small functions, such as length (which would usually retrieve a precalculated length from the string object). Interpreted scripting languages usually need a dictionary lookup for a function call, so for C++ the relative overhad of the redundant check once per loop iteration is probably much smaller.

Alexander Gessler
  • 45,603
  • 7
  • 82
  • 122
2

You're correct, s.length() will normally be evaluated on every loop iteration. You're better off writing:

size_t len = s.length();
for (size_t i = 0; i < len; ++i) {
   ...
}

Instead of the above. That said, for a loop with only a few iterations, it doesn't really matter how often the call to length() will be made.

Timo Geusch
  • 24,095
  • 5
  • 52
  • 70
  • 3
    NOTE: only do it this way if you *know* the length CANNOT change. – David Oneill Mar 01 '10 at 17:51
  • 2
    Um, are you talking about `std::string`? Because I'm pretty sure `std::string` keeps the length as a property of the string rather than recalculating it every time. – Michael Myers Mar 01 '10 at 17:51
  • @mmyers: Yes, it's a "property", but it's still a function call. The compiler might be smart enough to inline the function call, it might not. The compiler can do either action and still conform to standard. – Billy ONeal Mar 01 '10 at 17:53
  • 1
    @Billy: but what it *may* do is really quite irrelevant when every common compiler does this optimization routinely (different rules of course apply for embedded systems; but then, they always do). We’re talking about micro-optimization, not standards compliance. – Konrad Rudolph Mar 01 '10 at 18:00
  • @Konrad It could depend on the optimization level you're using when compiling or the contents of your loop (even if the _programmer_ can determine that the string won't change doesn't mean that the compiler can - especially if the loop gets quite complicated). There's no guarantee that the optimization will take place. It's just very likely that it will. – Jonathan M Davis Mar 01 '10 at 18:02
  • 2
    "You're better off..." How, exactly?! You don't even need to waste brain-cycles thinking about this until you've measured the impact on your actual program. – Daniel Earwicker Mar 01 '10 at 18:04
  • @Jonathan: the point of contention here is whether the call will be *inlined*, not whether the length value will be cached in a register (or whatever) instead of being reloaded in every cycle. And it’s not “just very likely” that this optimization will take place – it **will** take place, end of story. – Konrad Rudolph Mar 01 '10 at 18:09
  • @Earwicker If you want the most likely scenario where the you won't incur the penalty of the extra function call, and you don't want to take the time to test whether one is faster than the other, you will benefit by just saving the size. It won't cost you anything, and it guarantees that you won't keep calling length() over and over. Sitting there thinking about it won't make you better off. Just always precalculating and not worrying about it can. – Jonathan M Davis Mar 01 '10 at 18:11
  • @Konrad Most likely. But if you're compiler isn't inlining (and I don't think that gcc does short of -O3 - certainly that's when it starts using the -inline flag), then it won't inline. If you're compiler is doing inline optimizations, then yes it _will_ happen as long as length() is declared inline - which presumably it is. But if inlining isn't turned on in your compiler, then you're going to incur the cost. – Jonathan M Davis Mar 01 '10 at 18:13
2

I don't know about php but I can tell what c++ does. Consider:

std::string s("Rajendra");
for (unsigned int i = 0; i < s.length(); i++)
{
    std::cout << s[i] << std::endl;
}

If you go for looking up definition of length() (right click on length() and click on "Go To Definition") OR if you are using Visual Assist X then placing the cursor on length() and press Alt+G, you will find following:

size_type __CLR_OR_THIS_CALL length() const
    {   // return length of sequence
    return (_Mysize);
    }

Where _Mysize is of type int, which clearly reveals that length of the string is pre-calculated and only stored value is being returned each call to length().

However,IMPO (in my personal opinion), this coding style is bad and should be best avoided. I would prefer following:

std::string s("Rajendra");
int len = s.length();
for (unsigned int i = 0; i < len; i++)
{
    std::cout << s[i] << std::endl;
}

This way, you will save the overhead of calling length() function equal to length of the string number of times, which saves pushing and popping of stack frame. This can be very expensive when your string is large.
Hope that helps.

Rajendra Uppal
  • 19,218
  • 15
  • 59
  • 57
1
  1. Probably.
  2. For readability.
  3. Sometimes. It depends on how good it is at detecting that the length will not change inside the loop.
Derrick Turk
  • 4,246
  • 1
  • 27
  • 27
1

Short answer, because there are situations where you want it called each time.

someone else's explanation: http://bytes.com/topic/c/answers/212351-loop-condition-evaluation

kfh
  • 321
  • 1
  • 6
0

Well - as this is a very common scenario, most compilers will precalculate the value. Especially when looping through arrays and very common types - string might be one of them.

In addition, introducing an additional variable might destroy some other loop optimizations - it really depends on the compiler you use and might change from version to version.
Thus in some scenarious, the "optimization" could backfire.

If the code is non real "hot-spot" where every tick of performance does matter, you should write it as you did: No "manual" precalculation.

Readability is also very important when writing code!
Optimizations should be done very carefully and only after intensive profiling!

Matthias
  • 12,053
  • 4
  • 49
  • 91
0

std::string.length() returns fixed variable, that stores in container. It is already precalculated

zabulus
  • 2,373
  • 3
  • 15
  • 27
  • 3
    Yes, but the compiler doesn't necessarily have to be smart enough to inline the function call. – Billy ONeal Mar 01 '10 at 17:51
  • Not necessarily precalculated: that's not mandated by the standard. But it **is** very likely to be the case. See http://stackoverflow.com/questions/256033/is-stdstring-size-a-o1-operation – Pontus Gagge Mar 01 '10 at 17:53
  • Those entries marked ‘‘(Note A)’’ should have constant complexity. – zabulus Mar 03 '10 at 07:37
0

You could precalculate the length of the string only if you KNOW the string won't ever change inside the loop.

I don't know why it is done this way in tutorials. A few guesses:
1) To get you in the habit so you don't get hosed when you are changing the value of the string inside the loop.
2) Because it is easier to understand.

Yes, the optimizer will try to improve this, if it can determine if the string won't change

David Oneill
  • 12,502
  • 16
  • 58
  • 70
  • 1
    Premature optimization is the root of all evil. /D Knuth. (Not my downvote, BTW, though I can symphatize with the sentiment.) – Pontus Gagge Mar 01 '10 at 17:57
  • Yeah, I agree with that quote. I rephrased my answer to be a little more clear. – David Oneill Mar 01 '10 at 18:05
  • @Pontus Perhaps, but this is one of those things that you could just get into the habit of doing and not worry about it as long as you're not doing anything which would change the length of the string. You just always precalculate the size and forget about it. It doesn't cost you anything to do so. It's arguably a good habit to get into. Constantly worrying about it would be bad, but just because someone chooses to always precalculate the size in such cases doesn't mean that they're doing anything bad. It's when preoptimizing takes time or involves tradeoffs that it's really a problem. – Jonathan M Davis Mar 01 '10 at 18:05
  • @Jonathan: ...but however minuscule the time it takes to code initially, the results of the premature optimization is code that is just a little bit more complex and has another failure mode. Mutliplied by the number of for-loops over a bounded data structure, this means maintainability takes a hit - and for what gain? – Pontus Gagge Mar 02 '10 at 08:38
  • @Pontus If you really think that precalculating the size reduces maintainability (which I don't agree with), then you have a tradeoff which may make it not worthwhile do it. However, you do get a potential performance gain with no performance cost in most cases, and in some cases (lists) it's pretty much a guaranteed gain (depending on the implementation), so just always precalculating the size would reduce how much you'd have to worry about it. But if you really do view it as a maintainability issue, then it does become a tradeoff and can be viewed as a premature optimization and undesirable. – Jonathan M Davis Mar 02 '10 at 19:30
0

The compiler may be able to save the result of the call and optimize away all of the extra function calls, but it may not. However, the cost of the function call will be quite low since all it has to do is return an int. On top of that, there's a good chance that it will be inlined, removing the cost of the function call altogether.

However, if you really care, you should profile your code and see whether precalculating the value makes it any faster. But it wouldn't hurt to just choose to precalculate it anyway. It won't cost you anything. Odds are, in most cases though, it doesn't matter. There are some containers - like list - where size() might not be an O(1) operation, and then precalculating would be a really good idea, but for most it probably doesn't matter much - especially if the contents of your loop are enough to overwhelm the cost of such an efficient function call. For std::string, it should be O(1), and will probably be optimized away, but you'd have to test to be sure - and of course things like the optimization level that you compile at could change the results.

It's safer to precalculate but often not necessary.

Jonathan M Davis
  • 37,181
  • 17
  • 72
  • 102
0

In this particular case, std::string.length() is usually (but not necessarily) a constant-time operation and usually pretty efficient.

For the general case, the loop termination condition can be any expression, not just a comparison of the loop index variable (indeed, C/C++ does not recognize any particular index, just an initializer expression, a loop test expression and a loop counter expression (which just is executed every time through). The C/C++ for loop is basically syntactic sugar for a do/while compound statement.

Community
  • 1
  • 1
Pontus Gagge
  • 17,166
  • 1
  • 38
  • 51
0

std::sting::length() returns a precalculated value. Other stl containers recalculate their size every time you call the method size() e.g

  • std::list::size() recalculates the size and
  • std::vector::size() returns a precalculated value

It depends on how the internal storage of the container is implemented. std::vector is an array with capacity 2^n and std::list is an linked list.

nutario
  • 773
  • 1
  • 9
  • 15
0

Just for information, on my computer, g++ 4.4.2, with -O3, with the given piece of code, the function std::string::length() const is called 8 times.

I agree it's precalculated and the function is very likely to be inlined. Still very interesting to know when you are using your own functions/class.

Ben
  • 7,372
  • 8
  • 38
  • 46
0

If your program performs a lot of operations on strings all the time like copying strings ( with memcpy ) then it makes sense to cache the length of the string.

A good example of this I have seen in opensource code is redis.

In sds.h ( Simple Dynamic Strings ) look at sdshdr structure:

struct sdshdr {
    long len;
    long free;
    char buf[];
}; 

As you can see it caches the length ( in len field ) of the character array buf and also the free space available ( in free field ) after the null character.

So the total memory allocated to buf is

len + free

This saves a realloc in case the buf needs to be modified and it fits in the space already available.

ardsrk
  • 2,407
  • 2
  • 21
  • 33