29

Is std::string size() a O(1) operation?

The implementation of STL I'm using is the one built into VC++

Brian R. Bondy
  • 339,232
  • 124
  • 596
  • 636

8 Answers8

37

If you're asking if MSVC's implementation of string::size() has constant complexity, then the answer is yes. But Don Wakefield mentioned Table 65 in 23.1 of the C++ Standard where it says that the complexity of size() should follow what's said in 'Note A'. Note A says:

Those entries marked ‘‘(Note A)’’ should have constant complexity.

However, that does not mean that those entries shall have constant complexity. Standards use very specific terminology, and "should" means that it is not mandatory.

'Note A' was added to the standard specifically to appease those who believed that size() should be allowed to have linear complexity so it would not be necessary to keep the size when the containers were modified.

So you can't rely on size() having constant complexity, but I'm honestly not sure if there are any implementations that do not have a constant string::size().

Gabriel Ravier
  • 358
  • 1
  • 6
  • 17
Michael Burr
  • 333,147
  • 50
  • 533
  • 760
  • 36
    As a side note, it is nonetheless possible to retrieve size of the string as an O(1) operation in a standard way: `(end() - begin())`. This is guaranteed to be \[amortized\] O(1) because both `begin()` and `end()` must be O(1) (Container requirements), string iterators are random-access (string requirements), and `operator-` must be O(1) for iterators that support it (iterator requirements). – Pavel Minaev Jul 12 '09 at 08:22
  • 2
    ... and the length of the string cannot be calculated from the contents, as there is no restriction on the values that can be stored. Which means that it has to be managed externally to the contents, and thus is independent of N. i.e. it must be *O(1)*. The fact that `O(1)` is not a requirement to *all* containers does not mean that it can be `O(N)` in the containers, only that there can be at least one container for which it is non-constant, for example `std::list<>`. – David Rodríguez - dribeas Oct 27 '11 at 10:21
  • @DavidRodríguez-dribeas: Hmmm, it may be possible to use an "encoding" of a string to avoid explicitly storing the length anywhere. E.g. use a 0-byte as an escape character whose meaning depends on the following byte: 0 means "end of string", nonzero means "that many 0-bytes". It may be that other `std::string` requirements forbid this however (e.g. this seems to imply array subscripting can no longer be O(1)). – j_random_hacker Nov 17 '12 at 05:46
  • 3
    @j_random_hacker: Are you talking about hypothetical language or C++? In a hypothetical language I guess you could. In C++ the 03 standard when describing `std::basic_string<>::size()` it refers to the container requirements (i.e. recommended constant size), in the 11 standard the requirement is explicit: *Complexity: constant time*. – David Rodríguez - dribeas Nov 17 '12 at 23:05
  • 2
    @DavidRodríguez-dribeas: You're right! In the 03 standard, 21.3/2 says "The class template `basic_string` conforms to the requirements of a Sequence, as specified in (23.1.1)", and a Sequence is a type of container (23.1.1/1), and a container mandates constant-time `begin()` and `end()` (Table 65 in 23.1) Glad to hear they made it explicit in C++11. – j_random_hacker Nov 18 '12 at 05:48
  • @PavelMinaev strings are not containers. this is a famous standardization quirk. I don't believe your point stands because of that. – v.oddou Jan 10 '19 at 02:43
  • @PavelMinaev according to this https://stackoverflow.com/a/17259940/893406 I'm in the wrong. Maybe I'm confused with vector... don't mind me then. – v.oddou Jan 10 '19 at 02:47
8

Yes, std::string::size() is O(1).

lacker
  • 5,470
  • 6
  • 36
  • 38
8

Here's an easy way to answer that question for msvc++.

Write some code in a project:

string happy;
happy.size();

Hilight the .size call, right-click, go to definition.

On my install (vs2005sp1) this sends me to xstring:1635, which looks like this:

size_type __CLR_OR_THIS_CALL size() const
    {   // return length of sequence
    return (_Mysize);
    }

So it looks like the string has a member called _Mysize, and it's just returning that.

In other words, this is a O(1) implementation.

tfinniga
  • 6,693
  • 3
  • 32
  • 37
  • But the question is how is this _Mysize calculated? – Palak Jain Dec 14 '19 at 17:03
  • It’s a data member, so it is not calculated in the .size() function, it is simply returned. It is changed when the string is set, cleared, appended to, etc. – tfinniga Dec 14 '19 at 22:38
5

See Table 65 in Section 23.1 of the Standard. "a.size()" is listed as "(Note A)", which says that "Those entries ... should have constant complexity".

Section 21.3 says that strings conform to the requirements of a sequence (23.1), ipso facto, size() is constant time.

Don Wakefield
  • 8,693
  • 3
  • 36
  • 54
5

For a string, the size() operation has to be constant for all string implementations that don't use ropes(1). There is no explicit requirement in the standard that requires the operation to be O(1), the closest is the generic requirement that size() should be constant time, but that leaves room for any other complexity measure.

So why must it be O(1)?

This comes from the fact that the size cannot be calculated from the contents of the string itself. While in C you use a NUL terminator to determine the end of the string, in C++ NUL is as valid as any other character in the string. Since the size of the string cannot be calculated from the contents(2), it must be handled externally, independently of the actual size of the string.

(1) C++03 standard allows an implementation to use ropes as the implementation for strings, but the fact is that none of the current implementations of the standard libraries use them.

(2) If the implementation used ropes, the operation could depend on the size by means of the number of blocks from which the rope was constructed if the blocks were linked through a linked list or similar construct, or if they were allowed to have different sizes. But ropes are not used in any standard library implementation that I know of.

David Rodríguez - dribeas
  • 204,818
  • 23
  • 294
  • 489
2

Performance is guaranteed by the STL to be at least O(N) for containers, however many containers including std::string can implement this as O(1) and will. Usually it'll either return a simple variable or do something like _End - _Begin and return that.

Jasper Bekkers
  • 6,711
  • 32
  • 46
0

The complexity of size() should follow 'Note A'. That means, it should have the constant complexity, ie O(1). Although, I am not sure of the implementation, but the iterators in C++ do have associated operations like begin() and end() that point to the STL containers. These operations have a constant time complexity as they are the container requirements. This implies that begin() - end() also has constant complexity. That means, size() is O(1) operation.

Palak Jain
  • 664
  • 6
  • 19
-2
size_type __CLR_OR_THIS_CALL size() const

{   // return length of sequence

    return (_Mysize);

}

So it eventually might be like this, but you can never be sure.

  • 1
    I think you were voted down because you didn't answer the question, you required the reader to connect the dots. And also your answer is cryptic. – mxcl Nov 01 '08 at 20:59
  • I don't see any rephrasing. To me, the answer is still cryptic. "... it eventually 'might be' like this"? Wasn't that code taken from the actual implementation? – Chap Jul 10 '13 at 18:05