What is the fundamental difference, if any, between a C++ std::vector and std::basic_string?
-
Check the docs, they have different interfaces. If you specified the actual problem you are solving then the answers could have also been more specific. – Gene Bushuyev Dec 29 '10 at 19:12
-
@Gene: They do have different interfaces, but both implement everything necessary to be an STL sequence container. – Billy ONeal Dec 29 '10 at 19:17
-
@Gene: I'm not solving any particular problem, I'm just curious why I should choose one or the other for various purposes: I'm not counting the existence of some additional string-like methods as fundamental. I don't really count performance as fundamental either. However, the validity of iterators that several replies mention definitely is. And I have a vague suspicion the data type for a string must have a "zero-like" value to put on the end of the data() result (obtained from a traits thingy). – Yttrill Dec 29 '10 at 20:06
-
FYI: string originally wasn't an STL container. Against the advice of Pete Becker, the whole rest of the committee decided to make it one. This made it like vector, and removed the possibility of many optimisations. In retrospect I think Pete Becker was actually right. – Yttrill Dec 29 '10 at 20:10
-
@Yttrill Was this intended as a discussion-only question? It seems that some of these answers are pretty thorough; is there a reason that none of them have been accepted? – Jonathan Mee Feb 22 '16 at 16:05
-
I was looking for fundamental differences. The non-existence of element destructor calls, for example, is not really good enough because vector could do that too with a suitable specialisation and traits information; in fact it may be irrelevant, because a modern C++ compiler should optimise away a call to a trivial destructor (and then optimise away the loop). – Yttrill Feb 23 '16 at 23:45
-
Invalidation of iterators on swap is fundamental technically but doesn't really seem important (how often do you swap anything?). Conceptually if a container has random access iterators AND has to be stored contiguously, the iterators must be (isomorphic to) pointers and all such containers are then arrays. – Yttrill Feb 23 '16 at 23:57
-
In fact given C++11 constraints I would expect the primary difference is that a string is NOT converted when outputting to a suitable stream, i.e. the elements represent themselves and are output contiguously. Whereas a vector might print "vector(char(63), char(64), char(65))" instead of "ABC". – Yttrill Feb 24 '16 at 00:00
7 Answers
basic_string doesn't call constructors and destructors of its elements. vector does.
swapping basic_string invalidates iterators (enabling small string optimization), swapping vectors doesn't.
basic_string memory need not be allocated contiguously in C++03. vector memory is always contiguous. This difference is removed in C++0x [string.require]:
The char-like objects in a basic_string object shall be stored contiguously
basic_string has an interface for string operations. vector doesn't.
basic_string may use a copy-on-write strategy (pre-C++11). vector can't.
Relevant quotes for non-believers:
[basic.string]:
The class template basic_string conforms to the requirements for a Sequence Container (23.2.3), for a Reversible Container (23.2), and for an Allocator-aware container (Table 99), except that basic_string does not construct or destroy its elements using allocator_traits::construct and allocator_traits::destroy and that swap() for basic_string invalidates iterators. The iterators supported by basic_string are random access iterators (24.2.7).
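To make the first point concrete, here is a minimal sketch of the vector side of the contrast. `Counted` and `live_after_vector_roundtrip` are hypothetical names invented for this illustration; the sketch only shows that a vector constructs and destroys each element individually, which is exactly the step the quoted wording exempts basic_string from.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type that counts constructions and destructions.
struct Counted {
    static int constructed;
    static int destroyed;
    Counted() { ++constructed; }
    Counted(const Counted&) { ++constructed; }
    ~Counted() { ++destroyed; }
};
int Counted::constructed = 0;
int Counted::destroyed = 0;

// Builds and discards a vector of n elements, then reports how many
// Counted objects are still alive (constructed minus destroyed).
int live_after_vector_roundtrip(std::size_t n) {
    {
        std::vector<Counted> v(n);  // every element is constructed individually
        assert(Counted::constructed - Counted::destroyed == static_cast<int>(n));
    }                               // ~vector runs the destructor of every element
    return Counted::constructed - Counted::destroyed;
}
```

Calling `live_after_vector_roundtrip(3)` should leave no live objects, because the vector destroyed each element it constructed. Trying the same with basic_string would require a char_traits for `Counted`, and the comments below discuss why that is not a conforming use.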

-
As far as I can tell, that quote isn't from 'the standard' but from the C++0x draft. That's fine and worth mentioning, but you need to qualify it with "In C++0x...". – GManNickG Dec 29 '10 at 19:55
-
@GMan: Hmm, seems like you're right. But only the formulation is different, C++98 still says that it only allocates and deallocates the elements. This change is just a clarification. – Yakov Galka Dec 29 '10 at 20:39
-
So is there a consensus on this? Individual ctor/dtor/mov/assign ops are likely to be more expensive than bitblits, so there's good reason to desire string elements be PODs, but is it required? Couldn't the POD and non-POD cases be split with specialisations based on a trait? Or at least common cases eg char? Couldn't this be done for vector too? – Yttrill Dec 29 '10 at 20:51
-
@BillyONeal Incidentally, there's only one command for default allocation of elements that `basic_string` provides, and that's [`resize`](http://en.cppreference.com/w/cpp/string/basic_string/resize), which, "initializes new characters to `CharT()`". An example for the non-believers: http://ideone.com/UjQzrM – Jonathan Mee Feb 18 '16 at 17:43
-
@Jonathan Mee: Your use of `basic_string` is nonconforming, so you get undefined behavior; which happens to be not calling copy ctors in this case. `basic_string` is only valid for types with a `char_traits`, and `char_traits::copy`, for example, is not required to call constructors. (All of the required-to-be-provided char_traits specializations can conformingly implement `char_traits::copy` as a call to `memcpy`, since that's valid for char/wchar_t/char16_t/char32_t) – Billy ONeal Feb 18 '16 at 20:54
basic_string gives compiler and standard library implementations a few freedoms over vector:
- The "small string optimization" is valid on strings, which allows implementations to store the actual string, rather than a pointer to the string, in the string object when the string is short. Something along the lines of:
    class string {
        size_t length;
        union {
            char * usedWhenStringIsLong;
            char usedWhenStringIsShort[sizeof(char*)];
        };
    };
- In C++03, the underlying array need not be contiguous. Implementing basic_string in terms of something like a "rope" would be possible under the current standard. (Though nobody does this because that would make the members std::basic_string::c_str() and std::basic_string::data() too expensive to implement.) C++11 now bans this behavior though.
- In C++03, basic_string allows the compiler/library vendor to use copy-on-write for the data (which can save on copies), which is not allowed for std::vector. In practice, this used to be a lot more common, but it's less common nowadays because of the impact it has upon multithreading. Either way though, your code cannot rely on whether or not std::basic_string is implemented using COW. C++11 again now bans this behavior.
There are a few helper methods tacked on to basic_string as well, but most are simple and of course could easily be implemented on top of vector.
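One way to observe the small string optimization on a given implementation is to check whether a string's character buffer lives inside the string object itself. The helper below, `stored_inline`, is a hypothetical name for this sketch; whether it returns true for a short string depends entirely on the standard library in use, so treat it as a probe rather than a guarantee.

```cpp
#include <functional>
#include <string>

// Returns true when s.data() points into the bytes of the string object
// itself, which is what a small-string-optimized implementation does for
// short strings. The result for short strings is implementation-specific.
bool stored_inline(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    std::less<const char*> before;  // std::less gives a total order over unrelated pointers
    const char* p = s.data();
    return !before(p, obj_begin) && before(p, obj_end);
}
```

A string far larger than `sizeof(std::string)` cannot fit inside the object, so `stored_inline(std::string(1000, 'x'))` is false everywhere; `stored_inline(std::string("hi"))` is the interesting case and varies by library.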

-
I don't like to think of (1) as a reason for picking or using std::string; it is an unintended side effect of the standard wording, which has been tightened in the new standard. (2) was a good reason for using std::string, as it made returning strings from methods very efficient (practically no cost). Unfortunately that is being removed because of requirements for parallelism (though from reading the paper that recommends this, it looks like rope will eventually take up the mantle of COW; we will have to wait and see if this works out). – Martin York Dec 29 '10 at 21:43
-
@Martin: This is true. OTOH move semantics gets rid of LOTS of the cases where COW implementations sped things up :) – Billy ONeal Dec 29 '10 at 21:46
The key difference is that std::vector must keep its data in contiguous memory, whereas std::basic_string is not required to (in C++03). As a result:
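std::vector<char> v( 3, 'a' ); // three copies of 'a' (the count comes first, then the value)
char* x = &v[0]; // valid: vector storage is always contiguous
std::basic_string<char> s( "aaa" );
char* x2 = &s[0]; // points at the first character, but in C++03 not necessarily into a contiguous buffer
// For example, in C++03 the behavior of
std::cout << *(x2+1);
// is undefined. (C++11 requires contiguous storage, which makes it well defined.)
const char* x3 = s.c_str(); // valid: always a contiguous, null-terminated array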
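For comparison, here is a small sketch of what the C++11 contiguity guarantee permits: writing into a string through `&s[0]` exactly as one would with a vector. The helper names `fill_string` and `fill_vector` are made up for this illustration, and the string version assumes a C++11 or later library.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Copies n bytes into a std::string by writing through &s[0].
// Assumes C++11, where basic_string storage must be contiguous.
std::string fill_string(const char* src, std::size_t n) {
    std::string s(n, '\0');
    if (n != 0) std::memcpy(&s[0], src, n);
    return s;
}

// The same operation on a vector, which has always been contiguous.
std::vector<char> fill_vector(const char* src, std::size_t n) {
    std::vector<char> v(n);
    if (n != 0) std::memcpy(&v[0], src, n);
    return v;
}
```

Under C++03 only `fill_vector` would be portable by the letter of the standard, which is the distinction this answer is drawing.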

-
Err.. that code example is perfectly valid. Now if you modified `x2` using pointer arithmetic it's possible that it wouldn't be valid (depending on your compiler), but I'm unaware of any compiler who does this. – Billy ONeal Dec 29 '10 at 19:18
-
I'm not aware of such a compiler either, but you shouldn't count on `x2` pointing to a contiguous buffer, because the C++ Standard doesn't give any guarantees. – Kirill V. Lyadvinsky Dec 29 '10 at 19:21
-
Neither of them are valid actually. Neither object has any size and so the first element is actually one past the end and you can't dereference that element. Only the c_str() one is well defined. – Edward Strange Dec 29 '10 at 19:21
-
@Kirill: My point is that your example doesn't demonstrate the continuous buffer aspect. If someone interprets `x2` as a pointer to a single character (rather than a pointer to a null terminated C string) then the code is perfectly valid. For example, someone could do `std::cout << *x2` and everything would be fine, but `std::cout << *(x2 + 1)` would be invalid. @Noah: Lol -- good point. – Billy ONeal Dec 29 '10 at 19:22
-
@Noah, updated the example. The point wasn't about initialization of these containers so I've skipped it. – Kirill V. Lyadvinsky Dec 29 '10 at 19:25
-
Well, this key difference is removed in C++0x, see my answer, so it's not so key-difference after all. – Yakov Galka Dec 29 '10 at 19:35
-
@Billy, thanks, with your edit the example clearly demonstrates the issue. – Kirill V. Lyadvinsky Dec 29 '10 at 19:41
-
That's completely minor (and an unintended consequence). The loophole has been closed in C++0x, and no STL (tested by the committee during the process of updating the standard) would break because of the assumption of contiguous data. – Martin York Dec 29 '10 at 21:39
-
@Kirill: are you aware of any implementation in which strings are not contiguous? When they discussed the change to `string` for C++0x, the committee wasn't. It's significant in the standard, because it encourages slightly different usages of vector vs. string (for example, you might choose not to use a string as a read buffer if you're worried about portability), but it's not a difference between actual implementations. – Steve Jessop Dec 29 '10 at 23:58
-
@Steve: Related: http://stackoverflow.com/questions/2256160/how-bad-is-code-using-stdbasic-stringt-as-a-contiguous-buffer – Billy ONeal Dec 30 '10 at 00:52
TLDR: strings are optimized to only contain character primitives, vectors can contain primitives or objects
The preeminent difference between vector and string is that vector can correctly contain objects, whereas string works only on primitives. So vector provides methods that would be useless for a string working with primitives.
Even extending string will not allow it to correctly handle objects, because it never runs its elements' destructors. This should not be viewed as a drawback; it allows significant optimization over vector in that string can:
- Do short string optimization, potentially avoiding heap allocation, with little to no increased storage overhead
- Use char_traits, one of string's template arguments, to define how operations should be implemented on the contained primitives (of which only char, wchar_t, char16_t, and char32_t are implemented: http://en.cppreference.com/w/cpp/string/char_traits)
Particularly relevant are char_traits::copy, char_traits::move, and char_traits::assign, which imply that direct assignment, rather than construction or destruction, will be used; this is, again, preferable for primitives. All this specialization has the additional drawbacks to string that:
- Only char, wchar_t, char16_t, or char32_t primitive types will be used. Obviously, primitives of sizes up to 32 bits could use their equivalently sized char_type: https://stackoverflow.com/a/35555016/2642059, but for primitives such as long long a new specialization of char_traits would need to be written, and the idea of specializing char_traits::eof and char_traits::not_eof instead of just using vector<long long> doesn't seem like the best use of time.
- Because of short string optimization, iterators are invalidated by all the operations that would invalidate a vector iterator, but string iterators are additionally invalidated by string::swap and string::operator=
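The vector half of that last point can be checked directly. In the sketch below (the function name `vector_iterator_survives_swap` is invented for this example), an iterator taken before a swap keeps referring to the same element, which afterwards belongs to the other vector. No equivalent check is shown for string, since the standard lets string::swap invalidate iterators and using one afterwards would not be safe.

```cpp
#include <vector>

// Returns true if an iterator obtained before swap still refers to the same
// element afterwards; that element now belongs to the other vector.
bool vector_iterator_survives_swap() {
    std::vector<int> a{1, 2, 3};
    std::vector<int> b{9};
    std::vector<int>::iterator it = a.begin();  // refers to the element holding 1
    a.swap(b);                                  // buffers change owner; elements do not move
    return it == b.begin() && *it == 1;
}
```

With a small-string-optimized string, short contents live inside the object, so swapping has to copy characters between the two objects and an old iterator can no longer follow its element.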
Additional differences in the interfaces of vector and string:
- There is no mutable string::data: Why Doesn't std::string.data() provide a mutable char*?
- string provides functionality for working with words unavailable in vector: the members string::c_str, string::length, string::append, string::operator+=, string::compare, string::replace, string::substr, string::copy, string::find, string::rfind, string::find_first_of, string::find_first_not_of, string::find_last_of, and string::find_last_not_of, plus the non-member functions operator+, operator>>, operator<<, stoi, stol, stoll, stoul, stoull, stof, stod, stold, to_string, and to_wstring
- Finally everywhere vector accepts arguments of another vector, string accepts a string or a char*
Note this answer is written against C++11, so strings are required to be allocated contiguously.
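As an illustration of char_traits defining how operations work on the contained characters, here is a sketch of the well-known case-insensitive traits idiom. The names `ci_char_traits` and `ci_string` are not part of the standard library; they are assumptions made for this example, and the sketch relies on `std::toupper` in the current C locale.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical traits class: inherits everything from char_traits<char>
// and replaces only the comparison-related operations.
struct ci_char_traits : std::char_traits<char> {
    static char fold(char c) {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b) { return fold(a) < fold(b); }
    static int compare(const char* a, const char* b, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(a[i], b[i])) return -1;
            if (lt(b[i], a[i])) return 1;
        }
        return 0;
    }
    static const char* find(const char* s, std::size_t n, char c) {
        for (std::size_t i = 0; i < n; ++i) {
            if (eq(s[i], c)) return s + i;
        }
        return nullptr;
    }
};

// A string type whose comparisons and searches ignore case.
using ci_string = std::basic_string<char, ci_char_traits>;
```

With this in place, `ci_string("Hello") == ci_string("hELLO")` holds, while copying and assignment still go through the inherited `char_traits<char>` members. vector has no comparable customization point.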

One difference between std::string and std::vector is that programs may construct a string from a null-terminated string, whereas with vectors they cannot.
std::string a = "hello"; // okay
std::vector<char> b = "goodbye"; // compiler error
This often makes strings easier to work with.
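A vector<char> can still be built from a null-terminated string, just less directly, by passing an iterator range. The helpers `to_vector` and `to_std_string` below are hypothetical names used only to sketch that workaround.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Builds a vector<char> from a null-terminated string, leaving out the terminator.
std::vector<char> to_vector(const char* text) {
    return std::vector<char>(text, text + std::strlen(text));
}

// Converts back: string has a range constructor that accepts vector iterators.
std::string to_std_string(const std::vector<char>& v) {
    return std::string(v.begin(), v.end());
}
```

For example, `to_vector("goodbye")` holds seven characters, and `to_std_string` round-trips them into a std::string.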

A vector is a data structure which simulates an array. Deep inside it is actually a (dynamic) Array.
The basic_string class represents a Sequence of characters. It contains all the usual operations of a Sequence, and, additionally, it contains standard string operations such as search and concatenation.
You can use vector to hold whatever data type you want: std::vector<int>, std::vector<float>, or even std::vector< std::vector<T> >, but a basic_string can only be used for representing "text".
-
I'm sure I could find a way to make a `basic_string` full of doubles or something equally evil and have it actually "work". – Edward Strange Dec 29 '10 at 19:15
-
Well, most implementations implement `basic_string` in terms of something like `vector`. And any type for which a `char_traits` class is defined will work with `std::basic_string`, even if you made up a `char_traits` (as written in @Noah's comment) – Billy ONeal Dec 29 '10 at 19:16
-
@Billy - not true. Some of the same techniques may very well be used in both but I've NEVER seen a `basic_string` implemented in terms of `vector`. In fact, one very common implementation, that distributed with MSVC++, is VERY different since they use the small string optimization (anything small enough to fit in a pointer is just stuck in the pointer to buffer rather than allocating one). – Edward Strange Dec 29 '10 at 19:17
-
@Noah: If you read my comment again, it says "Something like vector", not "vector" -- those two words are important. (What I meant was it's usually some form of dynamic array) – Billy ONeal Dec 29 '10 at 19:19
-
Not wishing to answer my own question yet .. but what about the new string types such as UTF-8 string things? These would be less array like, would they not? How do they fit in with basic_string? – Yttrill Dec 29 '10 at 20:19
-
... I'm not aware of any plans for a string type that stores the characters in UTF-8 (or any other variable-width) encoding in memory. It's a bad idea in terms of performance; you lose random access to characters. – Karl Knechtel Dec 29 '10 at 20:24
The basic_string provides many string-specific comparison options. You are right in that the underlying memory management interface is very similar, but string contains many additional members, like c_str(), that would make no sense for a vector.
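A rough sketch of where the comparison interfaces overlap and where they differ: both containers support lexicographic ordering through `operator<`, but string::compare can also compare a sub-range in place. The wrapper names `vector_less` and `compare_tail` are invented for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Lexicographic "less than" is available for vector just as it is for string.
bool vector_less(const std::vector<char>& a, const std::vector<char>& b) {
    return a < b;
}

// string::compare can compare the tail of s, starting at pos, against other
// without building a temporary substring; vector has no direct equivalent.
int compare_tail(const std::string& s, std::size_t pos, const std::string& other) {
    return s.compare(pos, std::string::npos, other);
}
```

For instance, `compare_tail("hello world", 6, "world")` returns 0, while doing the same with vectors would mean calling `std::lexicographical_compare` or `std::equal` on explicit iterator ranges.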

-
I don't think there's any such thing as a vector using small string optimization. Haven't gone out looking for one but I'm fairly sure it's not out there. It wouldn't be as useful. Truth is that the two things are just plain different. They have different purposes and so are often implemented VERY differently though a naive approach may be similar in both. – Edward Strange Dec 29 '10 at 19:20
-
Which comparisons are string specific? Given the element data type is variable in both cases, things like lexicographical comparison make as much sense for vectors as strings. Of course a case insensitive comparison would be string specific, but then string wouldn't be polymorphic. – Yttrill Dec 29 '10 at 20:22