4

C strings are char arrays.

Vectors are the new arrays in C++.

Why aren't strings vectors (of chars) then?

Most of the methods of vectors and strings seem duplicated too. Is there a reason for making strings a different thing in C++?

Petr Skocik
  • Related: http://stackoverflow.com/questions/19730488/stl-is-a-string-a-vector – Yu Hao Jul 14 '15 at 15:47
  • vectors are not the new arrays in C++. arrays are the new arrays in C++ ;). More precisely: `std::array` is what comes closest to a c-style array – 463035818_is_not_an_ai Jul 14 '15 at 15:49
  • Short answer: subclassing models an "is-a" relationship. A vector can contain any sort of element. A string cannot contain any sort of element. Ergo: strings are not vectors. – 463035818_is_not_an_ai Jul 14 '15 at 15:50
  • See the answer of BoBTFish here: http://stackoverflow.com/questions/19730488/stl-is-a-string-a-vector – erenon Jul 14 '15 at 15:51
  • 1
    ps: There even might be an implementation of strings that uses a vector under the hood. However, when using such a string you will not notice, because this is an implementation detail (and thus should not leak to the public interface). – 463035818_is_not_an_ai Jul 14 '15 at 15:53
  • The small string optimization is a good reason not to make a string a vector of chars (though it could still have a vector as a member for non-small strings, I guess). I disagree about the null terminator thing. C++ strings don't have to have a null terminator, and if implementing `c_str()` is a concern, you can always make sure the vector has a capacity of at least `size()+1` and that `[size()]=='\0'`, and then `c_str()` is the same as `begin()`. – Petr Skocik Jul 14 '15 at 15:58
  • @tobi303: None of these are compelling reasons. However, the presence of a terminating nul character is. Several duplicates: http://stackoverflow.com/q/2436762/572743 http://stackoverflow.com/questions/28418145/why-isnt-stdstring-a-specialization-of-stdvector – Damon Jul 14 '15 at 16:05
  • @Damon a `vector` subclass `string` could simply ensure it has a capacity of at least `size()+1` and that `[size()]=='\0'` and then `c_str()` is just `begin()`. – Petr Skocik Jul 14 '15 at 16:12
  • @PSkocik How would that work when you treat the `string` as its base `vector` and call `push_back`? – Mark B Jul 14 '15 at 16:14
  • @MarkB Very easily: `const char* string::c_str() { reserve(size()+1); (*this)[size()]='\0'; return begin(); }`. No override for `push_back` required. – Petr Skocik Jul 14 '15 at 16:28
  • 1
    @PSkocik `c_str` is const though so the actual string object could be const too (can't cast away constness). You could still implement `string` with a mutable vector member though. – Mark B Jul 14 '15 at 16:41
  • 1
    @PSkocik, calling `c_str()` must not invalidate existing iterators, pointers and references. If the `reserve` call needed to reallocate then it would invalidate, so your suggestion cannot be conforming. – Jonathan Wakely Jul 14 '15 at 16:56
  • Good point. It's still doable though. You'd just need to make the '\0' at `*(end())` an invariant that you'd enforce in ctors and in every non-const operation. – Petr Skocik Jul 14 '15 at 17:06

3 Answers

2

The various answers in Vector vs string show several differences in interfaces between vector and string, and since the typical pattern in the standard is to use static polymorphism rather than dynamic, they were created as two different classes.

Since strings do have different characteristics from vectors it doesn't seem that you would want to use public inheritance but I don't think there's anything in the standard that would prohibit protected or private inheritance, or composition to provide the underlying space management.

Additionally I suspect that string may have been developed earlier and orthogonally to vector which likely explains why there are member methods that more likely would have been made free algorithms if developed in parallel to vector.

Mark B
2

It's pretty much just historical. Strings and vectors were developed in parallel with little thought going to how they could be considered one and the same, for T==char.

That's also why standard containers are nice and generic, whereas std::basic_string is a total monolith of member function after member function.

Edge case optimisation opportunities have since made it difficult or impossible to transform std::basic_string<T, Alloc> into std::vector<T, Alloc> in any sort of standard way. Take the small string optimisation, for example. Although, now that GCC's copy-on-write mechanism is officially dead, we're a little closer.

The ability to legally dereference std::string::end() (and obtain '\0' for your trouble) is still problematic, though. A bunch of fairly stringent iterator invalidation rules for .c_str() basically prevent us from using std::vector<char> for this right from the start.

tl;dr: this is what happens when you create a camel

Lightness Races in Orbit
  • @JonathanWakely: I thought SSO was non-compliant (especially so from C++11 with those standard ambiguities & defects resolved) and that even GCC was migrating to a compliant `std::string` implementation now/soon? Fully admit I haven't bothered with much detailed research for this particular question as it's meant to give only a general idea. Still, I would of course remove factual inaccuracies... – Lightness Races in Orbit Jul 14 '15 at 16:52
  • This is news to me. COW is dead, and GCC 5.1 moved to SSO, is that what you're thinking of? – Jonathan Wakely Jul 14 '15 at 16:53
  • "Although, now that GCC's copy-on-write mechanism is officially dead, we're a little closer." Not really, while this particular obstacle has been removed, there is a new one: The committee took care to allow small string optimization for basic_string in C++11, but small vector optimization is/remains forbidden by virtue of mandating that pointers and references shall remain valid across a move or swap, and even mandating that elements must not be moved, copied or swapped (in the general container requirements, see §23.2.1/8 in N3376). – Arne Vogel Jul 14 '15 at 18:31
  • @ArneVogel: One step forwards, one step backwards, was the idea I was trying to put forth. – Lightness Races in Orbit Jul 14 '15 at 18:33
  • A camel? Why a camel? – Barry Jul 14 '15 at 19:12
  • 1
    @Barry: Because a camel is a horse designed by committee. – Lightness Races in Orbit Jul 14 '15 at 19:16
  • @LightnessRacesinOrbit Never heard that before. Love it. – Barry Jul 14 '15 at 19:20
0

Yes, because vector is a container that can hold any type T, whereas std::string is a wrapper around char* and can't hold int or other data types. It wouldn't make any sense to have it any other way.