10

I'm afraid I don't know templates (or C++, really), but I know algorithms and data structures (even some OOP! :). Anyway, to make the question a bit more precise, here is what I would like the answer to cover (among other things I can't anticipate).

  1. Why is it coded as a template?
  2. How does the template work?
  3. How does it do memory allocation?
  4. Why is it (or is it not) better than mere null-terminated char arrays?
halfer
Dervin Thunk
  • 10
    Do you have a C++ book? If you're unaware how templates work, asking why a class uses them seems to be a shot in the dark to me. – GManNickG Oct 19 '10 at 22:53
  • http://www.cplusplus.com/reference/string/ is a good start. – Tristan Oct 19 '10 at 22:55
  • Is it so hard to explain? Granted you have to know to explain this clearly... I bet someone here can do it without being patronizing. I wonder, really, what would happen if my students came to me with a question like that and I just said "go learn it yourself!" You guys... :) – Dervin Thunk Oct 19 '10 at 22:59
  • @Dervin: This is like asking why cars are made the way they are without being able to recognize what a car is, or why we have them. Your question doesn't make sense. It's a template because it needs to be generic and flexible, the template working isn't trivial enough to be learned by something other than a book, it uses allocators to allocate memory, again for flexibility, and it's safer and easier than null terminated arrays. You can't really get any more than that without *learning* what templates are, and *using* C++. – GManNickG Oct 19 '10 at 23:05
  • 3
    @GMan. I was expecting answers like Steve's and John's below. Your comment "go find out yourself", is just evading the question, and, perhaps like my question, it doesn't mean anything... except, well, evasion. At least they give me something to start with. – Dervin Thunk Oct 19 '10 at 23:09
  • @GMan: a car is significantly (read: magnitudes) more complex than STL containers. Maybe you should read a book about cars? ;) Strings are pretty easy to get, even if you don't fully understand templates. – David Titarenco Oct 19 '10 at 23:15
  • 2
    @David Titarenco: I see what you're saying, but I'm pretty much with GMan in that the "how do templates work" question is unanswerable. – John Dibling Oct 19 '10 at 23:23
  • 2
    @Dervin Thunk - If they are "your students" it's your job to teach them. Here everyone is giving of their free time. So the expectations are correspondingly different. Someone asking a question without having done a modicum of their own investigation is considered rude. Etiquette demands that you at least mention what you have found out / tried. This question mentioned nothing of the sort, thus GMan's and Tristan's responses seem fair. – Michael Anderson Oct 19 '10 at 23:39
  • 2
    @Dervin: be honest then. What would you answer if one of your students came to you and asked you to explain *half of a programming language*? Most likely, you *would* tell them "go read a book, and come back when you have some *specific* questions". Which is what GMan did. The problem with your question is that in order to answer it properly, knowledge of templates is required. Templates is one of the biggest and most complex parts of the C++ language, and can *not* be explained properly in less than 50 pages of text. – jalf Oct 20 '10 at 01:02
  • 2
    And without explaining how templates work, it's kind of hard to explain how a specific class template works. The question "how does `std::string` work" is easily answerable. But if you add the qualifier "by the way, I don't know any of the language features used to implement the class, so you'll have to explain that too", it becomes a pretty overwhelming question. – jalf Oct 20 '10 at 01:02

5 Answers

19
  1. std::string is actually a typedef for std::basic_string<char>, and therein lies the answer to your #1 above. It's a template in order to make basic_string work with pretty much any character type: char, unsigned char, wchar_t, pizza, whatever... string itself is just a programmer convenience that uses char as the datatype, since that's what's most often wanted.

  2. Unanswerable as asked. If you're confused about something, please try to narrow it down a bit.

  3. There are two levels to the answer. At the application level, all basic_string objects use an allocator object to do the actual allocation. At the lower level, allocation methods may vary from one implementation to the next, and for different template parameters, but with the default allocator (std::allocator) the memory ultimately comes from operator new.

  4. It's better than mere char arrays for a wide variety of reasons.

    • string manages the memory for you. You never have to allocate buffer space yourself when you add data to or remove data from the string. If you add more than will fit in the currently allocated buffer, string will reallocate it for you behind the scenes.

    • In this regard, string can be thought of as a kind of smart pointer. For the same reasons that smart pointers are better than raw pointers, strings are better than raw char arrays.

    • Type safety. This may seem a little convoluted, but string used properly has better type safety than char buffers. Consider a common scenario:

 

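 #include <cstdio>
 #include <string>
 #include <sstream>
 using namespace std;

 int main()
 {
   const char* jamorkee_raw = "jamorkee";

   // BUG (deliberate): %f expects a double but receives a const char*.
   // This compiles, yet the behavior is undefined at run time.
   char raw_buf[0x1000] = {};
   sprintf( raw_buf, "This is my string.  Hello, %f", jamorkee_raw);

   // The stream picks the correct operator<< from the argument's type,
   // so there is no format specifier to get wrong.
   const string jamorkee_str = "jamorkee";
   stringstream ss;
   ss << "This is my string.  Hello " << jamorkee_str;
   string s = ss.str();
 }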

The type-safety bug in the raw char buffer version above (a %f specifier paired with a char* argument) isn't even possible when using string along with streams.

Ian R. O'Brien
John Dibling
  • Something to do with code blocks after lists. :/ Here's my attempt, you don't have to keep it. – GManNickG Oct 19 '10 at 23:15
  • The stringstream analogy isn't fair. It should be sprintf into ss.c_str, resulting in an equally horrible result. C can implement stringstream-like behavior as well; it isn't in the standard library, but you can still implement the same behavior using carefully crafted functions or just use a library that did it for you, like glib. – Dmytro Oct 12 '16 at 06:28
7

A rather quick (and therefore probably incomplete) shot at answering some of the questions:

  1. Why is it coded as a template?

Templates provide the capability for the class functions to work on arbitrary data types. For example the basic_string<> template class can work on char units (which is what the std::string typedef does) or wchar_t units (std::wstring) or any POD type. Using something other than char or wchar_t is unusual (std::vector<> would more likely be used), but the possibility exists.
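As a rough illustration of the point above, here is a minimal sketch of one function that works on any basic_string instantiation. The helper name count_char is hypothetical (it is not part of the standard library); the only library pieces assumed are std::basic_string and the eq member of its traits class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a standard function): counts how many times
// `target` occurs in any basic_string, whatever its character type.
template <typename CharT, typename Traits, typename Alloc>
std::size_t count_char(const std::basic_string<CharT, Traits, Alloc>& text,
                       CharT target)
{
    std::size_t hits = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        // Compare through the traits class, as basic_string itself does.
        if (Traits::eq(text[i], target)) {
            ++hits;
        }
    }
    return hits;
}
```

The same source compiles for std::string (called with 'a') and std::wstring (called with L'a'); the compiler generates one version per character type, which is the code reuse the template is there to provide.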

  3. How does it do mem allocation?

This isn't specified by the standard. In fact, the basic_string<> template allows an arbitrary allocator to be used for the actual allocation of memory (but doesn't determine at what points allocations might be requested). Some implementations might store short strings in actual class members, and only allocate dynamically when the strings grow beyond a certain size. The size requested might be exactly what's needed to store the string or might be a multiple of the size to allow for growth without a reallocation.
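To make the allocator parameter a little more concrete, the following is a sketch under a few assumptions: a C++11 or later compiler, and a standard library whose basic_string accepts a minimal allocator. CountingAllocator, CountedString, g_bytes_requested and bytes_requested_for are all hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical global counter, used only for this illustration.
static std::size_t g_bytes_requested = 0;

// Minimal C++11-style allocator: records each request, then forwards
// to ::operator new / ::operator delete.
template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() noexcept {}
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        g_bytes_requested += n * sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

// A string type identical to std::string except for where it gets memory.
using CountedString =
    std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;

// Builds a string of `length` characters and reports how many bytes the
// string requested from its allocator while doing so.
std::size_t bytes_requested_for(std::size_t length) {
    g_bytes_requested = 0;
    CountedString s(length, 'x');
    assert(s.size() == length);
    return g_bytes_requested;
}
```

What bytes_requested_for returns for short lengths depends on the implementation (it may well be zero where short strings are stored inside the object); for long strings it has to be at least the length, since the characters must live somewhere.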

Additional information stolen from another SO answer:

Scott Meyers's book, Effective STL, has a chapter on std::string implementations that's a decent overview of the common variations: "Item 15: Be aware of variations in string implementations".

He talks about 4 variations:

  • several variations on a ref-counted implementation (commonly known as copy on write) - when a string object is copied unchanged, the refcount is incremented but the actual string data is not. Both objects point to the same refcounted data until one of the objects modifies it, causing a 'copy on write' of the data. The variations are in where things like the refcount, locks etc. are stored.

  • a "short string optimization" implementation. In this variant, the object contains the usual pointer to data, length, size of the dynamically allocated buffer, etc. But if the string is short enough, it will use that area to hold the string instead of dynamically allocating a buffer.
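One way to see which of these strategies a given standard library uses is to inspect a string from the outside. The sketch below is only a probe, not a description of any particular implementation; stored_inline and count_capacity_changes are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical probe: true when the character data of `s` lives inside the
// string object itself (a short string optimization is in effect for this
// value) rather than in a separately allocated buffer.
bool stored_inline(const std::string& s)
{
    const char* object_begin = reinterpret_cast<const char*>(&s);
    const char* object_end   = object_begin + sizeof(std::string);
    std::less<const char*> before;  // gives a total order over pointers
    return !before(s.data(), object_begin) && before(s.data(), object_end);
}

// Appends `count` characters one at a time and returns how often the
// capacity changed, i.e. how often the string had to find a bigger buffer.
std::size_t count_capacity_changes(std::size_t count)
{
    std::string s;
    std::size_t changes = 0;
    std::size_t last = s.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        s.push_back('x');
        if (s.capacity() != last) {
            ++changes;
            last = s.capacity();
        }
    }
    assert(s.size() == count);
    return changes;
}
```

Calling stored_inline on a short string such as "hi" tells you whether your library uses the short string optimization; the answer for a 1000-character string is necessarily false, because the object is far smaller than that. The value returned by count_capacity_changes hints at the growth policy and will differ between libraries.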

  4. Why is (is not) better than mere null terminated char arrays?

One way the string class is better than a mere null terminated array is that the class manages the memory required, so defects involving allocation errors or overrunning the end of the allocated arrays are reduced. Another (perhaps minor) benefit is that you can store 'null' characters in the string. A drawback is that there's perhaps some overhead, especially since you pretty much have to rely on dynamic memory allocation for the string class. In most scenarios that's probably not a major issue, but on some setups (embedded systems, for example) it can be a problem.
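The point about 'null' characters can be shown with a short sketch. The three function names below are hypothetical; the only standard calls used are std::string's own members and std::strlen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Length as the string object tracks it: embedded '\0' characters count.
std::size_t length_as_string(const std::string& s) { return s.size(); }

// Length as C functions see it: everything after the first '\0' is invisible.
std::size_t length_as_c_array(const std::string& s) { return std::strlen(s.c_str()); }

// Builds "abc", then a null character, then "def": seven characters in total.
std::string make_with_embedded_null()
{
    std::string s("abc");
    s.push_back('\0');
    s += "def";
    return s;
}
```

The string stores its length separately, so it reports all seven characters, while strlen stops at the first null and sees only three.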

Community
Michael Burr
2
  1. string is not itself a template; it is a typedef for basic_string<char>, a specialization of the basic_string class template. It's a template so that, for example, wstring can be a typedef for the specialization on wide characters and use all the same code for the encapsulated value.

  2. See @Gman's comment. Compile-time code reuse, while retaining the ability to selectively special-case, is the basic rationale for templates.

  3. Implementation dependent. Some do single-instance allocation, with copy on write. Some use a builtin buffer for small strings and allocate from heap only after a certain size is reached. I suggest you investigate how it works on your compiler by walking the constructor and follow-on code in <string>, as that will help you understand 2. hands on, which is way more valuable than just reading about it (though a book or other reading is a great idea for intro to templates).

  4. Because const char* and the CRT that supports it are a bug farm for the unwary. Check out all the stuff you get for free with std::string. Plus a whole bunch of Standard C++ algorithms that work with string iterators.
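As a small sketch of what "algorithms that work with string iterators" can look like, here are three hypothetical helpers (reversed, uppercased, count_of) built from std::reverse, std::transform and std::count. The lambda assumes C++11 or later.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Returns a reversed copy; std::reverse works on the string's iterators.
std::string reversed(std::string s)
{
    std::reverse(s.begin(), s.end());
    return s;
}

// Returns an upper-cased copy of `s`.
std::string uppercased(std::string s)
{
    // Go through unsigned char: passing a negative char to toupper is undefined.
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return s;
}

// Counts occurrences of `c` in `s`.
std::ptrdiff_t count_of(const std::string& s, char c)
{
    return std::count(s.begin(), s.end(), c);
}
```

None of these needs a manual loop, a length calculation, or a buffer sized in advance, which is where much of the "bug farm" in the C runtime string functions comes from.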

Steve Townsend
2

Why is it coded as a template?

Several people have given the answer that having std::basic_string be a template means that you can have both std::basic_string<char> and std::basic_string<wchar_t>. What nobody has explained is why C and C++ have multiple character types in the first place.

C, especially in its early versions, was minimalistic about data types. Why have bool when the integers 0 and 1 work just fine? And why have distinct types for "byte" and "character" when they're both 8 bits?

The problem is that 8 bits limits you to 256 characters, which is adequate for an alphabetic language like English or Russian, but nowhere near enough for Japanese or Chinese. And now we have Unicode with its 21-bit code points. But char couldn't be expanded to 16 or 32 bits because the assumption that char = byte was so entrenched. So we got a separate type for "wide characters".

But now we have the problem that wchar_t is UTF-32 on Linux but UTF-16 on Windows. And to solve that problem the next version of the C++ standard will add the char16_t and char32_t types (and corresponding string types).
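These types did arrive in C++11, together with std::u16string and std::u32string. The sketch below assumes a C++11 or later compiler; units_utf16 and units_utf32 are hypothetical names. It shows that the same character can take a different number of code units in each string type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of code units needed to store the single character U+1F600,
// which lies outside the 16-bit range, in a UTF-16 string.
std::size_t units_utf16() { return std::u16string(u"\U0001F600").size(); }

// The same character stored in a UTF-32 string.
std::size_t units_utf32() { return std::u32string(U"\U0001F600").size(); }
```

UTF-16 needs a surrogate pair (two code units) for this character, while UTF-32 needs one, so size() counts code units rather than characters in the 16-bit case.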

dan04
0

A good free online resource is "Thinking in C++" by Bruce Eckel, whose site is here: http://mindview.net/Books/TICPP/ThinkingInCPP2e.html .

The second volume of his free book is mirrored here: http://www.smart2help.com/e-books/ticpp-2nd-ed-vol-two/#_ftnref14 . Chapter three is all about the string class, why it's a template, and why it's useful.

Zeke
  • Thanks for the resource. I'm not really asking about how to use it, just how it's implemented itself at the lowest level. – Dervin Thunk Oct 19 '10 at 23:02