88

When should I use std::string and when should I use char* to manage arrays of chars in C++?

It seems you should use char* if performance (speed) is crucial and you're willing to accept some risk because of the manual memory management.

Are there other scenarios to consider?

jww

12 Answers

61

You can pass std::strings by reference if they are large to avoid copying, or a pointer to the instance, so I don't see any real advantage using char pointers.

I use std::string/wstring for more or less everything that is actual text. char * is useful for other types of data, though, as long as you can be sure it gets deallocated like it should. Otherwise std::vector<char> is the way to go.

There are probably exceptions to all of this.
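
To illustrate the first point, here is a minimal sketch of passing a std::string by reference-to-const or by pointer so that no copy of the character data is made. The function names `count_spaces` and `count_spaces_ptr` are made up for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts spaces without copying the argument.
// Taking the string by reference-to-const means the buffer is not duplicated.
std::size_t count_spaces(const std::string& text)
{
    std::size_t n = 0;
    for (char c : text) {
        if (c == ' ') ++n;
    }
    return n;
}

// Same idea through a pointer; the caller may pass nullptr for "no string".
std::size_t count_spaces_ptr(const std::string* text)
{
    return text ? count_spaces(*text) : 0;
}
```

The pointer form is mostly useful when "no string at all" is a valid argument; otherwise the reference form is simpler to call.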

Vishnu CS
Skurmedel
  • 8
    Is there a performance difference between the two? – vtd-xml-author Sep 11 '10 at 01:55
  • 3
    @vtd-xml-author: Some maybe. Straight `char *` has almost no overhead. Exactly what overhead `std::string` has I don't know; it's likely to be implementation dependent. I hardly expect the overhead to be much greater than that of a bare char pointer. Since I don't own a copy of the standard I can't really detail any guarantees made by the standard. Any performance difference will likely vary depending on the operations to be made. `std::string::size` could store the size next to the character data and thus be quicker than `strlen`. – Skurmedel Sep 11 '10 at 02:08
  • 2
    Why not use std::string for non-text data? They're not null terminated, so you should be able to store anything you want in there. – Casey Rodarmor Mar 13 '12 at 00:01
  • 1
    @rodarmor You _can_ store anything you want, though it's a touch risky as string is designed for null-terminated character strings. You must be careful to use only binary-safe operations, e.g. `append(const string&)` and `append(const char*, size_t)` instead of `operator+=()`. – boycy Mar 13 '12 at 11:53
  • 6
    Are you sure? I know that many operations will assume that a char* is a null terminated string, but I can't think of any that assume that a std::string contains no nulls. – Casey Rodarmor Mar 14 '12 at 00:00
  • Shouldn't we really be comparing (char *) to std::string::iterator? (char *) vs. std::string is the same argument as ranges vs. containers, with a twist. – Samuel Danielson Mar 09 '16 at 05:56
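
The exchange above about binary data can be made concrete with a small sketch. std::string itself stores embedded '\0' bytes without trouble; what truncates is any overload that takes a bare `const char*` and has to find the end by looking for a terminator. The two function names below are invented for the example.

```cpp
#include <cstddef>
#include <string>

// Builds the 5-byte payload "ab\0cd" using a length-aware operation.
std::string make_binary_payload()
{
    const char raw[] = {'a', 'b', '\0', 'c', 'd'};
    std::string s;
    s.append(raw, sizeof raw);   // append(const char*, size_t): copies all 5 bytes
    return s;
}

// The const char* overloads stop at the first '\0', so data after it is lost.
std::string make_truncated_payload()
{
    const char raw[] = {'a', 'b', '\0', 'c', 'd', '\0'};
    std::string s;
    s += raw;                    // operator+=(const char*): copies only "ab"
    return s;
}
```

Note that `operator+=(const std::string&)` is length-aware too; only the `const char*` form is affected.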
61

My point of view is:

  • Never use char * if you don't call "C" code.
  • Always use std::string: It's easier, it's more friendly, it's optimized, it's standard, it will prevent you from having bugs, it's been checked and proven to work.
Gal Goldman
14

Raw string usage

Yes, sometimes you really can do this. When using const char *, char arrays allocated on the stack and string literals you can do it in such a way there is no memory allocation at all.

Writing such code often requires more thinking and care than using string or vector, but with proper techniques it can be done, and the code can be safe. When copying into a char [], you always need to either have some guarantee on the length of the string being copied, or check for and handle oversized strings gracefully. Not doing so is what gave the strcpy family of functions the reputation of being unsafe.
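
As a rough sketch of the "check and handle oversized strings" option, the helper below copies into a fixed-size array only if the text and its terminator fit. The name `copy_checked` and the choice to leave an empty string on failure are assumptions made for this example, not something from the answer.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into a fixed-size array and reports whether it fit.
// On overflow the destination is set to an empty string instead of being truncated silently.
template <std::size_t N>
bool copy_checked(char (&dst)[N], const char* src)
{
    static_assert(N > 0, "destination must have room for the terminator");
    const std::size_t len = std::strlen(src);
    if (len >= N) {                      // no room for the text plus '\0'
        dst[0] = '\0';
        return false;
    }
    std::memcpy(dst, src, len + 1);      // len + 1 includes the terminator
    return true;
}
```

Because the array is taken by reference, the size is deduced by the compiler and cannot get out of sync with the buffer.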

How templates can help writing safe char buffers

As for the safety of char [] buffers, templates can help, as they can create an encapsulation that handles the buffer size for you. Templates like this are implemented e.g. by Microsoft to provide safe replacements for strcpy. The example here is extracted from my own code; the real code has a lot more methods, but this should be enough to convey the basic idea:

#include <cstring>  // strncpy

template <int Size>
class BString
{
  char _data[Size];

  public:
  BString()
  {
    _data[0]=0;
    // note: last character will always stay zero
    // if not, overflow occurred
    // all constructors should contain last element initialization
    // so that it can be verified during destruction
    _data[Size-1]=0;
  }
  const BString &operator = (const char *src)
  {
    // copies at most Size-1 characters, so the terminator
    // stored in _data[Size-1] is never overwritten
    strncpy(_data,src,Size-1);
    return *this;
  }

  operator const char *() const {return _data;}
};

//! overloads that make conversion of C code easier 
template <int Size>
inline const BString<Size> & strcpy(BString<Size> &dst, const char *src)
{
  return dst = src;
}
Suma
  • 1
    +1 for "When using const char *, char arrays allocated on the stack and string literals you can do it in such a way there is no memory allocation at all." People forget that stack "allocation" is much much faster than heap. – NoSenseEtAl Oct 12 '11 at 12:02
  • `char*` strings aren't always on the stack. `char *str = (char*)malloc(1024); str[1023] = 0;` – Cole Tobin Apr 26 '13 at 19:27
  • @ColeJohnson I am not claiming that, I just mean that if you want your string to be stack allocated, you need to use const char * in conjunction with string literals, not std::string. – Suma Apr 27 '13 at 22:30
9

One occasion when you MUST use char* and not std::string is when you need static string constants. The reason is that you have no control over the order in which modules initialize their static variables, and another global object from a different module may refer to your string before it's initialized. http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Static_and_Global_Variables
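
A minimal sketch of the static-constant case, with invented names (`kDefaultName`, `default_name`, `default_name_string`). The second function is not from this answer: it shows a commonly used alternative, a function-local static, for the case where a std::string is still wanted.

```cpp
#include <cstring>
#include <string>

// Constant-initialized: the bytes are part of the program image, so the value
// is valid before any dynamic initialization runs in any translation unit.
static const char kDefaultName[] = "default";

// Hypothetical accessor another module might call during its own static initialization.
const char* default_name()
{
    return kDefaultName;
}

// Alternative (an addition, not from the answer): a function-local static is
// constructed on first use, which sidesteps the cross-module ordering problem.
const std::string& default_name_string()
{
    static const std::string name(kDefaultName);
    return name;
}
```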

std::string pros:

  • manages the memory for you (the string can grow, and the implementation will allocate a larger buffer for you)
  • Higher-level programming interface, works nicely with the rest of STL.

std::string cons:

  • two distinct STL string instances cannot share the same underlying buffer, so if you pass by value you always get a new copy.
  • there is some performance penalty, but I'd say unless your requirements are special it's negligible.

HugoTeixeira
thesamet
  • Actually, STL implementations often implement copy-on-write semantics for std::string, so passing them by value doesn't cost very much at all. Still, it's better not to rely on that, and generally better to pass a reference-to-const anyway. –  Apr 29 '09 at 08:06
  • 1
    Some std::string implementations gave up on COW implementation. Moreover it is not as trivial as it seems to provide a (POSIX) thread safe class compatible with the standard. See http://groups.google.fr/group/ifi.test.boa/browse_frm/thread/cb16ed54c3e78a78/215edbc9c7686fdd or http://groups.google.fr/group/comp.programming.threads/browse_frm/thread/dbdf76a8844bde5c/d8651dd45d13b862 – Luc Hermitte Apr 29 '09 at 12:27
8

You should consider using char* in the following cases:

  • This array will be passed as a parameter.
  • You know in advance the maximum size of your array (you know it OR you impose it).
  • You will not do any transformation on this array.

Actually, in C++, char* is often used for small fixed strings such as options, file names, etc.

Jérôme
5

When to use a c++ std::string:

  • strings, overall, are more secure than char*. Normally when you are doing things with char* you have to check things to make sure they are right; in the string class all this is done for you.
  • Usually when using char*, you will have to free the memory you allocated, you don't have to do that with string since it will free its internal buffer when destructed.
  • strings work well with c++ stringstream, formatted IO is very easy.
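
For the stringstream point, a small sketch of formatted output into a std::string with no manually sized buffer. The function name `describe` and the output format are made up for illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats a label and a number without sizing any buffer by hand.
std::string describe(const std::string& name, int count)
{
    std::ostringstream out;
    out << name << ": " << count << " item(s)";
    return out.str();   // the stream grows its own storage as needed
}
```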

When to use char*

  • Using char* gives you more control over what is happening "behind" the scenes, which means you can tune the performance if you need to.
Vishnu CS
user88637
5

Use (const) char* as parameters if you are writing a library. std::string implementations differ between different compilers.
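
One way such a library boundary might look, as a sketch only: the function `mylib_greet`, its parameters and its "return the full length" convention are all hypothetical. The point is that only built-in types appear in the signature, while std::string is still used inside.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical library entry point: only built-in types cross the boundary.
// Writes at most out_size bytes (including '\0') and returns the length the
// full result needs, so the caller can detect truncation.
// Assumes `name` is a valid null-terminated string.
std::size_t mylib_greet(const char* name, char* out, std::size_t out_size)
{
    const std::string result = std::string("Hello, ") + name;  // std::string stays internal
    if (out != nullptr && out_size > 0) {
        const std::size_t n =
            result.size() < out_size - 1 ? result.size() : out_size - 1;
        std::memcpy(out, result.data(), n);
        out[n] = '\0';
    }
    return result.size();
}
```

A caller compiled with a different standard library never sees the layout of std::string, which is the concern raised in this answer.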

Nemanja Trifunovic
  • If you're writing a library in C++, the layout of std::string isn't the only thing you have to worry about. There are hosts of potential incompatibilities between two implementations. Use libraries in C++ only if available in source or compiled for the exact compiler you're using. C libraries are typically more portable, but in that case you don't have std::string anyway. – David Thornley Apr 29 '09 at 15:14
  • True that std::string is not the only problem, but it is a bit too much to conclude "Use libraries in C++ only if available in source or compiled for the exact compiler you're using." There are component systems that work fine with different compilers (COM, for instance) and it is possible to expose a C interface to a library that is internally written with C++ (Win32 API, for instance) – Nemanja Trifunovic Apr 29 '09 at 17:11
3

If you want to use C libraries, you'll have to deal with C-strings. Same applies if you want to expose your API to C.
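
In practice this usually means converting at the edges. A minimal sketch with two invented wrappers: `c_str()` hands a std::string to a C function, and a char buffer filled by a C function is copied back into a std::string.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical wrapper: passes a std::string to the C function strtol via c_str().
// The pointer is only valid while `text` is alive and unmodified.
long parse_long(const std::string& text)
{
    return std::strtol(text.c_str(), nullptr, 10);
}

// Going the other way: a C function fills a char buffer, then the result is
// copied into a std::string for the rest of the C++ code.
std::string format_id(int id)
{
    char buf[32];
    std::snprintf(buf, sizeof buf, "id-%04d", id);  // always null-terminates
    return std::string(buf);
}
```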

n0rd
2

You can expect most operations on a std::string (e.g. find) to be as optimized as possible, so they're likely to perform at least as well as a pure C counterpart.

It's also worth noting that std::string iterators quite often map to pointers into the underlying char array. So any algorithm you devise on top of iterators is essentially identical to the same algorithm on top of char * in terms of performance.
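
A sketch of what "the same algorithm on top of iterators or char *" can look like: one template, instantiated once with std::string iterators and once with raw pointers. The function names are invented, and no claim is made here about the code a particular compiler generates.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical algorithm written once against a generic iterator range.
template <typename Iter>
std::size_t count_vowels(Iter first, Iter last)
{
    std::size_t n = 0;
    for (; first != last; ++first) {
        switch (*first) {
            case 'a': case 'e': case 'i': case 'o': case 'u': ++n; break;
            default: break;
        }
    }
    return n;
}

// Instantiated with std::string::const_iterator ...
std::size_t count_vowels_string(const std::string& s)
{
    return count_vowels(s.begin(), s.end());
}

// ... and with const char*, using strlen to find the end of the range.
std::size_t count_vowels_cstr(const char* s)
{
    return count_vowels(s, s + std::strlen(s));
}
```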

Things to watch out for are e.g. operator[] - most STL implementations do not perform bounds checking, and should translate this to the same operation on the underlying character array. AFAIK STLPort can optionally perform bounds checking, at which point this operator would be a little bit slower.

So what does using std::string gain you? It absolves you from manual memory management; resizing the array becomes easier, and you generally have to think less about freeing memory.

If you're worried about performance when resizing a string, there's a reserve function that you may find useful.
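
A minimal sketch of using reserve before a series of appends. The helper name `repeat` is made up; the idea is simply to request the final capacity once instead of letting the string grow step by step.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds a string of `count` copies of `piece`.
// reserve() requests the final capacity up front, so the appends in the loop
// should not need to reallocate.
std::string repeat(const std::string& piece, std::size_t count)
{
    std::string out;
    out.reserve(piece.size() * count);
    for (std::size_t i = 0; i < count; ++i) {
        out += piece;
    }
    return out;
}
```

reserve changes only the capacity, not the size, so the string is still empty until something is appended.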

1

If you are using the array of chars for text, use std::string: it is more flexible and easier to use. If you use it for something else, like data storage, use arrays (prefer vectors).

RvdK
1

Even when performance is crucial you had better use vector<char>: it allows memory allocation in advance (reserve() method) and will help you avoid memory leaks. Using vector::operator[] leads to an overhead, but you can always extract the address of the buffer and index it exactly as if it were a char*.
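
A rough sketch of the "extract the address of the buffer" idea, with an invented function `format_point`. One detail to keep in mind: the vector here is sized rather than only reserved, since writing through the raw pointer past size() is not allowed even when the capacity is there.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: formats into a vector<char> and returns it.
// The vector owns the memory, so there is nothing to free by hand.
std::vector<char> format_point(int x, int y)
{
    std::vector<char> buf(64);               // sized (not just reserved) so the bytes exist
    char* raw = buf.data();                  // same as &buf[0]; usable like a plain char*
    const int n = std::snprintf(raw, buf.size(), "(%d, %d)", x, y);
    if (n < 0) {
        buf.clear();                         // formatting error: return an empty buffer
    } else {
        // keep only the characters actually written (without the '\0')
        buf.resize(std::min(static_cast<std::size_t>(n), buf.size() - 1));
    }
    return buf;
}
```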

sharptooth
  • But it would be nice to use some kind of typical string functionality, and have just the option to specify the policy for the storage. For that see the link in my answer. – Anonymous Apr 29 '09 at 07:18
  • It is not really true. If you consider that vector will be allocated in contiguous memory space, reallocation (to increase the vector size) will not be efficient at all, as it implies the copy of the previous chunk. – Jérôme Apr 29 '09 at 07:29
  • I misunderstood your response, as you use the vector instead of char*, not instead of string... In this case I agree. – Jérôme Apr 29 '09 at 07:33
  • There shouldn't be an overhead in operator[] usage. See for instance, http://stackoverflow.com/questions/381621/using-arrays-or-stdvectors-in-c-whats-the-performance-gap – Luc Hermitte Apr 29 '09 at 12:00
-1

AFAIK most std::string implementations internally use copy-on-write, reference-counted semantics to avoid overhead, even if strings are not passed by reference.

piotr
  • 5
    This is no longer true, because copy on write causes serious scalability issues in a multithreaded environment. – Suma Apr 29 '09 at 08:22
  • It's true for GCC's implementation of the STL at least. –  Apr 29 '09 at 09:10