Is accessing the elements of a char* or std::string faster?

Question

I have seen char* vs std::string in c++, but am still wondering if accessing the elements of a char* is faster than std::string.

If you need to know, the char*/std::string will contain less than 80 characters, but I would like to know a cutoff if there is one.

I would also like to know the answer to this question for different compilers and different Operating Systems, if there is a difference.

Thanks in advance!

Edit: I would be accessing the elements using array[n], and would set the values once.

(Note: If this doesn't meet the help center, please let me know how I can reword it before down-voting)

The only way to know is to measure it in the scenario that you're thinking of. With optimizations switched on there should be no difference in accessing elements. — juanchopanza, Dec 24 '15 at 17:01
It's a little more complex than that, in that in *some* situations a char* will address compile-time constant data and the indexing can be performed at compile time, whereas any dynamic allocation done when strings are constructed precludes that, though short strings may be in a SSO intra-object buffer and optimised similarly. — user433534, Dec 24 '15 at 17:11
You did not specify: what kind of access? Are you only reading those characters? You will pay extra for write access to std::string, and even more - for insert (with possible re-allocation). — Vlad Feinstein, Dec 24 '15 at 17:13
A `char*` does not contain any characters. It's a pointer. Go to the whiteboard and write 100 times "an array is not a pointer". — Pete Becker, Dec 24 '15 at 17:15
@PeteBecker - was your comment addressed to me? I, obviously, meant "those" characters accessed via `char*` or contained within `std::string`. How did you take it? Maybe you should write something smart on your board? :) — Vlad Feinstein, Dec 24 '15 at 20:48
@VladFeinstein - your name doesn't appear in it, so it wasn't addressed to you. — Pete Becker, Dec 24 '15 at 22:35

Cornstalks · Accepted Answer · 2015-12-25T22:34:06.687

They should be equivalent in general, though std::string might be a tiny bit slower. Why? Because of short-string optimization.

Short-string optimization is a trick some implementations use to store short strings in std::string without allocating any memory. Usually this is done by doing something like this (though different variations exist):

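// Illustrative fragment of the string's internal storage
union {
    char* data_ptr;                    // used when the characters live in allocated memory
    char short_string[sizeof(char*)];  // used when the characters fit inside the object
};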

Then std::string can store the characters in the short_string array, but only if the string is short enough to fit there. If not, it needs to allocate memory and store the resulting pointer in data_ptr.

Depending on how short-string optimization is implemented, whenever you access data in a std::string, it needs to check its length and determine whether it's using short_string or data_ptr. This check is not totally free: it takes at least a couple of instructions and might cause some branch misprediction or inhibit prefetching in the CPU.

libc++ implements short-string optimization roughly along the lines of the union shown earlier, so it has to check whether the string is short or long on every access.
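To make the per-access check concrete, here is a minimal sketch of a string type that chooses between the two union members on every operator[] call. The class name BranchingSsoString, its is_short_ flag, and its members are all made up for illustration; this is not libc++'s real layout, only a toy with the same general shape.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical toy type, not libc++'s actual layout: a flag selects which
// union member is live, so every element access has to test the flag first.
class BranchingSsoString {
public:
    static constexpr std::size_t kShortCapacity = sizeof(char*);

    explicit BranchingSsoString(const char* text)
        : size_(std::strlen(text)), is_short_(size_ < kShortCapacity) {
        if (is_short_) {
            // Fits inside the object (including the terminating '\0').
            std::memcpy(storage_.short_string, text, size_ + 1);
        } else {
            // Too long: allocate and remember the pointer instead.
            storage_.data_ptr = new char[size_ + 1];
            std::memcpy(storage_.data_ptr, text, size_ + 1);
        }
    }
    ~BranchingSsoString() {
        if (!is_short_) delete[] storage_.data_ptr;
    }
    // Copying is disabled to keep the sketch short.
    BranchingSsoString(const BranchingSsoString&) = delete;
    BranchingSsoString& operator=(const BranchingSsoString&) = delete;

    char& operator[](std::size_t i) {
        assert(i < size_);
        // The short/long test runs on every access.
        char* base = is_short_ ? storage_.short_string : storage_.data_ptr;
        return base[i];
    }
    bool is_short() const { return is_short_; }
    std::size_t size() const { return size_; }

private:
    union Storage {
        char* data_ptr;
        char short_string[kShortCapacity];
    } storage_;
    std::size_t size_;
    bool is_short_;
};
```

The conditional in operator[] is the cost being discussed: it is cheap, but it sits on the path of every element access.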

libstdc++ uses short-string optimization, but they implement it slightly differently and actually avoid any extra access costs. Their union is between a short_string array and an allocated_capacity integer, which means their data_ptr can always point to the real data (whether it's in short_string or in an allocated buffer), so there aren't any extra steps needed when accessing it.
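The layout described above can be sketched in the same toy style. Again, PointerSsoString and its member names are hypothetical, and the 15-character local capacity is an assumption chosen for illustration; the real libstdc++ basic_string is considerably more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical toy type loosely modeled on the layout described above; it is
// not libstdc++'s real basic_string.
class PointerSsoString {
public:
    static constexpr std::size_t kLocalCapacity = 15;  // assumed for illustration

    explicit PointerSsoString(const char* text) : size_(std::strlen(text)) {
        if (size_ <= kLocalCapacity) {
            data_ptr_ = storage_.short_string;  // point into the object itself
        } else {
            data_ptr_ = new char[size_ + 1];    // point at an allocated buffer
            storage_.allocated_capacity = size_;
        }
        std::memcpy(data_ptr_, text, size_ + 1);
    }
    ~PointerSsoString() {
        if (!is_short()) delete[] data_ptr_;
    }
    // data_ptr_ may point into this object, so a default copy would be wrong;
    // copying is disabled to keep the sketch short.
    PointerSsoString(const PointerSsoString&) = delete;
    PointerSsoString& operator=(const PointerSsoString&) = delete;

    char& operator[](std::size_t i) {
        assert(i < size_);
        return data_ptr_[i];  // no short/long test on the access path
    }
    bool is_short() const { return data_ptr_ == storage_.short_string; }
    std::size_t size() const { return size_; }

private:
    char* data_ptr_;
    std::size_t size_;
    union Storage {
        char short_string[kLocalCapacity + 1];
        std::size_t allocated_capacity;
    } storage_;
};
```

Here the short/long distinction only matters when allocating or freeing; indexing goes straight through data_ptr_ in both cases.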

If std::string doesn't use short-string optimization (or if it's implemented like in libstdc++), then it should be the same as using a char*. I disagree with black's statement that there is an extra level of indirection in this situation. The compiler should be able to inline operator[] and it should be the same as directly accessing the internal data pointer in the std::string.
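As a rough illustration of the two access styles being compared, here is a small sketch with two hypothetical functions, sum_via_pointer and sum_via_string, that do the same work through a raw pointer and through std::string::operator[]. With optimizations enabled, a compiler will typically inline operator[] so the two loops end up very similar, but that is something to confirm in the generated code for your own toolchain rather than take on faith.

```cpp
#include <cstddef>
#include <string>

// Sums the characters through a raw pointer.
unsigned sum_via_pointer(const char* text, std::size_t length) {
    unsigned total = 0;
    for (std::size_t i = 0; i < length; ++i) {
        total += static_cast<unsigned char>(text[i]);
    }
    return total;
}

// Sums the characters through std::string::operator[].
unsigned sum_via_string(const std::string& text) {
    unsigned total = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        total += static_cast<unsigned char>(text[i]);
    }
    return total;
}
```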

Yes but you need to _ask_ the compiler to perform optimizations, otherwise you'll effectively call `operator[]` or `.data`. Implementations may decide to apply the as-if rule, but they are not required to. — edmz, Dec 24 '15 at 17:21
@black: sure, but given the premise and context of the question, I think it's safe to assume at least a basic level of optimization is enabled in the compiler. — Cornstalks, Dec 24 '15 at 17:26
It depends on the particular SSO implementation. libstdc++'s new `basic_string`, for instance, always stores a pointer that points to either the external data or the internal SSO buffer, and so doesn't need to check for indexing. — T.C., Dec 25 '15 at 03:52
@T.C.: Very interesting! It looks like they also use a slightly larger array so they can store longer short strings (at the cost of a few extra bytes in object size). — Cornstalks, Dec 25 '15 at 22:21

score 3 · Answer 2 · answered Dec 24 '15 at 17:07

Since you don't have direct access to the underlying CharT sequence, accessing it requires an extra layer through the public interface. So it could be slower, perhaps by 20-30 cycles when that call is not optimized away. Even then, you would probably see a difference only in a tight loop.

However, it's extremely easy to optimize this out considering the large range of techniques a compiler can employ (caching, inlining, non-standard function calls and so on) if you instruct it to.
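If you want to check this on your own compiler and flags, a rough timing harness along these lines may help. The helper name time_reads and its parameters are made up for this sketch, and micro-benchmarks like this are easy to get wrong (the optimizer may simplify the loop), so treat the numbers as a hint rather than proof.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical helper: makes `passes` read passes over `length` characters
// obtained through `at(i)` and returns the elapsed time in nanoseconds.
// The sum is written to `checksum` so the loop has an observable result.
template <typename Access>
long long time_reads(Access at, std::size_t length, int passes, unsigned& checksum) {
    const auto start = std::chrono::steady_clock::now();
    unsigned total = 0;
    for (int p = 0; p < passes; ++p) {
        for (std::size_t i = 0; i < length; ++i) {
            total += static_cast<unsigned char>(at(i));
        }
    }
    checksum = total;
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

One way to use it is to build a std::string of about 80 characters, take a const char* to the same data, and pass two lambdas: one returning s[i] and one returning p[i]. Comparing the two timings with and without optimization flags shows how much of the difference comes from the un-inlined call.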
