The question is simple: What is the quickest way of finding the position of the null-terminator ('\0') in a string buffer?

std::array<char, 100> str { "A sample string literal" };

I guess one option is to use std::strlen.
Another option that I can think of is std::find or maybe even std::ranges::find:

const auto posOfNull { std::find( str.cbegin( ), str.cend( ), '\0' ) };

Now would it make a difference if an ExecutionPolicy (e.g. std::execution::par) was passed to it as its first argument? If it would, then which policy is the appropriate one for this particular case?

Or maybe a 3rd option that I don't know about?

digito_evo
  • It is likely strlen will be the fastest, considering some implementations are vectorized: https://stackoverflow.com/questions/25566302/vectorized-strlen-getting-away-with-reading-unallocated-memory That being said, the fastest way is to track the length separately, if you don't mind sacrificing a few bytes. – freakish Apr 27 '22 at 08:12
  • @康桓瑋 Yes, I'll do. Also what about policies in case I want to try one of those with `std::find`? Which one is suitable? – digito_evo Apr 27 '22 at 08:18
  • @freakish *"track the length separately"*, I don't get it. How exactly? I would also want to mention that the above code is just a sample. In my program, it's actually the user that enters the string as the input. I remember I considered using `std::cin.gcount()` but that didn't go as I expected. – digito_evo Apr 27 '22 at 08:22
  • @digito_evo By using std::string instead of std::array. If the input comes from cin then yes, the length is evaluated once, but first, that happens in the background, and second, you should not worry about the performance of that operation: it is blazingly fast. Worrying about it is most likely premature optimization. – freakish Apr 27 '22 at 08:24
  • @freakish Oh. But it would be slow?! since you know... *dynamic allocations*. I chose `std::array` to reduce the number of allocations down to zero in some of the functions and only rely on stack. And the program runs super fast. I just want to make it better a bit without sacrificing code readability cause I see opportunities. Is there a way to force `std::string` to use a stack-based buffer or even better, a buffer with static storage duration? – digito_evo Apr 27 '22 at 08:32
  • @digito_evo it is possible to provide your own allocator to std::basic_string to obtain stack-based std::string. Still, I'm not sure why you are worried about this performance? You said this is an input from user via std::cin. And your code suggests you expect at most 100 chars. You won't notice a difference between any of those methods in terms of performance, but std::string is the most convenient. – freakish Apr 27 '22 at 08:36
  • "track the length separately" to me means finding the NULL when first creating the string, and storing that. Instead of doing `strlen` every time you need it. Whether this makes sense depends on how mutable the string is and how often you need to find NULLs. `strlen` is an *extremely* fast operation so multithreading would be out of the question unless you're dealing with hundreds(maybe thousands?) of very long (1mb+) strings. Synchronizing threads introduces immense overhead compared to `strlen`. – tenfour Apr 27 '22 at 08:46
  • @freakish How about `pmr::string`? Can it do the same thing with less headache (i.e. fewer lines of code)? – digito_evo Apr 27 '22 at 08:47
  • @digito_evo sorry, I'm not sure what that is. Sounds like boost's string, but I have 0 experience with boost. – freakish Apr 27 '22 at 08:48
  • @freakish `std::pmr::string`, a class that uses polymorphic allocators. – digito_evo Apr 27 '22 at 08:50
  • @tenfour *"finding the NULL when first creating the string, and storing that"* Yeah this is the most ideal solution. No need for searching, etc. – digito_evo Apr 27 '22 at 08:51
  • @digito_evo oh, wow, I was not aware there is such thing in std. I don't know how this polymorphic_allocator works, sorry. – freakish Apr 27 '22 at 08:53
  • The string literal known at compile time is just a simplified example of your general need, right? – Daniel Daranas Apr 27 '22 at 09:58
  • If you're managing strings you should definitely use std::string, not std::array. You're micro optimizing and your code will look uglier with that. Use an array of chars when you're interested in sorting any array of chars, not just strings. – Daniel Daranas Apr 27 '22 at 10:01
  • @Daniel Daranas Yes it's a simplified version. Suppose that the `str` buffer is filled using `std::cin.getline(str.data(), str.size());` – digito_evo Apr 27 '22 at 13:23
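The "track the length separately" idea from the comments above can be sketched for the `std::cin.getline` case. This is only an illustration, not code from the thread: `read_line_length` and `demo_length` are hypothetical names, and an in-memory `std::istringstream` stands in for `std::cin`. The sketch relies on `gcount()` counting the extracted newline even though `getline` does not store it.

```cpp
#include <array>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>

// Hypothetical helper: reads one line into buf and returns the number of
// characters stored before the '\0', without searching the buffer afterwards.
template <std::size_t N>
std::size_t read_line_length(std::istream& in, std::array<char, N>& buf)
{
    in.getline(buf.data(), static_cast<std::streamsize>(buf.size()));
    std::streamsize count = in.gcount();
    // If the stream is still good, getline stopped at the newline, which was
    // extracted (and counted by gcount) but not stored in the buffer.
    if (in.good() && count > 0)
        --count;
    // Otherwise the line ended at end-of-file or filled the buffer; in both
    // cases gcount() already equals the number of stored characters.
    return static_cast<std::size_t>(count);
}

// Usage sketch with an in-memory stream standing in for std::cin.
std::size_t demo_length()
{
    std::istringstream input{"hello\nworld"};
    std::array<char, 100> str{};
    return read_line_length(input, str);
}
```

Whether this is actually faster than searching for the terminator has to be measured for the program in question.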

1 Answer

What is the quickest way of finding the position of the null-terminator ('\0') in a string buffer?

I'm going to assume that you mean "to find the first null terminator".

Which approach is fastest depends on the details (buffer size, standard library implementation, compiler, hardware). You have to measure to find out. std::strlen is a fairly safe default choice.
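As a minimal sketch of the two candidates from the question, the functions below return the index of the first `'\0'` in the 100-element buffer. The names `Buffer`, `length_via_strlen`, and `length_via_find` are made up for this illustration.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstring>

using Buffer = std::array<char, 100>;

// Requires that buf really contains a '\0'; otherwise std::strlen reads
// past the end of the buffer, which is undefined behaviour.
std::size_t length_via_strlen(const Buffer& buf)
{
    return std::strlen(buf.data());
}

// Bounded by the buffer: yields buf.size() if no '\0' is present.
std::size_t length_via_find(const Buffer& buf)
{
    const auto pos = std::find(buf.cbegin(), buf.cend(), '\0');
    return static_cast<std::size_t>(pos - buf.cbegin());
}
```

Both give the same index for a properly terminated buffer, so either can be dropped into a benchmark to compare them on the actual target.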

Now would it make a difference if an ExecutionPolicy (e.g. std::execution::par) was passed to it as its first argument?

The length of your buffer is 100. The overhead of multithreading may easily exceed the time needed to find the terminator in a single thread.

If it would, then which policy is the appropriate one for this particular case?

The one that you have measured to be the fastest. None of them would break the correctness of the loop.

Or maybe a 3rd option that I don't know about?

You can use std::char_traits<char>::length. Its benefits over std::strlen are that it works with all character types (std::strlen handles only char), and that it is constexpr (since C++17), which allows using it in constant expressions.
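A short sketch of both benefits, assuming a C++17 (or later) compiler. The variable names and the `length_of` helper are hypothetical and only serve the illustration.

```cpp
#include <array>
#include <cstddef>
#include <string>  // std::char_traits

// Evaluated at compile time; requires C++17 or later.
constexpr std::size_t narrow_len =
    std::char_traits<char>::length("A sample string literal");
static_assert(narrow_len == 23, "length of the narrow literal");

// The same call works for other character types, unlike std::strlen.
constexpr std::size_t wide_len = std::char_traits<wchar_t>::length(L"wide");
static_assert(wide_len == 4, "length of the wide literal");

// Runtime use on a buffer. Like std::strlen, this assumes that buf
// actually contains a '\0'.
template <std::size_t N>
std::size_t length_of(const std::array<char, N>& buf)
{
    return std::char_traits<char>::length(buf.data());
}
```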

eerorika
  • *"The overhead from multi threading may easily exceed the time to find the terminator in a single thread."* Good point. How about 200 chars? I'll have to benchmark. – digito_evo Apr 27 '22 at 08:38
  • @digito_evo `How about 200 chars? I'll have to benchmark.` You've answered your own question :) – eerorika Apr 27 '22 at 08:39
  • `std::strlen` assumes there is a `'\0'` to find, `strnlen` would be safer when looking into a buffer. – Jarod42 Apr 27 '22 at 09:04
  • @Jarod42 In cases where the existence of a null terminator is unknown, you mustn't use `std::strlen`. Interestingly, there's no corresponding char_traits function for `strnlen`. – eerorika Apr 27 '22 at 09:32
  • So I tried all of these functions and the results were the same. Even the size of the binary didn't change. Apparently, they all generate the same code in my program. And so I think the best one is `std::char_traits::length` which is both readable and flexible. I will however try to solve this issue by using `gcount()` and retrieving the length of the string at the time of its creation so that no searching operation is required to find the end of the string. I guess this can speed up the code. – digito_evo Apr 27 '22 at 22:23
  • Ok, so I finally decided to use `std::char_traits::length` over `gcount()` since it was faster for some reason contrary to my own assumptions! Thanks for the suggestion. – digito_evo Apr 29 '22 at 07:48
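For the case raised in the comments above, where the buffer might not contain a terminator at all, a bounded search avoids reading past the end. Since `strnlen` is not part of standard C++, the sketch below uses `std::memchr` instead. `bounded_length` is a hypothetical name and this is only one possible way to write it.

```cpp
#include <array>
#include <cstddef>
#include <cstring>
#include <optional>

// Hypothetical helper: returns the index of the first '\0' in buf,
// or std::nullopt if the buffer holds no terminator at all.
template <std::size_t N>
std::optional<std::size_t> bounded_length(const std::array<char, N>& buf)
{
    // std::memchr never looks beyond the buf.size() bytes it is given.
    const void* hit = std::memchr(buf.data(), '\0', buf.size());
    if (hit == nullptr)
        return std::nullopt;
    return static_cast<std::size_t>(static_cast<const char*>(hit) - buf.data());
}
```

The caller can then decide how to handle an unterminated buffer instead of invoking undefined behaviour.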