Going ones through string to count length takes longer time, than moving string a couple of times?

Question

I wrote the following two functions. In the second function, I used reserve() so that there is no memory reallocation, but unfortunately the second function is slower than the first.

I used release mode and this CPU profiler in Visual Studio to count time. In the second function, reallocation takes place 33 times. So my question is: Really? Going one length string to count length takes longer time, than moving this string 33 times?

string commpres2(string str)
{
    string strOut;
    int count = 0;
    for (int i = 0; i < str.length(); ++i)
    {
        ++count;
        if (i < str.length() - 1)
        {
            if (str[i + 1] != str[i])
            {
                strOut += str[i];
                strOut += to_string(count);
                count = 0;
            }
        }
        else
        {
            strOut += str[i] + to_string(count);
        }
    }
    return strOut.length() < str.length() ? strOut : str;
}

string commpres3(string str)
{
    int compressedLength = 0;
    int countConsecutive = 0;
    for (int i = 0; i < str.length(); ++i)
    {
        ++countConsecutive;
        if (i + 1 >= str.length() || str[i] != str[i + 1]) 
        {
            compressedLength += 1 + 
                to_string(countConsecutive).length();
            countConsecutive = 0;
        }
    }
    if (compressedLength >= str.length())
        return str;
    string strOut;
    strOut.reserve(compressedLength);
    int count = 0;
    for (int i = 0; i < str.length(); ++i)
    {
        ++count;
        if (i < str.length() - 1)
        {
            if (str[i + 1] != str[i])
            {
                strOut += str[i];
                strOut += to_string(count);
                count = 0;
            }
        }
        else
        {
            strOut += str[i] + to_string(count);
        }
    }
    return strOut;
}

int main()
{
    string str = "aabcccccaaa";

    //str.size ~ 11000000;
    for (int i = 0; i < 20; ++i)
        str += str;
    commpres2(str); //107ms //30,32% CPU
    commpres3(str); //147ms //42,58% CPU
}

Probably need to replace `to_string(countConsecutive).length()` with something that calculates the digits in the number mathematically. The amount of work you are doing to calculate the string length probably outweighs the benefits of reserve — Alan Birtles, Feb 12 '20 at 07:16
Besides, your algorithm cannot be used. For instance, how can you decompress "a5"? The original string can be either "a5" or "aaaaa". Or you are sure that the original strings never contain digits? — PooSH, Feb 12 '20 at 07:56

Remy Lebeau · Accepted Answer · 2020-02-12T22:20:29.320

The 2nd function is doing more work than the 1st function, so of course it is going to take longer. Profiling the code should have shown you exactly where the code is spending its time. For instance, the 1st function loops through the str at most 1 time, but the 2nd function may loop through the same str 2 times, which by definition takes longer.

And you haven't eliminated all memory allocations from the 2nd function, either. to_string() allocates memory, and you are calling it many times before and after calling reserve(). Eliminating all of the to_string() allocations is fairly simple, using std::snprintf() into a local buffer and then std::string::append() to add that buffer to your output std::string.

You could forgo all of the pre-calculating and just reserve() the full str length even if you don't end up using all of that memory. You are not going to use up more than the original str length in the worse case scenario (no compression possible at all):

inline int to_buffer(size_t number, char *buf, size_t bufsize)
{
    return snprintf(buf, bufsize, "%zu", number);
}

string commpres3(const string &str)
{
    string::size_type strLen = str.length();
    string strOut;
    strOut.reserve(strLen);
    size_t count = 0;
    char buf[25];
    for (string::size_type i = 0; i < strLen; ++i)
    {
        ++count;
        if (i < strLen - 1)
        {
            if (str[i + 1] != str[i])
            {
                strOut += str[i];
                strOut.append(buf, to_buffer(count, buf, sizeof(buf)));
                count = 0;
            }
        }
        else
        {
            strOut += str[i];
            strOut.append(buf, to_buffer(count, buf, sizeof(buf)));
        }
        if (strOut.length() >= strLen)
            return str;
    }
    return strOut;
}

Or, if you must pre-calculate, you can replace the 1st set of to_string() calls with something else that returns the needed length without allocating memory dynamically (see this for ideas). When calculating the size to reserve, you don't need to actually convert an integer 123 to an allocated string "123" to know that it would take up 3 chars.

inline int to_buffer(size_t number, char *buf, size_t bufsize)
{
    return snprintf(buf, bufsize, "%zu", number);
}

inline int to_buffer_length(size_t number)
{
    return to_buffer(number, nullptr, 0);
}

string commpres3(const string &str)
{
    string::size_type strLen = str.length();
    string::size_type compressedLength = 0;
    size_t countConsecutive = 0;
    for (string::size_type i = 0; i < strLen; ++i)
    {
        ++countConsecutive;
        if (i < (strLen - 1))
        {
            if (str[i + 1] != str[i])
            {
                strOut += 1 + to_buffer_length(countConsecutive);
                countConsecutive = 0;
            }
        }
        else
        {
            strOut += 1 + to_buffer_length(countConsecutive);
        }
    }
    if (compressedLength >= strLen)
        return str;
    string strOut;
    strOut.reserve(compressedLength);
    size_t count = 0;
    char buf[25];
    for (string::size_type i = 0; i < strLen; ++i)
    {
        ++count;
        if (i < strLen - 1)
        {
            if (str[i + 1] != str[i])
            {
                strOut += str[i];
                strOut.append(buf, to_buffer(count, buf, sizeof(buf)));
                count = 0;
            }
        }
        else
        {
            strOut += str[i];
            strOut.append(buf, to_buffer(count, buf, sizeof(buf)));
        }
    }
    return strOut;
}

`std::to_string(int)` doesn't allocate memory if the string implementation uses SSO (short string optimization). — PooSH, Feb 12 '20 at 08:12
@PooSH perhaps, but that is not guaranteed in every implementation. — Remy Lebeau, Feb 12 '20 at 11:17
I tried to implement it in turns. First, I changed only one line in my code: `compressedLength += 1 + to_string(countConsecutive).length();` to: `compressedLength += 1 + snprintf(nullptr, 0, "%zu", countConsecutive);` But then I got the worst result: 479ms. — wojtek, Feb 13 '20 at 07:26
@wojtek that was just an example. There are other ways to calculate the char length for an integer. See [this link](https://stackoverflow.com/q/1489830/65863) — Remy Lebeau, Feb 13 '20 at 15:41

PooSH · Answer 2 · 2020-02-12T11:45:42.063

33 memory allocations vs ~11000000 extra if statements.

You are doing if (i < str.length() - 1) check in every iteration but you need to do it only once.

Consider the following:

   if (str.empty()) return str;
   const auto last = str.length() - 1;
   for (size_t i = 0; i < last; ++i)
   {
       ++count;
       if (str[i + 1] != str[i])
       {
           strOut += str[i];
           strOut += to_string(count);
           count = 0;
       }
   }
   strOut += str[last] + to_string(count);

Some optimization hints:

You can avoid adding count if it equals to one. Otherwise, your algorithm "compresses" "abc" to "a1b1c1".
Add an indicator that the following byte is a count not a regular character to distinguish between "a5" and "aaaaa". For instance, use 0xFF. Hence, "a5" gets encoded to "a5", but "aaaaa" -> {'a', 0xFF, 5}
Store count in binary form, not ASCII. For instance, you can write 3 (0x03) instead of '3' (0x33). You can use one byte to store count up to 255.

constexpr char COMPRESS_COUNT_SEPARATOR = 0xFF;

string compress(const string &str)
{
    string strOut;
    if (str.empty()) return strOut;

    unsigned char count = 0;
    const auto last = str.length() - 1;
    for (size_t i = 0; i < last; ++i)
    {
        ++count;
        if (str[i + 1] != str[i] || count == 255)
        {
            strOut += str[i];
            if (count > 1) {
               strOut += COMPRESS_COUNT_SEPARATOR;
               strOut += static_cast<char>(count);
            }
            count = 0;
        }
    }
    strOut += str[last];
    if (count) {
        strOut += COMPRESS_COUNT_SEPARATOR;
        strOut += static_cast<char>(count+1);
    }

    return strOut;
}

Or you can even use 0x00 as COMPRESS_COUNT_SEPARATOR because C-strings cannot contain null terminators but std::string can.

Ooops, my suggested algorithm doesn't check for `count == 2` case when two characters expand to three. But I leave it for you as homework :) — PooSH, Feb 12 '20 at 09:01

Going ones through string to count length takes longer time, than moving string a couple of times?

2 Answers2