How to uppercase match in C++ regex_replace?

Question

I am looking for a simple way to convert C++ strings with underscores to camelCase, e.g. my_simple_humble_string to mySimpleHumbleString.

This is easy in Perl. I prefer not to use Boost.

You can use a `for` loop and check whether each character is an underscore: if so, delete the underscore and uppercase the next character (don't forget to reduce the length of the string by 1); otherwise continue to the next char. — Ami Hollander, Jul 06 '16 at 20:13

Aconcagua · Answer 1 · 2016-07-07T09:40:00.397

According to this site, it is not supported. I could not find any other hints contradicting that.

On the other hand, it is not too difficult to do it by hand:

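#include <cctype>
#include <string>

std::string camelCase(std::string const& input)
{
    std::string s;
    s.reserve(input.length());
    bool isMakeUpper = false; // true right after an underscore
    for(char c : input)
    {
        if(c == '_')
        {
            isMakeUpper = true; // drop the underscore, uppercase the next character
        }
        else if(isMakeUpper)
        {
            // cast to unsigned char first: passing a negative char to toupper is undefined
            s += static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
            isMakeUpper = false;
        }
        else
        {
            s += c;
        }
    }
    return s;
}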

Edit: in-place variant:

void camelCase(char* input)
{
    bool isMakeUpper = false;
    char* pos = input;
    for(char* c = input; *c; ++c)
    {
        if(*c == '_')
        {
            isMakeUpper = true;
        }
        else if(isMakeUpper)
        {
            *pos++ = static_cast<char>(std::toupper(static_cast<unsigned char>(*c)));
            isMakeUpper = false;
        }
        else
        {
            *pos++ = *c;
        }
    }
    *pos = 0;
}

Edit 2: in-place variant for strings:

void camelCase(std::string& input)
{
    bool isMakeUpper = false;
    std::string::iterator pos = input.begin();
    for(char c : input)
    {
        if(c == '_')
        {
            isMakeUpper = true;
        }
        else if(isMakeUpper)
        {
            *pos++ = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
            isMakeUpper = false;
        }
        else
        {
            *pos++ = c;
        }
    }
    input.resize(pos - input.begin());
}
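
If a regex-based version is still wanted, one possible workaround is to let the regex only find the matches and do the uppercasing by hand, since the replacement format of `std::regex_replace` has no case-conversion escape. The sketch below is an assumption-laden illustration, not part of the original answer: the function name `camelCaseRegex` and the pattern `_+([^_])` are hypothetical choices. Unlike the loops above, it keeps a trailing underscore because nothing follows it to capitalize.

```cpp
#include <cctype>
#include <regex>
#include <string>

// Hypothetical helper: walks over every "underscores + one character" match
// and appends the uppercased character instead of the whole match.
std::string camelCaseRegex(const std::string& input)
{
    // One or more underscores followed by a single non-underscore character.
    static const std::regex underscoreThenChar("_+([^_])");

    std::string result;
    result.reserve(input.size());

    auto last = input.cbegin(); // end of the previous match
    for (std::sregex_iterator it(input.cbegin(), input.cend(), underscoreThenChar), end;
         it != end; ++it)
    {
        const std::smatch& m = *it;
        result.append(last, m[0].first); // text between matches, copied unchanged
        const char captured = *m[1].first;
        result += static_cast<char>(std::toupper(static_cast<unsigned char>(captured)));
        last = m[0].second;
    }
    result.append(last, input.cend()); // remainder after the last match
    return result;
}
```

Calling `camelCaseRegex("my_simple_humble_string")` yields `mySimpleHumbleString`. The hand-written loops are likely to be faster, so this is mainly of interest when the pattern is more complex than a single underscore.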

Upvoted, but in the first solution you don't need the `std::stringstream` - it's only there to slow down your code. An `std::string` and `+=` instead of `<<` will do. — Matteo Italia, Jul 06 '16 at 21:05
@MatteoItalia Updated - including reserving appropriate buffer size in advance... — Aconcagua, Jul 07 '16 at 09:35

score -1 · Answer 2 · edited Apr 12 '17 at 07:31

Regexes should be used sparingly; see "Now you Have 2 Problems".

This is a good example of a case where they are not needed. Given `auto foo = "my_simple_humble_string"s`, we can do:

auto count = 0;

for (auto read = 1; read < size(foo); ++read) {
    if (foo[read] == '_') {
        ++count;
        ++read;
        foo[read - count] = toupper(static_cast<unsigned char>(foo[read]));
    } else {
        foo[read - count] = foo[read];
    }
}
foo[size(foo) - count] = foo[size(foo)];
foo.resize(size(foo) - count);

Live Example

A couple notes on the algorithm:

I will read from foo[size(foo)] which is undefined behavior prior to C++11: http://en.cppreference.com/w/cpp/string/basic_string/operator_at
I will not replace an opening underscore; a closing underscore is dropped without capitalizing anything, and consecutive underscores are not handled
I resize the string to match the length of the camel case string after converting, which will invalidate iterators.
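
To make these notes easier to check, here is a sketch that wraps the same offset-based loop in a function. The name `camelCaseByOffset` is hypothetical, and the indices are switched to `std::size_t` to avoid mixing signed and unsigned values; the logic is otherwise meant to follow the loop above, assuming C++11 or later so that reading `foo[foo.size()]` is defined.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical wrapper around the offset-based loop from the answer.
// Takes the string by value so the caller's copy is left untouched.
std::string camelCaseByOffset(std::string foo)
{
    std::size_t count = 0; // number of underscores skipped so far

    // Starting at index 1 means a leading underscore is never touched.
    for (std::size_t read = 1; read < foo.size(); ++read)
    {
        if (foo[read] == '_')
        {
            ++count;
            ++read; // step onto the character after the underscore
            // For a trailing underscore this reads foo[foo.size()], which is
            // the terminating '\0' since C++11.
            foo[read - count] = static_cast<char>(
                std::toupper(static_cast<unsigned char>(foo[read])));
        }
        else
        {
            foo[read - count] = foo[read];
        }
    }
    foo.resize(foo.size() - count); // invalidates iterators into foo
    return foo;
}
```

With this sketch, `"my_simple_humble_string"` becomes `mySimpleHumbleString`, `"_abc"` is returned unchanged, `"ab_"` becomes `ab`, and `"a__b"` becomes `a_b`, which shows the consecutive-underscore limitation raised in the comments.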

Jonathan Mee · answered Jul 06 '16 at 20:21 · edited Apr 12 '17 at 07:31 (Community)

You probably meant something like `i = foo.find('_', i)` instead of `++i` (plus the initialization, and beware that `std::string::npos + 1` overflows). – Zereges Jul 06 '16 at 20:35
Even with the correction from Zereges, this solution is O(n*m) (with m being the number of underscores) for no reason; this transformation can easily be done in place in a single pass. – Matteo Italia Jul 06 '16 at 20:44
@Zereges Ugh, I didn't type the `if`. Note the range on my loop, I won't ever remove a leading or trailing underscore. – Jonathan Mee Jul 06 '16 at 20:52
@MatteoItalia I would have thought this would run in *O(n)* time. What are you seeing that I don't? – Jonathan Mee Jul 06 '16 at 20:52
Isn't `.erase` O(n)? – melpomene Jul 06 '16 at 20:55
@melpomene Now that I look at it, I do believe that [you are correct](http://www.cplusplus.com/reference/string/basic_string/erase/#complexity). We could certainly do this by copying... not my first inclination but I suppose it's doable. – Jonathan Mee Jul 06 '16 at 21:01
@JonathanMee: yep, `.erase` - almost any time you have an `erase` of a vector or a string in a loop (unless it's at the end) it can be written more efficiently. As almost always happens, this can be written via the usual read pointer/write pointer loop (see @Aconcagua's second answer). – Matteo Italia Jul 06 '16 at 21:03
Now you have `std::toupper(static_cast<unsigned char>(foo[i]))` problems. – melpomene Jul 06 '16 at 21:04
Now you are never incrementing `write`. – Matteo Italia Jul 06 '16 at 21:10
@MatteoItalia Ugh well that was just one debacle after another. Thanks for the help. – Jonathan Mee Jul 06 '16 at 21:20
@JonathanMee: still broken `asd_def_ghi` => `ad_Df_Gi`. Also, how you set it up it wouldn't handle consecutive underscores. – Matteo Italia Jul 06 '16 at 21:22
@melpomene I don't think that was a problem. It was intentional: http://stackoverflow.com/q/21805674/2642059 But then I keep making mistakes on this answer... so who knows if I did it right. – Jonathan Mee Jul 06 '16 at 21:22
@MatteoItalia Whew, now I know why Aconcagua switched to a `char*` for replacement. That was more complicated than I expected. – Jonathan Mee Jul 06 '16 at 21:55
@JonathanMee: not really, you can use indexes or pointers, it's mostly the same. You just have to implement the basic algorithm correctly (which is mostly the same as what `std::remove` implements, for example). – Matteo Italia Jul 06 '16 at 22:13
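
The read index/write index loop mentioned in the comments above can be sketched with plain indices, in the spirit of how `std::remove` compacts a range. This is only an illustration under assumptions: `camelCaseIndexed` is a hypothetical name, and the choice to drop every underscore (leading, trailing, or consecutive) is one possible policy, not something the thread settles on.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical single-pass, in-place variant using two indices.
// `write` never overtakes `read`, so no unread character is overwritten.
void camelCaseIndexed(std::string& s)
{
    std::size_t write = 0;  // next position to store a kept character
    bool upperNext = false; // set after one or more underscores

    for (std::size_t read = 0; read < s.size(); ++read)
    {
        const char c = s[read];
        if (c == '_')
        {
            upperNext = true; // skip the underscore itself
        }
        else if (upperNext)
        {
            s[write++] = static_cast<char>(
                std::toupper(static_cast<unsigned char>(c)));
            upperNext = false;
        }
        else
        {
            s[write++] = c;
        }
    }
    s.resize(write); // cut off the leftover tail in one step
}
```

Because the string is shrunk once at the end instead of calling `erase` per underscore, the whole transformation stays linear in the length of the input.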
