An algorithm to "spacify" CamelCased strings

Question

This is pretty basic, but I just had to implement it for a project I'm working on, and I'm curious how others would implement this algorithm and whether there are any clever tricks to optimize it.

Given a string in CamelCase, how would you go about "spacifying" it?

e.g. given FooBarGork I want Foo Bar Gork back.

Here is my algorithm in C#:


static void Main(string[] args)
{
    Console.WriteLine(UnCamelCase("FooBarGork"));
}
public static string UnCamelCase(string str)
{
    StringBuilder sb = new StringBuilder();
    for (int i =  0; i < str.Length; i++)
    {
        if (char.IsUpper(str, i) && i > 0) sb.Append(" ");
        sb.Append(str[i]);
    }
    return sb.ToString();
}

Since you have to visit every character once, I believe the best case is O(n). How would you implement this?

Very similar to http://stackoverflow.com/questions/323314/best-way-to-convert-pascal-case-to-a-sentence — Garry Shutler, Jan 27 '09 at 16:46

score 18 · Answer 1 · edited May 23 '17 at 12:26

I can already sense the flames, but I like regex for this kind of stuff.

public static string UnCamelCase(string str)
{
    return Regex.Replace(str, "([a-z])([A-Z])", "$1 $2");
}

(This may not be faster than your implementation, but to me it is clearer.)

And obviously, this would be even faster (at runtime):

private static Regex _unCamelRegex = new Regex("([a-z])([A-Z])", RegexOptions.Compiled);

public static string UnCamelCase(string str)
{
    return _unCamelRegex.Replace(str, "$1 $2");
}

This would handle the issue brought up by Pete Kirkham below (as far as camel-cased strings like HTTPRequest):

private static Regex _unCamelRegex1 = new Regex("([a-z])([A-Z])", RegexOptions.Compiled);
private static Regex _unCamelRegex2 = new Regex("([A-Z]+)([A-Z])([a-z])", RegexOptions.Compiled);

public static string UnCamelCase(string str)
{
    return _unCamelRegex2.Replace(_unCamelRegex1.Replace(str, "$1 $2"), "$1 $2$3");
}

This one takes HTTPRequestFOOBarGork and returns HTTP Request FOO Bar Gork
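
For anyone following along outside .NET, the same two-pass idea can be sketched in Python. This is only an illustration that assumes ASCII letters; the name `un_camel_case` and the two pattern names are made up for the example and are not part of the answer's code.

```python
import re

# Pass 1: a lower-case letter followed by an upper-case one ("oB" -> "o B").
_LOWER_UPPER = re.compile(r"([a-z])([A-Z])")
# Pass 2: a run of capitals followed by a capitalised word ("HTTPRe" -> "HTTP Re").
_ACRONYM_WORD = re.compile(r"([A-Z]+)([A-Z])([a-z])")


def un_camel_case(text: str) -> str:
    """Hypothetical Python port of the two-regex approach (ASCII only)."""
    spaced = _LOWER_UPPER.sub(r"\1 \2", text)
    return _ACRONYM_WORD.sub(r"\1 \2\3", spaced)


print(un_camel_case("HTTPRequestFOOBarGork"))  # HTTP Request FOO Bar Gork
```

The order of the passes matters less than it may seem here, because the first pattern never fires inside a run of capitals and the second never fires at a lower-to-upper boundary.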

So I tested the iterative method against the regular expression method, using the OP's implementation (with the 'start at 1 and skip the > 0 check' change) and my second version (the one with the static compiled Regex object). Note that the results do not include the compilation time of the Regex. For 2 million calls (using the same FooBarGork input):

Iterative: 00:00:00.80
Regex: 00:00:06.71

So the iterative approach is clearly much more efficient, roughly eight times faster in this test. I've included a fixed version of the OP's implementation (as suggested by Jason Punyon; any credit should go to him) that also takes into account a null or empty argument:

public static string UnCamelCaseIterative(string str)
{
    if (String.IsNullOrEmpty(str))
        return str;

    /* Note that the .ToString() is required, otherwise the char is implicitly
     * converted to an integer and the wrong overloaded ctor is used */
    StringBuilder sb = new StringBuilder(str[0].ToString());
    for (int i = 1; i < str.Length; i++)
    {
        if (char.IsUpper(str, i))
            sb.Append(" ");
        sb.Append(str[i]);
    }
    return sb.ToString();
}
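
The timing code itself is not shown above. A rough way to set up a comparison of this kind, sketched here in Python with the standard `timeit` module, might look like the following. The function names and the call count are assumptions for illustration, and the absolute numbers will not match the C# figures quoted above.

```python
import re
import timeit

_UN_CAMEL = re.compile(r"([a-z])([A-Z])")  # compiled once, outside the timed code


def un_camel_regex(text: str) -> str:
    return _UN_CAMEL.sub(r"\1 \2", text)


def un_camel_iterative(text: str) -> str:
    if not text:
        return text
    parts = [text[0]]  # the first character never gets a leading space
    for ch in text[1:]:
        if ch.isupper():
            parts.append(" ")
        parts.append(ch)
    return "".join(parts)


if __name__ == "__main__":
    calls = 200_000  # assumed count; raise it for steadier numbers
    for fn in (un_camel_iterative, un_camel_regex):
        seconds = timeit.timeit(lambda: fn("FooBarGork"), number=calls)
        print(f"{fn.__name__}: {seconds:.2f}s for {calls} calls")
```

Note that the two functions only agree on inputs such as FooBarGork; on HTTPRequest the iterative one splits every capital, so a fair comparison should stick to inputs where both produce the same result.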

The default for Regex.Replace is to replace all occurrences, so no flag is necessary. — Sean Bright, Jan 27 '09 at 16:38
I shouldn't flame this, as it's a valid solution, but there is really no need for RegEx for such a task. As a point of interest, it's worth posting, but I wouldn't recommend using it in any proper code. (I am rather unbelieving of how you consider it to be more readable!) — Noldorin, Jan 27 '09 at 16:44
@Noldorin - it may not be for everyone, but with the only alternative being the iterative approach, I consider this much more concise. It's obvious to me by looking at the regex that I am looking for a lowercase letter followed by an uppercase one, and replacing them with themselves and a space. — Sean Bright, Jan 27 '09 at 16:49
I came in here to say how regex tickles my heart, only to find Noldorin being a downer again. Don't worry, Noldorin, BE HAPPY! — , Jan 27 '09 at 17:11
@Sean: that's quite fine. I understand that it's a largely subjective thing (as are many things in coding). If it's readable to you and the OP, then it's all good. I just thought that it's always good to post an alternative viewpoint. — Noldorin, Jan 27 '09 at 17:22
@Will: I'm not going to incite another flame war that you seem to want to start. Let's try and be courteous and leave things how they are. — Noldorin, Jan 27 '09 at 17:42
I'd say that if you find this regex unclear then you probably need to study regexes and work with them much more. Keeping your eyes wide shut isn't a good practice. — PEZ, Jan 28 '09 at 10:14

score 2 · Answer 2 · answered Jan 27 '09 at 16:36

Why not start i at 1?

You'll get to eliminate the && i>0 check...

Restore the Data Dumps

As is, that leaves the first character left out of the StringBuilder. Easily addressed, but worth noting. – Travis Nov 03 '14 at 23:32

score 1 · Answer 3 · answered Jan 27 '09 at 16:37

Usually my decamelisation methods are a bit more complex, as "HTTPRequest" should become "HTTP Request" rather than "H T T P Request", and different applications handle digits differently too.

Pete Kirkham

+1 - Good point. I've added an implementation of this in my answer. – Sean Bright Jan 27 '09 at 16:56
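
A single-pass alternative that keeps acronyms together, in the spirit of this answer, might look like the Python sketch below. It is not Pete Kirkham's code: the name `decamelise` and the digit policy (digits stay attached to the word before them) are assumptions chosen for the example, and other applications may well want a different rule.

```python
def decamelise(text: str) -> str:
    """Split camel case in one pass, keeping runs of capitals together."""
    out = []
    for i, ch in enumerate(text):
        if i > 0 and ch.isupper():
            prev = text[i - 1]
            nxt = text[i + 1] if i + 1 < len(text) else ""
            # A capital right after a lower-case letter or a digit starts a word.
            starts_word = prev.islower() or prev.isdigit()
            # Last capital of a run that begins a new word: the "R" in "HTTPRequest".
            ends_acronym = prev.isupper() and nxt.islower()
            if starts_word or ends_acronym:
                out.append(" ")
        out.append(ch)
    return "".join(out)


print(decamelise("HTTPRequest"))  # HTTP Request
print(decamelise("Utf8Decoder"))  # Utf8 Decoder
```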

score 1 · Answer 4 · answered Jan 27 '09 at 16:58

And here's a PHP example

function spacify($str) {
  // Single-quoted replacement: inside double quotes PHP reads "\1" as an octal escape.
  return preg_replace('/([a-z])([A-Z])/', '$1 $2', $str);
}

TJ L

score 0 · Answer 5 · answered Jan 27 '09 at 16:35

Looking at your code, it seems that it's somehow been mangled (when you copied it over perhaps). Apart from fixing the for loop, I assume you're just missing an if statement with a char.IsUpper call around the sb.Append(" ") bit. Otherwise it's all fine of course. You're not going to get any better than O(n) for a generic string.

Now there is obviously a one-line RegEx replace call to accomplish this, but really there's no reason to do such things for such a simple task. Always best to avoid RegEx when you can for the purposes of readability.

Ok, so you fixed the code and it's just what I expected. Your implementation is perfectly fine. — Noldorin, Jan 27 '09 at 16:38

score 0 · Answer 6 · answered Jan 27 '09 at 16:35

I'd probably do it in a similar way, just maybe instead of a stringbuilder go with:

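// For each i > 0 where str[i] is upper-case. Insert is used here because
// Replace would change every occurrence of that character, not just this one.
str = str.Insert(i++, " ");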

I'm pretty sure your way ends up being more efficient though.

Ian Jacobs

score 0 · Answer 7 · answered Jan 27 '09 at 16:36

I'd go with...

public static string UnCamelCase(string str) {
    Regex reg = new Regex("([A-Z])");

    return reg.Replace(str, " $1").Trim();
}

Stoo

score 0 · Answer 8 · answered Jan 27 '09 at 16:36

Some regex flavors know the "\u" (upper-case) and "\U" (lower-case) character classes. They can replace this:

(?<=\U)(?=\u)

with a space. For flavors that do not know these classes, this will do:

(?<=[a-z])(?=[A-Z])   // replace with a single space again

Explanation: The regex matches the spot between a lower-case and an upper-case character. CamelCasedWords are the only constructs where this usually happens.

CamelCasedWord
    ^^   ^^           // match occurs between the ^

Full Unicode support can be achieved with: (?<=\p{Ll})(?=\p{Lu}) — Tomalak, Jan 27 '09 at 16:41
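
The zero-width version can be tried out in Python, whose built-in `re` module supports lookbehind and lookahead but not the `\p{Ll}` / `\p{Lu}` classes, so this sketch sticks to the ASCII ranges. The names `spacify` and `_BOUNDARY` are assumptions for illustration.

```python
import re

# Matches the empty position between a lower-case and an upper-case ASCII letter.
_BOUNDARY = re.compile(r"(?<=[a-z])(?=[A-Z])")


def spacify(text: str) -> str:
    """Insert a space at every lower-to-upper boundary; no characters are consumed."""
    return _BOUNDARY.sub(" ", text)


print(spacify("CamelCasedWord"))  # Camel Cased Word
```

Because nothing is captured, the replacement is just the space itself, with no `$1 $2` style back-references needed.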

score 0 · Answer 9 · answered Jan 27 '09 at 16:39

Something like this (Python)?

>>> import re
>>> s = 'FooBarGork'
>>> s[0] + re.sub(r'([A-Z])', r' \1', s[1:])
'Foo Bar Gork'

PEZ

Ian G · Answer 10 · 2009-01-27T16:47:11.633

Not very exciting but:

    public static string UnCamelCase(string str)
    {
        StringBuilder sb = new StringBuilder();

        foreach (char c in str.ToCharArray())
        {
            // Only A-Z (65-90); a bare "<= 90" would also match digits, spaces and punctuation.
            if (c >= 'A' && c <= 'Z') sb.Append(" ");
            sb.Append(c);
        }
        return sb.ToString().Trim();
    }


        //Console.WriteLine(System.Convert.ToInt32('a')); // 97
        //Console.WriteLine(System.Convert.ToInt32('z')); // 122
        //Console.WriteLine(System.Convert.ToInt32('A')); // 65
        //Console.WriteLine(System.Convert.ToInt32('Z')); // 90

TJ L · Answer 11 · 2009-01-27T16:58:31.577

Here's how the mootools javascript library does it (although they 'hyphenate', it's pretty easy to swap the hyphen for a space).

/*
Property: hyphenate
    Converts a camelCased string to a hyphen-ated string.

Example:
    >"ILikeCookies".hyphenate(); //"I-like-cookies"
*/

hyphenate: function(){
    return this.replace(/\w[A-Z]/g, function(match){
        return (match.charAt(0) + '-' + match.charAt(1).toLowerCase());
    });
},
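
To make the "swap the hyphen for a space" remark concrete, here is a rough Python rendering of the same regex with the separator turned into a parameter. The `separator` argument is an assumption added for the example and is not part of mootools.

```python
import re


def hyphenate(text: str, separator: str = "-") -> str:
    """Join camel-cased words with `separator`, lower-casing each inner capital."""
    return re.sub(
        r"\w[A-Z]",
        lambda m: m.group(0)[0] + separator + m.group(0)[1].lower(),
        text,
    )


print(hyphenate("ILikeCookies"))       # I-like-cookies
print(hyphenate("ILikeCookies", " "))  # I like cookies
```

Like the original, this lower-cases the capital it splits on, so it is not a drop-in replacement for the C# versions above, which preserve case.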

score 0 · Answer 12 · answered Jul 05 '18 at 04:31

To get the index of the first upper-case character

short syntax

Regex.Match("hello,World!", @"(\p{Lu})").Index

result 6

long example

using System.Text.RegularExpressions;

namespace MyApp.Helpers // placeholder name; "namespace" is a reserved word and cannot be used as an identifier
{
    public static class Helper
    {
        public static int IndexOfUppercase(this string str, int startIndex = 0)
        {
            return str.IndexOfRegex(@"(\p{Lu})", startIndex);
        }

        public static int IndexOfRegex(this string str, string regex, int startIndex )
        {
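            // Search the tail, then convert the hit back to an index in the original string.
            int index = str.Substring(startIndex).IndexOfRegex(regex);
            return index < 0 ? index : index + startIndex;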
        }

        public static int IndexOfRegex(this string str, string regex)
        {
            var match = Regex.Match(str, regex);
            if (match.Success)
            {
                return match.Index;
            }
            return -1;
        }
    }
}

score 0 · Answer 13 · answered Aug 09 '11 at 14:43

echo "FooBarGork" | sed -r 's/([A-Z])/ \1/g;s/^ //'

user unknown
