3

Pretty basic: I just had to implement this for a project I'm working on, and I'm curious how others would implement it and whether there are any clever tricks to optimize it.

Given a string in CamelCase, how would you go about "spacifying" it?

e.g. given FooBarGork I want Foo Bar Gork back.

Here is my algorithm in C#:


static void Main(string[] args)
{
    Console.WriteLine(UnCamelCase("FooBarGork"));
}
public static string UnCamelCase(string str)
{
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < str.Length; i++)
    {
        if (char.IsUpper(str, i) && i > 0) sb.Append(" ");
        sb.Append(str[i]);
    }
    return sb.ToString();
}

Since you have to visit every character once, I believe the best case is O(n). How would you implement this?

mmattax

13 Answers

18

I can already sense the flames, but I like regex for this kind of stuff.

public static string UnCamelCase(string str)
{
    return Regex.Replace(str, "([a-z])([A-Z])", "$1 $2");
}

(This may not be faster than your implementation, but to me it is clearer.)

Caching a compiled Regex in a static field avoids rebuilding it on every call, so this should be faster at runtime:

private static Regex _unCamelRegex = new Regex("([a-z])([A-Z])", RegexOptions.Compiled);

public static string UnCamelCase(string str)
{
    return _unCamelRegex.Replace(str, "$1 $2");
}

This version handles the issue brought up by Pete Kirkham below (camel-cased strings that contain acronyms, such as HTTPRequest):

private static Regex _unCamelRegex1 = new Regex("([a-z])([A-Z])", RegexOptions.Compiled);
private static Regex _unCamelRegex2 = new Regex("([A-Z]+)([A-Z])([a-z])", RegexOptions.Compiled);

public static string UnCamelCase(string str)
{
    return _unCamelRegex2.Replace(_unCamelRegex1.Replace(str, "$1 $2"), "$1 $2$3");
}

This one takes HTTPRequestFOOBarGork and returns HTTP Request FOO Bar Gork
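
For anyone following along in Python (which also shows up further down this thread), here is a minimal sketch of the same two-pass idea. The function name un_camel_case and the pattern variable names are made up for illustration; the patterns themselves are the two shown above.

```python
import re

# Pass 1: a lowercase letter directly followed by an uppercase letter.
_LOWER_UPPER = re.compile(r"([a-z])([A-Z])")
# Pass 2: a run of capitals followed by a capital that starts a normal word.
_ACRONYM = re.compile(r"([A-Z]+)([A-Z])([a-z])")


def un_camel_case(s: str) -> str:
    """Insert spaces at camel-case boundaries, keeping acronyms together."""
    first_pass = _LOWER_UPPER.sub(r"\1 \2", s)
    return _ACRONYM.sub(r"\1 \2\3", first_pass)


if __name__ == "__main__":
    print(un_camel_case("HTTPRequestFOOBarGork"))
```

Like the C# version, this only looks at ASCII letters, so digits and non-ASCII capitals are left untouched.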


So I tested the iterative method against the regular expression method, using the OP's implementation (with the 'start at 1 and skip the > 0 check' change) and my second version (the one with the static compiled Regex object). Note that the results do not include the compilation time of the Regex. For 2 million calls (using the same FooBarGork input):

Iterative: 00:00:00.80
Regex: 00:00:06.71

So the iterative approach is much more efficient here (roughly 8 times faster). I've included a fixed version of the OP's implementation (as suggested by Jason Punyon, any credit should go to him) that also takes into account a null or empty argument:

public static string UnCamelCaseIterative(string str)
{
    if (String.IsNullOrEmpty(str))
        return str;

    /* Note that the .ToString() is required, otherwise the char is implicitly
     * converted to an integer and the wrong overloaded ctor is used */
    StringBuilder sb = new StringBuilder(str[0].ToString());
    for (int i = 1; i < str.Length; i++)
    {
        if (char.IsUpper(str, i))
            sb.Append(" ");
        sb.Append(str[i]);
    }
    return sb.ToString();
}
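
If you want to run a similar comparison yourself, a rough sketch in Python using timeit could look like the following. The function names and the default call count are assumptions for illustration, and the numbers it prints will not match the C# figures above (the ratio may well be different in another runtime).

```python
import re
import timeit

# Precompiled once, mirroring the static compiled Regex in the C# version.
_LOWER_UPPER = re.compile(r"([a-z])([A-Z])")


def un_camel_regex(s: str) -> str:
    return _LOWER_UPPER.sub(r"\1 \2", s)


def un_camel_iterative(s: str) -> str:
    if not s:
        return s
    parts = [s[0]]  # first character never gets a leading space
    for ch in s[1:]:
        if ch.isupper():
            parts.append(" ")
        parts.append(ch)
    return "".join(parts)


def compare(sample: str = "FooBarGork", calls: int = 200_000):
    """Return (iterative_seconds, regex_seconds) for `calls` invocations each."""
    iterative = timeit.timeit(lambda: un_camel_iterative(sample), number=calls)
    regex = timeit.timeit(lambda: un_camel_regex(sample), number=calls)
    return iterative, regex


if __name__ == "__main__":
    it, rx = compare()
    print(f"Iterative: {it:.2f}s  Regex: {rx:.2f}s")
```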
Sean Bright
  • with a multiple-match flag, sure – Jason S Jan 27 '09 at 16:36
  • The default for Regex.Replace is to replace all occurrences, so no flag is necessary. – Sean Bright Jan 27 '09 at 16:38
  • I shouldn't flame this, as it's a valid solution, but there is really no need for RegEx for such a task. As a point of interest, it's worth posting, but I wouldn't recommend using it in any proper code. (I am rather unbelieving of how you consider it to be more readable!) – Noldorin Jan 27 '09 at 16:44
  • @Noldorin - it may not be for everyone, but with the only alternative being the iterative approach, I consider this much more concise. It's obvious to me by looking at the regex that I am looking for a lowercase letter followed by an uppercase one, and replacing them with themselves and a space. – Sean Bright Jan 27 '09 at 16:49
  • I came in here to say how regex tickles my heart, only to find Noldorin being a downer again. Don't worry, Noldorin, BE HAPPY! –  Jan 27 '09 at 17:11
  • @Sean: that's quite fine. I understand that it's a largely subjective thing (as are many things in coding). If it's readable to you and the OP, then it's all good. I just thought that it's always good to post an alternative viewpoint. – Noldorin Jan 27 '09 at 17:22
  • @Will: I'm not going to incite another flame war that you seem to want to start. Let's try and be courteous and leave things how they are. – Noldorin Jan 27 '09 at 17:42
  • I'd say that if you find this regex unclear then you probably need to study regexes and work with them much more. Keeping your eyes wide shut isn't a good practice. – PEZ Jan 28 '09 at 10:14
2

Why not start i at 1?

You'll get to eliminate the && i>0 check...

Restore the Data Dumps
  • As is, that leaves the first character left out of the StringBuilder. Easily addressed, but worth noting. – Travis Nov 03 '14 at 23:32
1

Usually my decamelisation methods are a bit more complex, as "HTTPRequest" should become "HTTP Request" rather than "H T T P Request", and different applications handle digits differently too.
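
To make the "HTTP Request" case concrete, here is one possible single-pass sketch in Python. The name decamelise and the rule it uses are assumptions, not the code from my applications: a space goes before a capital that follows a lowercase letter, or before the last capital of a run when a lowercase letter comes next. Digits are simply left attached to whatever precedes them, which is only one of several reasonable choices.

```python
def decamelise(s: str) -> str:
    """Split camel case into words while keeping acronym runs together."""
    out = []
    for i, ch in enumerate(s):
        if i > 0 and ch.isupper():
            prev = s[i - 1]
            next_is_lower = i + 1 < len(s) and s[i + 1].islower()
            # Boundary 1: lowercase then uppercase ("fooBar").
            # Boundary 2: end of an acronym ("HTTPRequest" splits before "R").
            if prev.islower() or (prev.isupper() and next_is_lower):
                out.append(" ")
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(decamelise("HTTPRequest"))
```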

Pete Kirkham
1

And here's a PHP example

function spacify($str) {
  // Single quotes matter: in a double-quoted string, "\1" is an octal escape, not a backreference
  return preg_replace('/([a-z])([A-Z])/', '$1 $2', $str);
}
TJ L
0

Looking at your code, it seems that it's somehow been mangled (when you copied it over perhaps). Apart from fixing the for loop, I assume you're just missing an if statement with a char.IsUpper call around the sb.Append(" ") bit. Otherwise it's all fine of course. You're not going to get any better than O(n) for a generic string.

Now there is obviously a one-line RegEx replace call to accomplish this, but really there's no reason to do such things for such a simple task. Always best to avoid RegEx when you can for the purposes of readability.

Noldorin
0

I'd probably do it in a similar way, just maybe instead of a stringbuilder go with:

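// Note: Replace affects every occurrence of str[i], and i must then skip past the inserted space
str = str.Replace(str[i].ToString(), " " + str[i]);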

I'm pretty sure your way ends up being more efficient though.

Ian Jacobs
0

I'd go with...

public static string UnCamelCase(string str) {
    Regex reg = new Regex("([A-Z])");

    return reg.Replace(str, " $1").Trim();
}
Stoo
0

Some regex flavors know the "\u" (upper-case) and "\U" (lower-case) character classes. They can replace this:

(?<=\U)(?=\u)

with a space. For flavors that do not support these classes, this will do:

(?<=[a-z])(?=[A-Z])   // replace with a single space again

Explanation: The regex matches the spot between a lower-case and an upper-case character. CamelCasedWords are the only constructs where this usually happens.

CamelCasedWord
    ^^   ^^           // match occurs between the ^
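
As a quick illustration of the zero-width match, here is a small Python sketch using the [a-z]/[A-Z] variant of the pattern. The function name spacify is just a placeholder.

```python
import re

# Matches the empty position between a lowercase and an uppercase letter.
_BOUNDARY = re.compile(r"(?<=[a-z])(?=[A-Z])")


def spacify(s: str) -> str:
    """Insert a space at each lowercase-to-uppercase boundary."""
    return _BOUNDARY.sub(" ", s)


if __name__ == "__main__":
    print(spacify("CamelCasedWord"))
```

Because nothing is consumed by the match, no backreferences are needed in the replacement.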
Tomalak
0

Something like this (Python)?

>>> import re
>>> s = 'FooBarGork'
>>> s[0] + re.sub(r'([A-Z])', r' \1', s[1:])
'Foo Bar Gork'
PEZ
0

Not very exciting but:

    public static string UnCamelCase(string str)
    {
        StringBuilder sb = new StringBuilder();

        foreach (char c in str)
        {
            // ASCII 'A'..'Z' is 65..90; check both bounds so digits and punctuation are not treated as capitals
            if (c >= 'A' && c <= 'Z') sb.Append(" ");
            sb.Append(c);
        }
        return sb.ToString().Trim();
    }


        //Console.WriteLine(System.Convert.ToInt32('a')); // 97
        //Console.WriteLine(System.Convert.ToInt32('z')); // 122
        //Console.WriteLine(System.Convert.ToInt32('A')); // 65
        //Console.WriteLine(System.Convert.ToInt32('Z')); // 90
Ian G
0

Here's how the MooTools JavaScript library does it (although they 'hyphenate', it's pretty easy to swap the hyphen for a space).

/*
Property: hyphenate
    Converts a camelCased string to a hyphen-ated string.

Example:
    >"ILikeCookies".hyphenate(); //"I-like-cookies"
*/

hyphenate: function(){
    return this.replace(/\w[A-Z]/g, function(match){
        return (match.charAt(0) + '-' + match.charAt(1).toLowerCase());
    });
},
TJ L
0

To get the index of the first uppercase character:

Short syntax:

Regex.Match("hello,World!", @"(\p{Lu})").Index

Result: 6

Longer example (extension methods):

using System.Text.RegularExpressions;

namespace MyApp.Helpers // placeholder name; "namespace" is a reserved word and cannot be used as an identifier
{
    public static class Helper
    {
        public static int IndexOfUppercase(this string str, int startIndex = 0)
        {
            return str.IndexOfRegex(@"(\p{Lu})", startIndex);
        }

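        public static int IndexOfRegex(this string str, string regex, int startIndex)
        {
            // Search the tail, then shift the hit back into the original string's coordinates
            int index = str.Substring(startIndex).IndexOfRegex(regex);
            return index < 0 ? -1 : index + startIndex;
        }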

        public static int IndexOfRegex(this string str, string regex)
        {
            var match = Regex.Match(str, regex);
            if (match.Success)
            {
                return match.Index;
            }
            return -1;
        }
    }
}
Yitzhak Weinberg
0
echo "FooBarGork" | sed -r 's/([A-Z])/ \1/g;s/^ //'
user unknown