78

Possible Duplicate:
Regular expression, split string by capital letter but ignore TLA

I have a string that is a combination of several words, each of which is capitalized.
For example: SeveralWordsString

Using C#, how do I split the string into "Several Words String" in a smart way?

Thanks!

Nir
  • Splitting suggests that you want an array of strings, but it looks like you rather want to insert spaces in the string? – Guffa Dec 20 '10 at 11:00

5 Answers

104

Use this regex (I forget which Stack Overflow answer I got it from; I'll search for it now):

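// Extension method: it must be declared inside a static class.
public static string ToLowercaseNamingConvention(this string s, bool toLowercase)
{
    if (!toLowercase)
        return s;

    // Zero-width split points (whitespace in the pattern is ignored):
    //   1. between an acronym and the next capitalized word (USA|With)
    //   2. between a non-capital and a capital (Today|I)
    //   3. between a letter and a non-letter (Word|2)
    var r = new Regex(@"
        (?<=[A-Z])(?=[A-Z][a-z]) |
        (?<=[^A-Z])(?=[A-Z]) |
        (?<=[A-Za-z])(?=[^A-Za-z])", RegexOptions.IgnorePatternWhitespace);

    // Join the pieces with underscores and lowercase the result.
    return r.Replace(s, "_").ToLower();
}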

I use it in this project: http://www.ienablemuch.com/2010/12/intelligent-brownfield-mapping-system.html

[EDIT]

I found it now: How do I convert CamelCase into human-readable names in Java?

It splits "TodayILiveInTheUSAWithSimon" nicely, with no space in front of "Today":

using System;
using System.Text.RegularExpressions;

namespace TestSplit
{
    class MainClass
    {
        public static void Main (string[] args)
        {
            // Zero-width split points; whitespace in the pattern is ignored.
            var r = new Regex(@"
                (?<=[A-Z])(?=[A-Z][a-z]) |
                (?<=[^A-Z])(?=[A-Z]) |
                (?<=[A-Za-z])(?=[^A-Za-z])", RegexOptions.IgnorePatternWhitespace);

            string s = "TodayILiveInTheUSAWithSimon";

            // YYY and ZZZ mark the ends, showing that no leading or trailing space is added.
            Console.WriteLine( "YYY{0}ZZZ", r.Replace(s, " "));
        }
    }
}

Output:

 YYYToday I Live In The USA With SimonZZZ
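
The same pattern can be wrapped in a reusable helper. The following is only a sketch and not part of the original answer: the class name `HumanReadable` and the method name `ToWords` are made up for illustration, and the pattern is written on one line so that `RegexOptions.IgnorePatternWhitespace` is not needed.

```csharp
using System.Text.RegularExpressions;

public static class HumanReadable
{
    // Same three alternatives as the regex above, written without whitespace.
    private static readonly Regex WordBoundary = new Regex(
        @"(?<=[A-Z])(?=[A-Z][a-z])|(?<=[^A-Z])(?=[A-Z])|(?<=[A-Za-z])(?=[^A-Za-z])");

    // Hypothetical helper: inserts a single space at every split point.
    public static string ToWords(string s)
    {
        return WordBoundary.Replace(s, " ");
    }
}
```

With this sketch, `HumanReadable.ToWords("TodayILiveInTheUSAWithSimon")` gives "Today I Live In The USA With Simon", and because of the third alternative `HumanReadable.ToWords("Html5Parser")` gives "Html 5 Parser".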
Michael Buen
90
string[] SplitCamelCase(string source) {
    return Regex.Split(source, @"(?<!^)(?=[A-Z])");
}

Sample:

https://dotnetfiddle.net/0DEt5m
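
Since the question asks for a single spaced string rather than an array, the pieces can be joined back together. This is a minimal sketch, not part of the original answer; the class name `CamelCaseDemo` and the helper `ToSpaced` are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class CamelCaseDemo
{
    // Same pattern as above: split before every capital that is not at the start.
    public static string[] SplitCamelCase(string source)
    {
        return Regex.Split(source, @"(?<!^)(?=[A-Z])");
    }

    // Hypothetical helper: joins the pieces with single spaces.
    public static string ToSpaced(string source)
    {
        return string.Join(" ", SplitCamelCase(source));
    }
}
```

`CamelCaseDemo.ToSpaced("SeveralWordsString")` returns "Several Words String". Note that this pattern splits before every capital, so an acronym such as "USA" comes out as "U S A".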

nik0lai
  • Good answer. Use `return string.Join(" ", Regex.Split(value, @"(?<!^)(?=[A-Z](?![A-Z]|$))"));` if you don't want uppercase abbreviations to be split. – śmiglidigli May 12 '17 at 15:39
  • An old thread, but I found this useful. This is an extension method I adapted from this answer: `public static string SplitCamelCase(this string input, string delimeter = " ") { return input.Any(char.IsUpper) ? string.Join(delimeter, Regex.Split(input, "(?<!^)(?=[A-Z])")) : input; }`. It lets you specify the delimiter, and it returns the input string without running the regex if the input contains no capital letters. Sample usage: `var s = myString.SplitCamelCase();` or `var s = myString.SplitCamelCase(" ,");` – Anders Aug 04 '17 at 18:47
  • This splits `CamelCase` and `pascalCase` – Themelis Feb 10 '20 at 03:24
40

You can just loop through the characters, and add spaces where needed:

string theString = "SeveralWordsString";

StringBuilder builder = new StringBuilder();
foreach (char c in theString) {
  if (Char.IsUpper(c) && builder.Length > 0) builder.Append(' ');
  builder.Append(c);
}
theString = builder.ToString();
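
The loop above puts a space before every capital, so a run such as "USA" becomes "U S A". If acronyms should stay together, one possible variation is sketched below. It is not part of the original answer; the class name `LoopSplitDemo` and the method name `InsertSpaces` are hypothetical, and the rule used (break before a capital only when the previous character is not a capital, or when the next character is lowercase) is an assumption about what "smart" should mean here.

```csharp
using System.Text;

public static class LoopSplitDemo
{
    // Hypothetical helper: inserts spaces at word starts, keeping acronym runs intact.
    public static string InsertSpaces(string text)
    {
        var builder = new StringBuilder(text.Length + 8);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (i > 0 && char.IsUpper(c))
            {
                bool prevIsUpper = char.IsUpper(text[i - 1]);
                bool nextIsLower = i + 1 < text.Length && char.IsLower(text[i + 1]);
                // Break at the start of a new word or at the end of an acronym run.
                if (!prevIsUpper || nextIsLower) builder.Append(' ');
            }
            builder.Append(c);
        }
        return builder.ToString();
    }
}
```

For example, `LoopSplitDemo.InsertSpaces("TodayILiveInTheUSAWithSimon")` returns "Today I Live In The USA With Simon".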
LukeH
Guffa
6
public static IEnumerable<string> SplitOnCapitals(string text)
{
    Regex regex = new Regex(@"\p{Lu}\p{Ll}*");
    foreach (Match match in regex.Matches(text))
    {
        yield return match.Value;
    }
}

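Because the pattern uses the Unicode categories \p{Lu} (uppercase letter) and \p{Ll} (lowercase letter), this will handle non-ASCII letters properly. Note that it only returns runs that start with an uppercase letter, so leading lowercase text and digits are skipped.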
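
A short sketch of how this behaves on non-ASCII input follows. It is illustrative only: the wrapper class `UnicodeSplitDemo` and the helper `ToSpaced` are hypothetical names, and the sample string assumes precomposed characters.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UnicodeSplitDemo
{
    // Same method as above: one uppercase letter followed by any lowercase letters.
    public static IEnumerable<string> SplitOnCapitals(string text)
    {
        Regex regex = new Regex(@"\p{Lu}\p{Ll}*");
        foreach (Match match in regex.Matches(text))
        {
            yield return match.Value;
        }
    }

    // Hypothetical helper: joins the matched words with single spaces.
    public static string ToSpaced(string text)
    {
        return string.Join(" ", SplitOnCapitals(text));
    }
}
```

`UnicodeSplitDemo.ToSpaced("ŁódźŚródmieście")` returns "Łódź Śródmieście", which an ASCII-only class such as `[A-Z]` would not split. Text before the first capital is not matched, so `UnicodeSplitDemo.ToSpaced("camelCase")` returns just "Case".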

StanislawSwierc
4
string str1 = "SeveralWordsString";
string newstring = "";
for (int i = 0; i < str1.Length; i++)
{
    // Insert a space before each capital, except at the start of the string.
    if (i > 0 && char.IsUpper(str1[i]))
        newstring += " ";
    newstring += str1[i].ToString();
}
Rajesh Kumar G