3

I'm trying to use regex to convert a string like "North Korea" to a string like "northKorea". Does anyone know how I might accomplish this in C#?

Cheers

mike
  • Why regex? Why not replace all spaces with empty strings and lowercase the first character? – Bazzz Aug 19 '11 at 06:48

7 Answers

7

If you know all your input strings are in title case (like "North Korea"), you can simply do:

string input = "North Korea"; 
input = input.Replace(" ",""); //remove spaces
string output = char.ToLower(input[0]) + 
              input.Substring(1); //make first char lowercase
                                  // output = "northKorea"

If some of your input is not in title case, you can use TextInfo.ToTitleCase:

string input = "NoRtH kORea"; 
input = System.Globalization.CultureInfo.CurrentCulture.TextInfo.ToTitleCase(input);
input = input.Replace(" ",""); //remove spaces
string output = char.ToLower(input[0]) + 
          input.Substring(1); //make first char lowercase
                              // output = "northKorea"
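One caveat worth sketching: TextInfo.ToTitleCase leaves words that are entirely uppercase untouched (it treats them as acronyms), so an input like "NORTH KOREA" would not be converted by the snippet above. A minimal variant, assuming it is acceptable to lowercase the whole input first, might look like this. The class and method names (TitleCaseSketch, ToLowerCamel) and the choice of the invariant culture are illustrative assumptions, not part of the original answer.

```csharp
using System;
using System.Globalization;

static class TitleCaseSketch
{
    // Hypothetical helper: lowercases the input before title-casing it,
    // so all-uppercase words are not skipped by ToTitleCase.
    public static string ToLowerCamel(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        // Assumption: the invariant culture is good enough for the caller.
        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;

        string titled = textInfo.ToTitleCase(textInfo.ToLower(input));
        string joined = titled.Replace(" ", ""); // remove spaces
        if (joined.Length == 0) return joined;   // input was only spaces

        // make the first char lowercase
        return char.ToLowerInvariant(joined[0]) + joined.Substring(1);
    }
}
```

Lowercasing first means genuine acronyms lose their capitalization too, so this only suits input where that is acceptable.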
Paolo Falabella
4

Forget regex.
All you need is a camelCase conversion algorithm:

See here: http://www.codekeep.net/snippets/096fea45-b426-40fd-8beb-dec49d8a8662.aspx

Use this one:

string camelCase = ConvertCaseString(a, Case.CamelCase);

Copy-pasted in case it goes offline:

void Main() {
    string a = "background color-red.brown";
    string camelCase = ConvertCaseString(a, Case.CamelCase);
    string pascalCase = ConvertCaseString(a, Case.PascalCase);
}

/// <summary>
/// Converts the phrase to specified convention.
/// </summary>
/// <param name="phrase"></param>
/// <param name="cases">The cases.</param>
/// <returns>string</returns>
static string ConvertCaseString(string phrase, Case cases)
{
    string[] splittedPhrase = phrase.Split(' ', '-', '.');
    var sb = new StringBuilder();

    if (cases == Case.CamelCase)
    {
        // camelCase: lowercase the first word and skip it in the loop below
        sb.Append(splittedPhrase[0].ToLower());
        splittedPhrase[0] = string.Empty;
    }

    foreach (string s in splittedPhrase)
    {
        if (s.Length > 0)
        {
            // uppercase the first character, keep the rest of the word unchanged
            sb.Append(char.ToUpper(s[0]));
            sb.Append(s.Substring(1));
        }
    }
    return sb.ToString();
}

enum Case
{
    PascalCase,
    CamelCase
}
Stefan Steiger
  • +1 Nice approach, but it does feel a little like shooting a mosquito with a cannon. It's still good and relevant info, though, even if only to let the OP rethink his/her Regex ideas. :) – Bazzz Aug 19 '11 at 07:00
3

You could just split it and put it back together:

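string[] split = "North Korea".Split(' ');

StringBuilder sb = new StringBuilder();

for (int i = 0; i < split.Length; i++)
{
    if (i == 0)
        sb.Append(split[i].ToLower()); // lowercase the whole first word
    else
        sb.Append(split[i]);           // keep the remaining words as they are
}

string output = sb.ToString(); // "northKorea"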

Edit: Switched to a StringBuilder instead, like Bazzz suggested.

Joakim Johansson
  • +1 I like the approach, but string concats are not such a good idea for performance and memory management. Perhaps you should add the same example with a StringBuilder. – Bazzz Aug 19 '11 at 06:57
  • I agree with @Bazzz every time Split/Concat is used a unicorn dies. – Jonathan Dickinson Aug 19 '11 at 09:17
2

This builds on Paolo Falabella's answer as a String extension and handles a few boundary cases such as empty string. Since there is some confusion between CamelCase and camelCase, I called it LowerCamelCase as described on Wikipedia. I resisted the temptation to go with nerdCaps.

internal static string ToLowerCamelCase( this string input )
{
    string output = "";            
    if( String.IsNullOrEmpty( input ) == false  )
    {
        output = System.Globalization.CultureInfo.CurrentCulture.TextInfo.ToTitleCase( input ); //in case not Title Case
        output = output.Replace( " ", "" ); //remove any white spaces between words
        if( String.IsNullOrEmpty( output ) == false )  //handles the case where input is "  "
        {
            output = char.ToLower( output[0] ) + output.Substring( 1 ); //lowercase first (even if 1 character string)
        }
    }
    return output;
}

Use:

string test = "Foo Bar";
test = test.ToLowerCamelCase();
// test is now "fooBar"

Update: toong raised a good point in the comments: this will not work for graphemes, that is, when the first letter is made of combining characters or a surrogate pair. See the link provided by toong if you want to tweak the above code to iterate graphemes with StringInfo.
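As a rough illustration of that tweak, the first text element can be extracted with StringInfo.GetNextTextElement and lowercased as a whole, instead of indexing output[0]. This is only a sketch: the class and method names (GraphemeSketch, LowerFirstTextElement) are made up for the example, and it uses the invariant culture as an assumption.

```csharp
using System;
using System.Globalization;

static class GraphemeSketch
{
    // Hypothetical helper: lowercases the first text element (grapheme)
    // rather than the first UTF-16 char.
    public static string LowerFirstTextElement(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        // The first text element may span several chars
        // (base letter + combining marks, or a surrogate pair).
        string first = StringInfo.GetNextTextElement(input, 0);

        return first.ToLowerInvariant() + input.Substring(first.Length);
    }
}
```

It could replace the char.ToLower( output[0] ) + output.Substring( 1 ) line in the extension method above.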

acarlon
  • I like your approach. One very minor issue (which I guess applies to all answers here): output[0] might fail if the first letter is composed of combining characters ("a\u0304\u0308" = "ā̈") or a surrogate pair ("\uD950\uDF21", one code point stored as two chars). You might want to use StringInfo.GetNextTextElement; see: http://msdn.microsoft.com/en-us/library/y0hcb622.aspx – toong Dec 03 '13 at 08:25
  • @toong - that is a good point. I have updated to reference your comment and added some links. – acarlon Dec 03 '13 at 10:16
1

Try the following:

var input = "Hi my name is Rony";
var subStrs = input.ToLower().Split(' ');
var output = "";

for (var i = 0; i < subStrs.Length; i++)
{
   var s = subStrs[i];
   // compare by position, not by value, so a repeated first word is still capitalized
   if (i > 0 && s.Length > 0)
      output += char.ToUpper(s[0]) + s.Substring(1);
   else
      output += s;
}

This should produce "hiMyNameIsRony" as the output.

mahfuz01
1

String::Split definitely is one of my pet peeves. Also, none of the other answers deal with:

  • Cultures
  • All forms of word separators
  • Numbers
  • What happens when the input starts with word separators

I tried to get it as close as possible to what you would find in base class library code:

static string ToCamelCaseInvariant(string value) { return ToCamelCase(value, true, CultureInfo.InvariantCulture); }
static string ToCamelCaseInvariant(string value, bool changeWordCaps) { return ToCamelCase(value, changeWordCaps, CultureInfo.InvariantCulture); }

static string ToCamelCase(string value) { return ToCamelCase(value, true, CultureInfo.CurrentCulture); }
static string ToCamelCase(string value, bool changeWordCaps) { return ToCamelCase(value, changeWordCaps, CultureInfo.CurrentCulture); }

/// <summary>
/// Converts the given string value into camelCase.
/// </summary>
/// <param name="value">The value.</param>
/// <param name="changeWordCaps">If set to <c>true</c> letters in a word (apart from the first) will be lowercased.</param>
/// <param name="culture">The culture to use to change the case of the characters.</param>
/// <returns>
/// The camel case value.
/// </returns>
static string ToCamelCase(string value, bool changeWordCaps, CultureInfo culture)
{
    if (culture == null)
        throw new ArgumentNullException("culture");
    if (string.IsNullOrEmpty(value))
        return value;

    var result = new StringBuilder(value.Length);
    var lastWasBreak = true;
    for (var i = 0; i < value.Length; i++)
    {
        var c = value[i];
        if (char.IsWhiteSpace(c) || char.IsPunctuation(c) || char.IsSeparator(c))
        {
            lastWasBreak = true;
        }
        else if (char.IsNumber(c))
        {
            result.Append(c);
            lastWasBreak = true;
        }
        else
        {
            if (result.Length == 0)
            {
                result.Append(char.ToLower(c, culture));
            }
            else if (lastWasBreak)
            {
                result.Append(char.ToUpper(c, culture));
            }
            else if (changeWordCaps)
            {
                result.Append(char.ToLower(c, culture));
            }
            else
            {
                result.Append(c);
            }

            lastWasBreak = false;
        }
    }

    return result.ToString();
}

// Tests
'  This is a test. 12345hello world' = 'thisIsATest12345HelloWorld'
'--north korea' = 'northKorea'
'!nOrTH koreA' = 'northKorea'
'System.Console.' = 'systemConsole'
Jonathan Dickinson
0
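string toCamelCase(string s)
{
    // empty and one-character strings: just lowercase what is there
    if (s.Length < 2) return s.ToLowerInvariant();
    return Char.ToLowerInvariant(s[0]) + s.Substring(1);
}

Similar to Paolo Falabella's code, but it survives empty strings and one-character strings. Note that it only lowercases the first character; spaces still have to be removed separately.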

citykid