43

I have a variable name, say "WARD_VS_VITAL_SIGNS", and I want to convert it to Pascal case format: "WardVsVitalSigns"

WARD_VS_VITAL_SIGNS -> WardVsVitalSigns

How can I make this conversion?

doppelgreener
wlz

11 Answers

81

You do not need a regular expression for that.

var yourString = "WARD_VS_VITAL_SIGNS".ToLower().Replace("_", " ");
TextInfo info = CultureInfo.CurrentCulture.TextInfo;
yourString = info.ToTitleCase(yourString).Replace(" ", string.Empty);
Console.WriteLine(yourString);
Quality Catalyst
Nilesh
45

Here is my quick LINQ & regex solution to save someone's time:

using System;
using System.Linq;
using System.Text.RegularExpressions;

public string ToPascalCase(string original)
{
    Regex invalidCharsRgx = new Regex("[^_a-zA-Z0-9]");
    Regex whiteSpace = new Regex(@"(?<=\s)");
    Regex startsWithLowerCaseChar = new Regex("^[a-z]");
    Regex firstCharFollowedByUpperCasesOnly = new Regex("(?<=[A-Z])[A-Z0-9]+$");
    Regex lowerCaseNextToNumber = new Regex("(?<=[0-9])[a-z]");
    Regex upperCaseInside = new Regex("(?<=[A-Z])[A-Z]+?((?=[A-Z][a-z])|(?=[0-9]))");

    // insert an underscore after each whitespace character, then remove all invalid characters (including the whitespace itself)
    var pascalCase = invalidCharsRgx.Replace(whiteSpace.Replace(original, "_"), string.Empty)
        // split by underscores
        .Split(new char[] { '_' }, StringSplitOptions.RemoveEmptyEntries)
        // set first letter to uppercase
        .Select(w => startsWithLowerCaseChar.Replace(w, m => m.Value.ToUpper()))
        // replace second and all following upper case letters to lower if there is no next lower (ABC -> Abc)
        .Select(w => firstCharFollowedByUpperCasesOnly.Replace(w, m => m.Value.ToLower()))
        // set upper case the first lower case following a number (Ab9cd -> Ab9Cd)
        .Select(w => lowerCaseNextToNumber.Replace(w, m => m.Value.ToUpper()))
        // lowercase the second and following uppercase letters in a run, except the last one when it is followed by a lowercase letter (ABCDef -> AbcDef)
        .Select(w => upperCaseInside.Replace(w, m => m.Value.ToLower()));

    return string.Concat(pascalCase);
}

Example output:

"WARD_VS_VITAL_SIGNS"          "WardVsVitalSigns"
"Who am I?"                    "WhoAmI"
"I ate before you got here"    "IAteBeforeYouGotHere"
"Hello|Who|Am|I?"              "HelloWhoAmI"
"Live long and prosper"        "LiveLongAndProsper"
"Lorem ipsum dolor..."         "LoremIpsumDolor"
"CoolSP"                       "CoolSp"
"AB9CD"                        "Ab9Cd"
"CCCTrigger"                   "CccTrigger"
"CIRC"                         "Circ"
"ID_SOME"                      "IdSome"
"ID_SomeOther"                 "IdSomeOther"
"ID_SOMEOther"                 "IdSomeOther"
"CCC_SOME_2Phases"             "CccSome2Phases"
"AlreadyGoodPascalCase"        "AlreadyGoodPascalCase"
"999 999 99 9 "                "999999999"
"1 2 3 "                       "123"
"1 AB cd EFDDD 8"              "1AbCdEfddd8"
"INVALID VALUE AND _2THINGS"   "InvalidValueAnd2Things"
chviLadislav
  • 4
    This answer does not have enough upvotes. Nice utility. Thanks! – Mihir Jul 19 '18 at 07:06
  • yeah, this or some other efficient version (not sure if it's possible to implement it without regex) should be part of .NET – xhafan Apr 25 '19 at 12:32
  • This answer is too complex to be considered quick. Also there is not a true understanding of how to handle character classes in regular expressions in an efficient way. I show how to do such a replacement more efficiently in my answer below. – ΩmegaMan Oct 20 '21 at 20:55
23

First off, you are asking for title case and not camel-case, because in camel-case the first letter of the word is lowercase and your example shows you want the first letter to be uppercase.

At any rate, here is how you could achieve your desired result:

string textToChange = "WARD_VS_VITAL_SIGNS";
System.Text.StringBuilder resultBuilder = new System.Text.StringBuilder();

foreach(char c in textToChange)
{
    // Replace anything, but letters and digits, with space
    if(!Char.IsLetterOrDigit(c))
    {
        resultBuilder.Append(" ");
    }
    else 
    { 
        resultBuilder.Append(c); 
    }
}

string result = resultBuilder.ToString();

// Make result string all lowercase, because ToTitleCase does not change all uppercase correctly
result = result.ToLower();

// Creates a TextInfo based on the "en-US" culture.
TextInfo myTI = new CultureInfo("en-US",false).TextInfo;

result = myTI.ToTitleCase(result).Replace(" ", String.Empty);

Note: result is now WardVsVitalSigns.

If you did, in fact, want camel-case, then after all of the above, just use this helper function:

public string LowercaseFirst(string s)
{
    if (string.IsNullOrEmpty(s))
    {
        return string.Empty;
    }

    char[] a = s.ToCharArray();
    a[0] = char.ToLower(a[0]);

    return new string(a);
}

So you could call it, like this:

result = LowercaseFirst(result);
Karl Anderson
  • Why does this not make result = Wardvsvitalsigns? – William Melani Sep 05 '13 at 03:32
  • @KarlAnderson `if(!Char.IsLetterOrDigit(c)) { resultBuilder.Append(" "); }` should be `if (!Char.IsLetterOrDigit(c)) { resultBuilder.Append(" "); } else { resultBuilder.Append(c); }`, otherwise , resultBuilder is always empty . ;-) – wlz Sep 05 '13 at 04:14
  • 5
    I think technically it's called Pascal case (or Upper camel case) – Benjol Sep 05 '13 at 04:40
  • @Benjol Thank you for reminding , I will update the question . – wlz Sep 05 '13 at 05:49
  • The result is not Title Case since it removes spaces and probably doesn't have rules for "a", "the", "and", etc. – Zorgarath Sep 10 '20 at 19:37
12

Single semicolon solution:

public static string PascalCase(this string word)
{
    return string.Join("" , word.Split('_')
                 .Select(w => w.Trim())
                 .Where(w => w.Length > 0)
                 .Select(w => w.Substring(0,1).ToUpper() + w.Substring(1).ToLower()));
}
WhiteleyJ
  • 1
    Not sure what this was intending to be but this certainly does not result in PascalCase. PascalCase doesn't contain spaces... – MgSam Mar 19 '20 at 20:12
  • This is joining every bit of the text, don't see empty spaces – DanielV Mar 16 '22 at 11:56
  • It's been 5 years since I wrote this but I'd agree, don't really know what the spaces problem was. It's splitting on the underscore, trimming the spaces off, filtering out any zero length strings, then uppercasing the first and lowercasing the rest, then using string.join to put everything back together (should use string builder underneath). Meh! – WhiteleyJ Mar 17 '22 at 14:11
  • @WhiteleyJ FYI, The spaces problem would occur if `word` contains spaces instead of `_`. Since `Trim()` only removes spaces before and after the split word – Duckdoom5 Nov 18 '22 at 15:55
9

An extension method for System.String, compatible with .NET Core, that uses only System and System.Linq.

Does not modify the original string.

.NET Fiddle for the code below

using System;
using System.Linq;

public static class StringExtensions
{
    /// <summary>
    /// Converts a string to PascalCase
    /// </summary>
    /// <param name="str">String to convert</param>

    public static string ToPascalCase(this string str){

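        // A null input has nothing to convert; return it as-is instead of letting string.Join throw.
        if (str == null) return null;

        // Replace all non-letter and non-digits with an underscore and lowercase the rest.
        string sample = string.Join("", str.Select(c => Char.IsLetterOrDigit(c) ? c.ToString().ToLower() : "_").ToArray());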

        // Split the resulting string by underscore
        // Select first character, uppercase it and concatenate with the rest of the string
        var arr = sample?
            .Split(new []{'_'}, StringSplitOptions.RemoveEmptyEntries)
            .Select(s => $"{s.Substring(0, 1).ToUpper()}{s.Substring(1)}");

        // Join the resulting collection
        sample = string.Join("", arr);

        return sample;
    }
}

public class Program
{
    public static void Main()
    {
        Console.WriteLine("WARD_VS_VITAL_SIGNS".ToPascalCase()); // WardVsVitalSigns
        Console.WriteLine("Who am I?".ToPascalCase()); // WhoAmI
        Console.WriteLine("I ate before you got here".ToPascalCase()); // IAteBeforeYouGotHere
        Console.WriteLine("Hello|Who|Am|I?".ToPascalCase()); // HelloWhoAmI
        Console.WriteLine("Live long and prosper".ToPascalCase()); // LiveLongAndProsper
        Console.WriteLine("Lorem ipsum dolor sit amet, consectetur adipiscing elit.".ToPascalCase()); // LoremIpsumDolorSitAmetConsecteturAdipiscingElit
    }
}
Jani Hyytiäinen
2
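// Split on underscores; RemoveEmptyEntries avoids indexing into an empty segment.
var xs = "WARD_VS_VITAL_SIGNS".Split(new[] { '_' }, StringSplitOptions.RemoveEmptyEntries);

var q =
    from x in xs
    let first_char = char.ToUpper(x[0])
    let rest_chars = new string(x.Skip(1).Select(c => char.ToLower(c)).ToArray())
    select first_char + rest_chars;

// Join the capitalized segments into the final identifier.
var result = string.Concat(q); // "WardVsVitalSigns"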
Rodrick Chapman
2

Some answers lowercase the text first, and that step is needed: ToTitleCase leaves words that are entirely uppercase unchanged (it treats them as acronyms), so the input has to be lowercased before it is title-cased:

var text = "WARD_VS_VITAL_SIGNS".ToLower().Replace("_", " ");

TextInfo textInfo = CultureInfo.CurrentCulture.TextInfo;
text = textInfo.ToTitleCase(text).Replace(" ", string.Empty);

Console.WriteLine(text);
Dr TJ
  • Because ToTitleCase is not efficient in many use cases, see this [answer](https://stackoverflow.com/a/46095771/1659999) in current page. – Fawad Raza Nov 26 '19 at 09:32
  • ToTitleCase doesnt make other characters lowercase, at least, not in .NET Core 3.1 where i just needed it. So had to do a ToLower first, then it was correct. – Viezevingertjes Feb 17 '20 at 15:50
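The behaviour described in the comment above can be checked with a small sketch. It assumes `CultureInfo.InvariantCulture` (chosen here only to keep the output independent of the machine's culture) and relies on the documented rule that `ToTitleCase` leaves a word that is entirely uppercase unchanged.

```csharp
using System;
using System.Globalization;

public static class TitleCaseDemo
{
    public static void Main()
    {
        // InvariantCulture is an assumption made for this sketch, not something the answers use.
        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;

        // An all-uppercase word is treated as an acronym and left as-is.
        Console.WriteLine(textInfo.ToTitleCase("WARD VS VITAL SIGNS")); // WARD VS VITAL SIGNS

        // Lowercasing first lets ToTitleCase capitalize each word.
        Console.WriteLine(textInfo.ToTitleCase("ward vs vital signs")); // Ward Vs Vital Signs
    }
}
```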
2

You can use this:

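// Requires: using System; using System.Text;
public static string ConvertToPascal(string underScoreString)
{
    // RemoveEmptyEntries skips empty segments (e.g. from "A__B"),
    // which would otherwise make Substring(0, 1) throw.
    string[] words = underScoreString.Split(new[] { '_' }, StringSplitOptions.RemoveEmptyEntries);

    StringBuilder returnStr = new StringBuilder();

    foreach (string wrd in words)
    {
        // Uppercase the first character, lowercase the rest.
        returnStr.Append(wrd.Substring(0, 1).ToUpper());
        returnStr.Append(wrd.Substring(1).ToLower());
    }
    return returnStr.ToString();
}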
Rajesh Kumar
2

This answer relies on Unicode categories, which a regular expression can use to recognize connecting characters such as _. In regex syntax, \p matches a Unicode category and {Pc} names the category "Punctuation, Connector", which gives \p{Pc}. Note that - (category Pd) and | (category Sm) are not part of Pc. A MatchEvaluator is then invoked once for each match.

During matching, each word is captured and any trailing connector characters are consumed, so the replace operation removes the connectors. Each captured word is lowercased, and then only its first character is uppercased before it is returned as the replacement:

public static class StringExtensions
{
    public static string ToPascalCase(this string initial)
        => Regex.Replace(initial, 
                       // Capture a run of non-connector characters, then consume any connector punctuation
                         @"([^\p{Pc}]+)[\p{Pc}]*", 
                         new MatchEvaluator(mtch =>
        {
            var word = mtch.Groups[1].Value.ToLower();

            return $"{Char.ToUpper(word[0])}{word.Substring(1)}";
        }));
}

Usage:

"TOO_MUCH_BABY".ToPascalCase(); // TooMuchBaby
"HELLO|ITS|ME".ToPascalCase();  // Hello|its|me ('|' is category Sm, not Pc, so it is not treated as a separator)

See Word Character in Character Classes in Regular Expressions

Pc Punctuation, Connector. This category includes ten characters, the most commonly used of which is the LOWLINE character (_), u+005F.
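To see which characters actually fall into Pc, the categories can be printed with `char.GetUnicodeCategory`. The sketch below also includes `ToPascalCaseAnySeparator`, a hypothetical variant that is not part of this answer: it assumes you want every punctuation, symbol, or whitespace character (`\p{P}`, `\p{S}`, `\s`) to act as a separator. Leading separators are not consumed by this pattern, so it is only a starting point.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CategoryDemo
{
    // Hypothetical variant: treats any punctuation, symbol, or whitespace
    // character as a word separator instead of only \p{Pc}.
    public static string ToPascalCaseAnySeparator(string input)
        => Regex.Replace(input, @"([^\p{P}\p{S}\s]+)[\p{P}\p{S}\s]*", m =>
        {
            var word = m.Groups[1].Value.ToLowerInvariant();
            return char.ToUpperInvariant(word[0]) + word.Substring(1);
        });

    public static void Main()
    {
        // Prints ConnectorPunctuation, DashPunctuation, and MathSymbol respectively.
        foreach (char c in "_-|")
            Console.WriteLine($"'{c}' is {char.GetUnicodeCategory(c)}");

        Console.WriteLine(ToPascalCaseAnySeparator("HELLO|ITS|ME"));  // HelloItsMe
        Console.WriteLine(ToPascalCaseAnySeparator("too-much_baby")); // TooMuchBaby
    }
}
```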

ΩmegaMan
2

If you want to convert any formatted string to Pascal case, you can do this:

    public static string ToPascalCase(this string original)
    {
        string newString = string.Empty;
        bool makeNextCharacterUpper = false;
        for (int index = 0; index < original.Length; index++)
        {
            char c = original[index];
            if(index == 0)
                newString += $"{char.ToUpper(c)}";
            else if (makeNextCharacterUpper)
            {
                newString += $"{char.ToUpper(c)}";
                makeNextCharacterUpper = false;
            }
            else if (char.IsUpper(c))
                newString += $" {c}";
            else if (char.IsLower(c) || char.IsNumber(c))
                newString += c;
            else
            {
                makeNextCharacterUpper = true;   
                newString += ' ';
            }
        }

        return newString.TrimStart().Replace(" ", "");
    }

Tested with the strings `I|Can|Get|A|String`, `ICan_GetAString`, `i-can-get-a-string`, `i_can_get_a_string`, `I Can Get A String`, and `ICanGetAString`.

Tom McDonough
  • 1
    Apart from its generality, I prefer this approach because it's more efficient and arguably clearer than other suggestions. Its efficiency can be improved, though, by making newString a StringBuilder, and consistently appending individual characters (rather than sometimes strings). – Daniel Apr 23 '22 at 21:52
  • 1
    @Daniel Thanks, and I agree with using a string builder. Would be insightful to performance test between the two – Tom McDonough Apr 28 '22 at 09:58
  • As a heads-up, "`any formatted string`" is technically incorrect: it fails for `I|CAN|GET|A|STRING`, `I-CAN-GET-A-STRING`, etc. While I'd say it's not possible to handle an all-caps string with no delimiter, in the event of a delimiter being present it would be good to handle this case (since it does handle all lower-case cases). – Daevin May 04 '22 at 13:46
  • As a correction to above: it "does handle all-lower-case cases with a delimiter". It also fails with multiple non-' '-character delimiters (i.e. `i__can__get__a__string`). – Daevin May 04 '22 at 13:57
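The StringBuilder suggestion from the comments could be sketched roughly as follows. `ToPascalCaseFast` is a hypothetical name, and the logic is a simplified take on the same idea rather than a drop-in copy of the answer's code: any character that is not a letter or digit is assumed to be a delimiter, which also covers repeated delimiters. Like the original, it keeps the existing case of the remaining letters, so all-caps input such as `I|CAN|GET|A|STRING` is still not lowercased.

```csharp
using System;
using System.Text;

public static class PascalCaseSketch
{
    // Hypothetical helper: appends to a StringBuilder instead of concatenating strings.
    public static string ToPascalCaseFast(string original)
    {
        if (string.IsNullOrEmpty(original)) return string.Empty;

        var sb = new StringBuilder(original.Length);
        bool upperNext = true; // the first kept character is always capitalized

        foreach (char c in original)
        {
            if (!char.IsLetterOrDigit(c))
            {
                // Any delimiter, or run of delimiters, starts a new word.
                upperNext = true;
            }
            else if (upperNext)
            {
                sb.Append(char.ToUpperInvariant(c));
                upperNext = false;
            }
            else
            {
                // Keep the existing case so "ICanGetAString" passes through unchanged.
                sb.Append(c);
            }
        }

        return sb.ToString();
    }
}
```

For example, `PascalCaseSketch.ToPascalCaseFast("i__can__get__a__string")` returns `ICanGetAString`.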
1

I found this gist useful after adding a ToLower() to it.

"WARD_VS_VITAL_SIGNS"
.ToLower()
.Split(new [] {"_"}, StringSplitOptions.RemoveEmptyEntries)
.Select(s => char.ToUpperInvariant(s[0]) + s.Substring(1, s.Length - 1))
.Aggregate(string.Empty, (s1, s2) => s1 + s2)
Homer