I have some sentences made up of words and digits. From each one I want to build a string that contains the first character of every word, every digit, and any word written entirely in uppercase letters. I tried a regex, but it does not return all the digits or the all-uppercase words.

My regex is on Regex101.

My attempt is on DotNetFiddle.

CODE:

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        List<string> list = new List<string> {"Freestyle steel","Freestyle Alloy","Trekking steel uk","Single speed","5 speed","15 speed","3 Speed internal gear with 55 coaster","MTB steel","Junior MTB"};
        foreach(string data in list)
        {
            string regex = @"(\b\w)|(\d+)";
            var matches = Regex.Matches(data, regex, RegexOptions.Multiline);
            string output = "";
            foreach(Match item in matches)
            {
                output += item.Groups[1];
            }
            Console.WriteLine(output);
        }
    }
}

Sample Input

Freestyle steel

Freestyle Alloy

Trekking steel uk

Single speed

5 speed

15 speed

3 Speed internal gear with 55 coaster

MTB steel

Junior MTB

Sample Output

Fs

FA

Tsu

Ss

5s

15s

3Sigw55c

MTBs

JMTB

csharpbd

3 Answers

The regex you may use is

@"[0-9]+|\b(?:\p{Lu}+\b|\w)"

Details:

  • [0-9]+ - one or more digits
  • | - or
  • \b - leading word boundary
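  • (?:\p{Lu}+\b|\w) - either one or more uppercase letters followed by a trailing word boundary (\p{Lu}+\b), which matches a word written entirely in uppercase, or a single word char (\w), which matches the first character of any other word.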

See this solution:

using System;
using System.Linq;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class Test
{
    public static void Main()
    {
        var regex = @"[0-9]+|\b(?:\p{Lu}+\b|\w)";
        var list = new List<string> {"Freestyle steel","Freestyle Alloy","Trekking steel uk","Single speed","5 speed","15 speed","3 Speed internal gear with 55 coaster","MTB steel","Junior MTB"};
        foreach(var data in list)
        {
            var matches = Regex.Matches(data, regex).Cast<Match>().Select(m => m.Value.ToUpper());
            Console.WriteLine(string.Join("", matches));
        }
    }
}

Output:

FS
FA
TSU
SS
5S
15S
3SIGW55C
MTBS
JMTB
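
The output above is uppercased because of the ToUpper() call, whereas the sample output in the question keeps the original casing (Fs, Tsu). A minimal sketch that keeps the same pattern and simply drops that call might look like this; the class name CasePreservingDemo and the helper Abbreviate are hypothetical names used only for illustration:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CasePreservingDemo
{
    // Same pattern as in the answer: digit runs, all-uppercase words,
    // or the first character of any other word.
    private static readonly Regex Pattern =
        new Regex(@"[0-9]+|\b(?:\p{Lu}+\b|\w)");

    // Hypothetical helper: joins the matches without changing their case.
    public static string Abbreviate(string input)
    {
        return string.Concat(Pattern.Matches(input).Cast<Match>().Select(m => m.Value));
    }

    public static void Main()
    {
        Console.WriteLine(Abbreviate("Freestyle steel"));                       // Fs
        Console.WriteLine(Abbreviate("15 speed"));                              // 15s
        Console.WriteLine(Abbreviate("3 Speed internal gear with 55 coaster")); // 3Sigw55c
        Console.WriteLine(Abbreviate("MTB steel"));                             // MTBs
    }
}
```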
Wiktor Stribiżew

You could go for

\d+|\b(?:[A-Z]+|\w)

See a demo on regex101.com.
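
As a rough sketch of how this pattern could be used from C# (the class name AsciiUpperDemo and the helper Abbreviate are hypothetical, not part of the answer): the matches are simply concatenated. Note that [A-Z]+ here is not followed by a word boundary, so a mixed-case word such as "ABc" contributes its whole leading uppercase run ("AB") rather than just its first character.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class AsciiUpperDemo
{
    // Digit runs, or at a word start: a run of ASCII uppercase letters,
    // falling back to a single word character.
    private static readonly Regex Pattern = new Regex(@"\d+|\b(?:[A-Z]+|\w)");

    // Hypothetical helper: joins all matches into one string.
    public static string Abbreviate(string input)
    {
        return string.Concat(Pattern.Matches(input).Cast<Match>().Select(m => m.Value));
    }

    public static void Main()
    {
        Console.WriteLine(Abbreviate("Trekking steel uk")); // Tsu
        Console.WriteLine(Abbreviate("Junior MTB"));        // JMTB
        Console.WriteLine(Abbreviate("5 speed"));           // 5s
        Console.WriteLine(Abbreviate("ABc"));               // AB
    }
}
```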

Jan

You can do it with a replacement:

string input = "3 Speed internal gear with 55 coaster";
string pattern = @"\B[a-z]+|\W+";
string replacement = "";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(input, replacement);

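The \B (a non-word boundary) ensures that the lowercase run matched by [a-z]+ is preceded by a word character, so the first character of each word is never removed, and \W+ matches runs of non-word characters such as spaces. Digits and uppercase letters are not matched by either alternative, so they stay in the result.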

demo
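
To check the replacement against more of the question's inputs, a small sketch along these lines could be used; the class name ReplaceDemo and the helper Abbreviate are hypothetical names, and the pattern is the one from the snippet above:

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplaceDemo
{
    // Removes lowercase runs that are not at the start of a word,
    // and runs of non-word characters (spaces, punctuation).
    private static readonly Regex Pattern = new Regex(@"\B[a-z]+|\W+");

    // Hypothetical helper: whatever is left after the removal is the result.
    public static string Abbreviate(string input)
    {
        return Pattern.Replace(input, "");
    }

    public static void Main()
    {
        string[] samples =
        {
            "Freestyle steel", "Freestyle Alloy", "15 speed",
            "3 Speed internal gear with 55 coaster", "MTB steel"
        };

        // Prints: Fs, FA, 15s, 3Sigw55c, MTBs (one per line)
        foreach (string sample in samples)
        {
            Console.WriteLine(Abbreviate(sample));
        }
    }
}
```

Unlike the match-based approaches, this one keeps every character it does not explicitly remove, so any uppercase letter inside a mixed-case word would also survive.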

Casimir et Hippolyte