0

Here is the example line:

(((EXAMPLE_WORD1 - EXAMPLE_WORD2)/EXAMPLE_WORD2) * 100)

I want to split the above line as shown below:

(
(
(
EXAMPLE_WORD1
-
EXAMPLE_WORD2
)
/
EXAMPLE_WORD2
)
*
100
)

How can I do the above task in C# code?

Cœur
Pon

5 Answers

3

You can do something like this:

string str = "(((EXAMPLE_WORD1 - EXAMPLE_WORD2)/EXAMPLE_WORD2) * 100)";
string[] arr = str.Split(new char[] { '/', '*', '(', ')' }, StringSplitOptions.RemoveEmptyEntries);

UPDATE1: In the previous solution, the separator characters are removed from arr. A better solution may be this:

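string str = "(((EXAMPLE_WORD1 - EXAMPLE_WORD2)/EXAMPLE_WORD2) * 100)";
// Wrap each symbol in '#' markers so that splitting on '#' keeps the symbol as its own entry.
str = str.Replace("(", "#(#").Replace("/", "#/#").Replace(")", "#)#").Replace("*", "#*#");
string[] arr = str.Split(new char[] { '#' }, StringSplitOptions.RemoveEmptyEntries);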

These solutions are only ideas and I did not test them. Edit them to get a better result.

Ali Foroughi
1

This seems to work:

var regex = new Regex(@"(?=(\b|[^a-zA-Z_0-9])+)");
var split = regex.Split("(((EXAMPLE_WORD1 - EXAMPLE_WORD2)/EXAMPLE_WORD2) * 100)");

EDIT: Works now :)

Dave Bish
1

If what you want is a generic word splitter based on several rules, then it is not a trivial task. First you need to define what a word is for you, for example:

  • a word is a series of letters (a-zA-Z) with an acceptable separator symbol ('_')
  • a word is a single symbol ('(', ')', '-', '*')
  • a word is a series of digits with or without an acceptable separator symbol (',' or '.', depending on culture)

and so on

Only after you have defined strict rules for what should be treated as a word should you start coding.
If this is the case, you can read about finite automata or something similar, depending on the complexity of your task.
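As a rough illustration of turning such rules into code, here is a minimal sketch that writes each rule as one alternative of a regular expression and collects the matches in order. The class name `RuleBasedTokenizer`, the exact pattern, and the choice to allow digits inside a word (so that `EXAMPLE_WORD1` stays in one piece) are assumptions made for this example, not something the question specifies. Whitespace and any character that matches none of the rules are silently skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class RuleBasedTokenizer
{
    // Hypothetical pattern, one alternative per rule:
    //   [A-Za-z_][A-Za-z_0-9]*   a word: letters and '_' (digits allowed after the first character, an assumption)
    //   \d+(?:[.,]\d+)*          a number, optionally with '.' or ',' separators
    //   [()\-*/+]                a single symbol
    static readonly Regex TokenPattern = new Regex(
        @"[A-Za-z_][A-Za-z_0-9]*|\d+(?:[.,]\d+)*|[()\-*/+]");

    public static List<string> Tokenize(string input)
    {
        var tokens = new List<string>();
        // Matches() scans left to right, so tokens come out in source order.
        foreach (Match m in TokenPattern.Matches(input))
        {
            tokens.Add(m.Value);
        }
        return tokens;
    }
}
```

Calling `RuleBasedTokenizer.Tokenize` on the line from the question should yield the thirteen tokens listed there, one per element. Changing a rule then means changing one alternative of the pattern rather than the surrounding code.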

EDIT: if the provided pattern is all you need, then the link provided by Bert Evans is the answer to your question, namely this Regex pattern:

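string yourString = @"(((EXAMPLE_WORD1 - EXAMPLE_WORD2)/EXAMPLE_WORD2) * 100)";
// The hyphen is escaped so that it is a literal '-' rather than a range inside the character class.
string[] parts = Regex.Split(yourString, @"(?<=[()\-/*])");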
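A lookbehind split only cuts after each symbol, so an operand stays attached to the symbol that follows it (for example `EXAMPLE_WORD1 -`). As a possible variation, `Regex.Split` in .NET also returns the text of any capturing group in the pattern, so capturing the symbol keeps it as its own entry. The sketch below assumes that only `( ) - / *` count as symbols; the names `CaptureSplit` and `SplitKeepingSymbols` are made up for this example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class CaptureSplit
{
    public static string[] SplitKeepingSymbols(string input)
    {
        // The capturing group puts each symbol into the result array;
        // the \s* on both sides swallows the whitespace around it.
        return Regex.Split(input, @"\s*([()\-/*])\s*")
                    // Adjacent symbols leave empty strings between them; drop those.
                    .Where(part => part.Length > 0)
                    .ToArray();
    }
}
```

Writing each element of the returned array on its own line should reproduce the layout asked for in the question.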
Nogard
0

I extended Ali's answer to get the exact output:

(
(
(
EXAMPLE_WORD1
-
EXAMPLE_WORD2
)
/
EXAMPLE_WORD2
)

*
100
)



    string str = "(((EXAMPLE_WORD1 - EXAMPLE_WORD2)/EXAMPLE_WORD2) * 100)";
    str = str.Replace("(", "{(}");
    str = str.Replace("*", "{*}");
    str = str.Replace(")", "{)}");
    str = str.Replace("/", "{/}");
    str = str.Replace("-", "{-}");
    string[] arr = str.Split(new char[] { '{', '}' }, StringSplitOptions.RemoveEmptyEntries);
    foreach (string strs in arr)
    {
         Console.WriteLine(strs.Trim());
    }
Naresh
  • First: the author actually needs the `/` symbol to be printed. Second: performance looks poor even on this sample; for real-life applications with massive arrays to parse, this could take ages – Nogard Dec 24 '12 at 10:44
  • Corrected the answer. You are right, for a huge application we need to use a regex – Naresh Dec 24 '12 at 10:50
0

While a regex will get you there, you may want to consider specifying a tokenizer of sorts for flexibility and/or scalability:

Here's a naive example:

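using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;

static class Program
{
    static IEnumerable<string> Tokenize(string str)
    {
        var sb = new StringBuilder();
        foreach (var c in str)
        {
            if (char.IsLetterOrDigit(c) || c == '_')
            {
                // Part of a word or number: keep accumulating.
                sb.Append(c);
                continue;
            }
            // Any other character, including whitespace, ends the current word.
            if (sb.Length > 0)
            {
                yield return sb.ToString();
                sb.Clear();
            }
            // Punctuation such as ( ) - / * is a token of its own; everything else is skipped.
            if (char.IsPunctuation(c))
            {
                yield return c.ToString(CultureInfo.InvariantCulture);
            }
        }
        // Flush a word that runs to the end of the input.
        if (sb.Length > 0) yield return sb.ToString();
    }

    static void Main(string[] args)
    {
        const string st = "(((EXAMPLE_WORD1 - EXAMPLE_WORD2)/EXAMPLE_WORD2) * 100)";
        Tokenize(st).ToList().ForEach(Console.WriteLine);
    }
}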
Anthill