1

I'm working on a program in which the user inputs some data, for example:

222, "test", 2 + 2

I have to split this string on the ',' character into an array, so until now I was using this method:

string[] parameters = userInput.Split (',');

But now it occurred to me: what if the user inputs something like this?

345, "test ,,,,,, ,,,,, ,,,,", 89

In my project, commas inside a value are only allowed within quotes.

What is the fastest way to split that string into an array, keeping that problem in mind?

EDIT: This is not about parsing a CSV file.

EDIT 2:

The expected result is {"345", "\"test ,,,,,, ,,,,, ,,,,\"", "89"}, i.e. 3 elements in the array.

Cœur
107MP

4 Answers

1

EDIT 2

Assuming that you always want to return a constant number of parameters (a number, a non-numeric value, and another number), you may be interested in the Regex.Split method.

var parameters = Regex.Split(userInput, @"^(?<first>\d+), (?<second>\D+), (?<third>\d+)$",
                                    RegexOptions.ExplicitCapture)
                            .Where(a=>a!=string.Empty)
                            .ToList();

The code above returns a List<string> containing three elements: 345, "test ,,,,,, ,,,,, ,,,,", and 89.

EDIT 3

If you want to return an array, replace the code above with:

string[] parameters = Regex.Split(userInput, @"^(?<first>\d+), (?<second>\D+), (?<third>\d+)$",
                                    RegexOptions.ExplicitCapture)
                            .Where(a=>a!=string.Empty)
                            .ToArray();

Thank you, Lasse V. Karlsen, for your valuable comment.

Community
Maciej Los
1

The OP added EDIT 2 after I posted this, so the code below does not keep the quotes in the output.
I will leave that as an exercise for the OP.

// Requires: using System; using System.Collections.Generic; using System.Text;
// `input` is assumed to hold the user's string.
bool inQuote = false;
List<string> words = new List<string>();
StringBuilder sb = new StringBuilder();
foreach (char c in input)
{
   if (c == '"')
   {
      // A quote toggles quoted mode; the quote character itself is dropped.
      inQuote = !inQuote;
      continue;
   }
   if (c == ',' && !inQuote)
   {
      // A comma outside quotes ends the current word; empty words are skipped.
      if (!String.IsNullOrWhiteSpace(sb.ToString()))
         words.Add(sb.ToString().Trim());
      sb.Clear();
      continue;
   }
   sb.Append(c);
}
// Flush the last word, which has no trailing comma.
if (!String.IsNullOrWhiteSpace(sb.ToString()))
   words.Add(sb.ToString().Trim());
sb.Clear();
// Print the words, each wrapped in quotes and separated by ", ".
foreach (string s in words)
{
   if (sb.Length > 0)
      sb.Append(", ");
   sb.Append("\"" + s + "\""); // in a regular string literal, \" is an escaped quote
}
Console.WriteLine(sb.ToString());

You need to deal with edge cases like:

, sdlf"aslkd"
, sdlf"aslkd ,
What about a character c when neither is open?

This is too much for Split or Regex when you consider all possibilities.

paparazzo
0

I've implemented something like this by looping over the string. What you need is a flag that indicates whether you are within a quoted string or not.

When you are not within a quoted string and encounter a comma, you cut everything up to the current position into a new entry of the result list.

When you encounter a quote outside a quoted string, set the flag.

When the flag is set you ignore all commas. When you encounter another quote, reset the flag.

That's the algorithm roughly.
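
The steps above can be sketched roughly as follows. This is only a minimal illustration, not a tested production parser: the helper name SplitOutsideQuotes is made up for this example, quotes are kept in the output (as EDIT 2 of the question asks), each field is trimmed, and escaped quotes or unbalanced quotes are not handled.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: splits on commas that are outside double quotes.
// Quote characters stay in the output; each field is trimmed.
static string[] SplitOutsideQuotes(string input)
{
    var result = new List<string>();
    bool inQuotes = false;   // the flag described above
    int start = 0;           // index where the current field begins

    for (int i = 0; i < input.Length; i++)
    {
        char c = input[i];
        if (c == '"')
        {
            // Entering or leaving a quoted string.
            inQuotes = !inQuotes;
        }
        else if (c == ',' && !inQuotes)
        {
            // Comma outside quotes: cut everything up to here into a new entry.
            result.Add(input.Substring(start, i - start).Trim());
            start = i + 1;
        }
    }

    // The last field has no trailing comma.
    result.Add(input.Substring(start).Trim());
    return result.ToArray();
}

string[] parameters = SplitOutsideQuotes("345, \"test ,,,,,, ,,,,, ,,,,\", 89");
foreach (string p in parameters)
    Console.WriteLine(p);
```

With the sample input from the question this should yield three entries: 345, "test ,,,,,, ,,,,, ,,,," (quotes included), and 89.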

That said, you can take a look at the Microsoft.VisualBasic.FileIO.TextFieldParser class, which might already do what you need. Don't worry, you can use it in C#, too, despite the namespace.
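
A rough sketch of how TextFieldParser could be used on an in-memory string is shown below. It assumes the project can reference the Microsoft.VisualBasic assembly, and the variable names are only illustrative. Note that, as far as I know, the parser strips the enclosing quotes from quoted fields, so the result would differ from the array requested in EDIT 2 of the question.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

string userInput = "345, \"test ,,,,,, ,,,,, ,,,,\", 89";

// TextFieldParser accepts any TextReader, so a StringReader works for a single line.
using var parser = new TextFieldParser(new StringReader(userInput));
parser.TextFieldType = FieldType.Delimited;
parser.SetDelimiters(",");
parser.HasFieldsEnclosedInQuotes = true; // commas inside quotes are not treated as delimiters

string[] fields = parser.ReadFields();
foreach (string f in fields)
    Console.WriteLine(f);
```

Whether this is fast enough compared with a hand-written loop depends on the input sizes, so it is worth measuring before committing to either approach.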

Thorsten Dittmar
0

If the order of the elements doesn't matter:

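using System;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        string data = "345, \"test ,,,,,, ,,,,, ,,,,\", 89";

        // Extract every quoted value (quotes included).
        string[] quoteValues = GetValueInQuote(data);

        // Cut the quoted values out, leaving only the unquoted text.
        string[] result = data.Split(quoteValues, StringSplitOptions.RemoveEmptyEntries);

        // Rejoin the unquoted text, drop spaces, and split it on commas.
        result = string.Join(string.Empty, result)
            .Replace(" ", string.Empty)
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);

        // Append the quoted values; the original order is not preserved.
        result = result.Concat(quoteValues).ToArray();
    }

    static string[] GetValueInQuote(string data)
    {
        int quoteCount = data.Count(c => c == '"');

        if (quoteCount % 2 == 1)
            throw new Exception("an odd number of quotes");

        string[] result = new string[quoteCount / 2];
        int searchFrom = 0;

        for (int i = 0; i < result.Length; i++)
        {
            // Search from the end of the previous pair so each pair is found once.
            int first = data.IndexOf('"', searchFrom);
            int second = data.IndexOf('"', first + 1);

            result[i] = data.Substring(first, second - first + 1);
            searchFrom = second + 1;
        }

        return result;
    }
}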
Nikolay Fedorov