-1

I am parsing a text file from some old hardware and I need to extract an array of strings from one long string.

  • Elements are comma-separated
  • Each individual string is enclosed in single quotes

'Hello', 'World'

{"'Hello'", "'World'"}

Single quotes for any other reason are illegal, and there is no escape character; you cannot write the word "Bob's", nor can you write something like "Bob\'s".

Ordinarily it would be easy to get the individual elements by splitting on the comma:

string testString = "'Hello', 'World'";
ArrayList result = new ArrayList(testString.Split(','));

But I can't do this: if a comma is between single quotes, it is text, not a separator.

This is one element:

'Hello, World'

{"'Hello, World'"}

How can I extract the elements while checking whether each comma is between single quotes?

'Hello', 'World', 'Hello, World'

{"'Hello'", "'World'", "'Hello, World'"}

One more detail: I cannot guarantee the amount of whitespace between elements:

'Hello',     'World',  'Hello, World'

P.S. Here is the same question I asked for Swift: "Swift: Split comma-separated strings, but not if comma enclosed in single quotes"

MH175
  • Have you considered using a RegEx? – Max Dec 06 '17 at 02:35
  • Thanks. Yes, I'm just not good at it. – MH175 Dec 06 '17 at 02:39
  • Split on the single quote (`'\''`) instead; the tokens you get will then alternate between text enclosed in quotes and the separators in between (consisting of commas and whitespace). – lamandy Dec 06 '17 at 03:07
  • @MH175, just do the filtering; the actual tokens will be at alternating indices, starting from 1 I believe, as the 0th position holds whatever unrelated text comes before the first '. – lamandy Dec 06 '17 at 03:13
  • @MH175, I just read the question in your Swift link; the idea is similar to the one there: you tokenize by single quote, then remove the tokens that consist of a single comma with whitespace. – lamandy Dec 06 '17 at 03:17
  • Thanks. Sorry I'm very rusty on C#. How can I remove the comma elements regardless of the amount of whitespace surrounding them? – MH175 Dec 06 '17 at 03:20
  • @MH175 Is there any chance that an element will contain a single quote? How would the single quote be escaped inside an element, e.g. `'I don\'t know'`? – Evan Huang Dec 06 '17 at 03:28
  • It sounds like you should be using an [actual CSV parser](http://ctl-global.github.io/data.html). – Cory Nelson Dec 06 '17 at 03:31
  • `string.Replace()` comes to mind first, then `string.Split()`; both can be done in a single line. – MethodMan Dec 06 '17 at 04:02

3 Answers

2

You have not answered whether the strings can contain embedded single quotes and, if so, how they would be escaped.

If every string follows the pattern of [SingleQuote][Text][SingleQuote], here's a RegEx that will do what you need:

'[^']+'

If you have empty strings in the single quotes, use:

'[^']*'
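
As a minimal sketch of how this pattern might be applied in C#, the following collects every match into an array. The class and method names (`QuotedListParser`, `ExtractElements`) are hypothetical, chosen only for illustration, and the sketch assumes, as the question states, that a single quote never appears inside an element.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuotedListParser
{
    // Hypothetical helper: returns each element including its surrounding quotes.
    public static string[] ExtractElements(string input)
    {
        // '[^']*' matches a quote, any run of non-quote characters, then a quote.
        // Commas and whitespace between elements are never matched, so they are skipped.
        return Regex.Matches(input, "'[^']*'")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }
}
```

Calling `QuotedListParser.ExtractElements("'Hello',     'World',  'Hello, World'")` yields the three elements `'Hello'`, `'World'` and `'Hello, World'`, regardless of how much whitespace separates them. Use `m.Value.Trim('\'')` in the `Select` if the quotes should be dropped.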

Ashley Pillay
0

The easiest way would be to use a JSON deserializer:

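    private string[] F(string input)
    {
        // Wrapping the input in [] turns it into a JSON array; Json.NET's reader
        // accepts single-quoted strings. The generic overload returns string[]
        // directly (the non-generic one returns object and would need a cast).
        return Newtonsoft.Json.JsonConvert.DeserializeObject<string[]>("[" + input + "]");
    }

    // now call F (the returned strings no longer carry their quotes):
    var result = F("'Hello','World','Hello,World'");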
nAviD
-2

I tried this approach, but I'm not sure it works in every case.

        const string str = "'Hello', 'World', 'Hello, World'";
        // Splitting on the quote puts the quoted text at the odd indices;
        // the even indices hold the separators (commas and whitespace).
        string[] parts = str.Split('\'');
        List<string> myString = new List<string>();
        for (int i = 1; i < parts.Length; i += 2)
        {
            myString.Add(parts[i]);
        }

Also, I'm not sure whether your strings contain contractions; an apostrophe inside an element would break this approach.
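
One way to gain some confidence in the split-on-quote idea is to validate the pieces between the elements rather than silently skipping them. The sketch below is only an illustration under stated assumptions: the names `QuoteSplitter` and `SplitChecked` are hypothetical, and it assumes the line contains nothing but quoted elements separated by a single comma with optional whitespace (no leading or trailing text).

```csharp
using System;
using System.Collections.Generic;

public static class QuoteSplitter
{
    // Hypothetical helper: returns the elements without their quotes,
    // or throws FormatException if the line does not fit the assumed format.
    public static List<string> SplitChecked(string input)
    {
        string[] parts = input.Split('\'');

        // Balanced quotes give an even number of quote characters,
        // and therefore an odd number of parts.
        if (parts.Length % 2 == 0)
            throw new FormatException("Unbalanced single quote.");

        var elements = new List<string>();
        for (int i = 0; i < parts.Length; i++)
        {
            if (i % 2 == 1)
            {
                // Odd indices hold the quoted text, embedded commas included.
                elements.Add(parts[i]);
                continue;
            }

            // Even indices hold what lies between elements.
            string gap = parts[i].Trim();
            bool isEdge = i == 0 || i == parts.Length - 1;
            bool ok = isEdge ? gap.Length == 0 : gap == ",";
            if (!ok)
                throw new FormatException("Unexpected text between elements: " + parts[i]);
        }
        return elements;
    }
}
```

With this check, an input such as `'Bob's'` is rejected for its unbalanced quote instead of producing wrong elements.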

Zwei James