
I have a string like the one below, which needs to be split into an array using VB.NET:

10,"Test, t1",10.1,,,"123"

The result array must have 6 elements, as below:

10
Test, t1
10.1
(empty)
(empty)
123

So:
1. Quotes around strings must be removed.
2. A comma can appear inside a quoted string and must remain there (row 2 in the result array).
3. Fields can be empty (a comma directly after a comma in the source string, with nothing in between).

Thanks

bzamfir

4 Answers


Don't use String.Split(): it's slow, and it doesn't account for a number of possible edge cases, such as a quoted field that contains the delimiter.

Don't use RegEx. RegEx can be shoe-horned to do this accurately, but to correctly account for all the cases the expression tends to be very complicated, hard to maintain, and at this point isn't much faster than the .Split() option.

Do use a dedicated CSV parser. Options include the Microsoft.VisualBasic.FileIO.TextFieldParser type, FastCSV, linq-to-csv, and a parser I wrote for another answer.
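
As a rough illustration of the TextFieldParser option, here is a minimal sketch that parses the string from the question. The module name CsvDemo and the helper SplitCsvLine are hypothetical names chosen for this example, and it assumes a project that can reference the Microsoft.VisualBasic assembly.

```vb
Imports System
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module CsvDemo
    ' Hypothetical helper: parses one CSV line into its fields.
    Function SplitCsvLine(line As String) As String()
        ' TextFieldParser accepts any TextReader, so a StringReader
        ' lets it read from an in-memory string instead of a file.
        Using parser As New TextFieldParser(New StringReader(line))
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            ' Treat "..." as one field, even if it contains a comma.
            parser.HasFieldsEnclosedInQuotes = True
            Return parser.ReadFields()
        End Using
    End Function

    Sub Main()
        Dim fields = SplitCsvLine("10,""Test, t1"",10.1,,,""123""")
        For Each field In fields
            ' Brackets make the empty fields visible.
            Console.WriteLine("[" & field & "]")
        Next
    End Sub
End Module
```

For the sample input this should yield six fields, with the quotes stripped, the comma kept inside the second field, and the two empty fields returned as empty strings.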

Joel Coehoorn

You can write a function yourself. This should do the trick:

' inputString is assumed to hold the line to split
Dim values As New List(Of String)
Dim currentValueIsString As Boolean = False
Dim valueSeparator As Char = ","c
Dim currentValue As New System.Text.StringBuilder()

For Each c As Char In inputString
    If c = """"c Then
        ' Toggle quoted mode; the quote itself is not part of the value
        currentValueIsString = Not currentValueIsString
    ElseIf c = valueSeparator AndAlso Not currentValueIsString Then
        ' An unquoted separator ends the current field (which may be empty)
        values.Add(currentValue.ToString())
        currentValue.Clear()
    Else
        currentValue.Append(c)
    End If
Next

' Add the last field, which has no trailing separator
values.Add(currentValue.ToString())
Fabian Bigler

Here's another simple way that loops by the delimiter instead of by character:

Public Function Parser(ByVal ParseString As String) As List(Of String)
    Const Quote As Char = """"c
    Const Comma As Char = ","c
    Dim Result As New List(Of String)
    Dim Pos As Integer = 0
    Do
        Dim FieldEnd As Integer
        If Pos < ParseString.Length AndAlso ParseString(Pos) = Quote Then
            ' Quoted field: take everything up to the closing quote
            Dim CloseQuote As Integer = ParseString.IndexOf(Quote, Pos + 1)
            If CloseQuote < 0 Then CloseQuote = ParseString.Length
            Result.Add(ParseString.Substring(Pos + 1, CloseQuote - Pos - 1))
            ' Continue from the comma that follows the closing quote, if any
            FieldEnd = ParseString.IndexOf(Comma, CloseQuote)
        Else
            ' Unquoted (possibly empty) field: runs up to the next comma
            FieldEnd = ParseString.IndexOf(Comma, Pos)
            Dim Last As Integer = If(FieldEnd < 0, ParseString.Length, FieldEnd)
            Result.Add(ParseString.Substring(Pos, Last - Pos))
        End If
        ' No further comma means the last field has been read
        If FieldEnd < 0 Then Exit Do
        Pos = FieldEnd + 1
    Loop
    Return Result
End Function

This returns a list. If you must have an array, just call the ToArray method on the result when you call the function.

tinstaafl

Why not just use the split method?

Dim s As String = "10,""Test, t1"",10.1,,,""123"""
s = s.Replace("""", "")
Dim arr As String() = s.Split(","c)

My VB is rusty so consider this pseudo-code

Mataniko
  • This won't work because it will also split the second "field", "Test, t1", which I don't want. – bzamfir Jun 24 '13 at 22:18
  • It's a start, you definitely need to take care of edge cases yourself or use a library like the other answer suggested. – Mataniko Jun 24 '13 at 22:20
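
To make the objection in the comments concrete, here is a small sketch (the module name SplitDemo is just a placeholder) that applies the strip-quotes-then-split approach to the string from the question. Once the quotes are removed, nothing marks the comma inside "Test, t1" as part of a field, so it is treated as a delimiter.

```vb
Imports System

Module SplitDemo
    Sub Main()
        Dim s As String = "10,""Test, t1"",10.1,,,""123"""

        ' Remove every quote, then split on every comma.
        Dim parts As String() = s.Replace("""", "").Split(","c)

        ' Seven pieces come back instead of the six fields wanted,
        ' because "Test, t1" is cut into "Test" and " t1".
        Console.WriteLine(parts.Length)
        Console.WriteLine("[" & parts(1) & "]")
        Console.WriteLine("[" & parts(2) & "]")
    End Sub
End Module
```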