-1

I need to process a large amount of csv data in real time as it is spat out by a TCP port. Here is an example as displayed by Putty:

MSG,3,1920,742,4009C5,14205994,2017/01/29,20:14:27.065,2017/01/29,20:14:27.972,,8000,,,51.26582,-0.33783,,,0,0,0,0
MSG,4,1920,742,4009C5,14205994,2017/01/29,20:14:27.065,2017/01/29,20:14:27.972,,,212.9,242.0,,,0,,,,,
MSG,1,1920,742,4009C5,14205994,2017/01/29,20:14:27.065,2017/01/29,20:14:27.972,BAW469,,,,,,,,,,,
MSG,3,1920,742,4009C5,14205994,2017/01/29,20:14:27.284,2017/01/29,20:14:27.972,,8000,,,51.26559,-0.33835,,,0,0,0,0
MSG,4,1920,742,4009C5,14205994,2017/01/29,20:14:27.284,2017/01/29,20:14:27.972,,,212.9,242.0,,,0,,,,,

I need to put each line of data in string (line) into an array (linedata[]) so that I can read and process certain elements, but linedata = line.Split(','); seems to ignore the many empty elements, with the result that linedata[20], for example, may or may not exist, and if it doesn't I get an error if I try to read it. Even if element 20 in the line contains a value it won't necessarily be the 20th element in the array. And that's no good.

I can work out how to parse line character by character into linedata[], inserting an empty string where appropriate, but surely there must be a better way? Have I missed something obvious?

Many Thanks. Perhaps I'd better add that I'm quite new to C#, my past experience is all with Delphi 7. I really miss stringlists.

Edited: sorry, this is now resolved with the help of MSDN's documentation. This code works: lineData = line.Split(separators, StringSplitOptions.None); after setting "string[] separators = { "," };". My big mistake was to follow examples found on tutorial sites which didn't give any clues that the .split method had any options.

MikeH
  • please put clearest output data in your question – Ali Jan 30 '17 at 19:02
  • I recommend you search [NuGet](https://www.nuget.org) for a CSV reader, there [are plenty out there](https://www.nuget.org/packages?q=csv). No need to re-write it especially if you are not proficient in c# to begin with. – Igor Jan 30 '17 at 19:06
  • It works for me (each array has 22 entries). Show your actual `Split` code - are you using the `RemoveEmptyEntries` option? – D Stanley Jan 30 '17 at 19:08
  • Igor: Sorry, but I don't want to use a 3rd party component, I want to learn how to use string.split() myself. – MikeH Jan 30 '17 at 21:52
  • @D Stanley: my split code is in my question: "linedata=line.split()". I copied it from several examples on tutorial sites. None of them said anything about options... – MikeH Jan 30 '17 at 22:02
  • @combo_ci: sorry you needed more, but everyone else seems to have understood. – MikeH Jan 30 '17 at 22:04

4 Answers

3

https://msdn.microsoft.com/en-us/library/system.stringsplitoptions(v=vs.110).aspx

That link has an example section; look at example 1b specifically. Split has overloads that take an extra StringSplitOptions parameter, which controls whether empty entries are kept or removed.

For Example:

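    char[] charSeparators = { ',' };
    string[] linedata = line.Split(charSeparators, StringSplitOptions.None);

    // The loop variable must not reuse the name "line", which already holds the input.
    foreach (string field in linedata)
    {
        Console.Write("<{0}>", field);
    }
    Console.Write("\n\n");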

The way to find this sort of information is to start with the Reference Documentation for the function, and hope it has an option or a link to a similar function.

If you want to also start validating types, handling variants in the format etc... you could move up to a CSV library. If you do not need that functionality, this is the easiest way and efficient for small files.
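As a minimal sketch of what the StringSplitOptions argument changes, the program below runs the first sample line from the question through both options. The class name `SplitOptionsDemo` is made up for illustration; only `String.Split` and `StringSplitOptions` come from the framework.

```csharp
using System;

public static class SplitOptionsDemo
{
    // First sample line from the question.
    public const string Sample =
        "MSG,3,1920,742,4009C5,14205994,2017/01/29,20:14:27.065," +
        "2017/01/29,20:14:27.972,,8000,,,51.26582,-0.33783,,,0,0,0,0";

    public static void Main()
    {
        char[] separators = { ',' };

        // None keeps empty fields, so every value stays at its column index.
        string[] all = Sample.Split(separators, StringSplitOptions.None);

        // RemoveEmptyEntries drops them, so later values shift to lower indexes.
        string[] nonEmpty = Sample.Split(separators, StringSplitOptions.RemoveEmptyEntries);

        Console.WriteLine(all.Length);       // 22
        Console.WriteLine(nonEmpty.Length);  // 17
        Console.WriteLine(all[14]);          // 51.26582
        Console.WriteLine(nonEmpty[11]);     // 51.26582 (same value, shifted index)
    }
}
```

With `None`, the latitude is always at index 14 whether or not the earlier fields are filled in, which is what makes reading fields by position safe.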

Jethro
  • thanks. I was naïve enough to follow the examples I found on various sites - including here. None of them mentioned any options. In fact I eventually found the info you reference, got my code working properly, then returned here to give myself the answer only to find that you did so in the meantime. – MikeH Jan 30 '17 at 22:09
  • No worries, glad your process turned out OK. You should accept whoever gave the best answer (using the tick icon next to the answer), to help the next guy with the same question (: – Jethro Jan 31 '17 at 09:29
2

Some of the overloads for String.Split() take a StringSplitOptions argument, and if you use the RemoveEmptyEntries option, it will...remove the empty entries. So you can specify the None option:

    linedata = line.Split(new [] { ',' }, StringSplitOptions.None);

Or better yet, use the overload that doesn't take a StringSplitOptions, which treats it as None by default:

    linedata = line.Split(',');

The code in your question indicates that you are doing this, but your description of the problem suggests that you are not.

However, you're probably better off using an actual CSV parser, which would handle things like unescaping and so on.
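A small sketch of the kind of input where a plain Split falls short: the record below is hypothetical (the sample data in the question contains no quoted fields) and the class name `QuotedFieldDemo` is invented for the example.

```csharp
using System;

public static class QuotedFieldDemo
{
    public static void Main()
    {
        // Hypothetical CSV record with a quoted field that contains a comma.
        string record = "1,\"Smith, John\",London";

        string[] parts = record.Split(',');

        // Split knows nothing about quotes, so the quoted field is cut in two.
        Console.WriteLine(parts.Length);  // 4, although the record has 3 fields
        foreach (string part in parts)
        {
            Console.WriteLine("<{0}>", part);
        }
    }
}
```

If the feed never quotes or escapes anything, Split is enough; a CSV parser only starts to matter once fields like this can appear.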

JLRishe
  • Thanks, my code was indeed "linedata = line.Split(',');". To be honest I'm not even sure what an overload is... But when I changed it to "lineData = line.Split(separators, StringSplitOptions.None);" having defined separators[] as a comma, suddenly my array started filling up properly. I only need to parse a single line at a time, and I want to learn how to do this myself. – MikeH Jan 30 '17 at 22:23
0

The StringReader class provides methods for reading lines, characters, or blocks of characters from a string. Hope this could be the clue

    string str = @"MSG,3,1920,742,4009C5,14205994,2017/01/29,20:14:27.065,2017/01/29,20:14:27.972,,8000,,,51.26582,-0.33783,,,0,0,0,0
                   MSG,4,1920,742,4009C5,14205994,2017/01/29,20:14:27.065,2017/01/29,20:14:27.972,,,212.9,242.0,,,0,,,,,
                   MSG,1,1920,742,4009C5,14205994,2017/01/29,20:14:27.065,2017/01/29,20:14:27.972,BAW469,,,,,,,,,,,
                   MSG,3,1920,742,4009C5,14205994,2017/01/29,20:14:27.284,2017/01/29,20:14:27.972,,8000,,,51.26559,-0.33835,,,0,0,0,0
                   MSG,4,1920,742,4009C5,14205994,2017/01/29,20:14:27.284,2017/01/29,20:14:27.972,,,212.9,242.0,,,0,,,,,";


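    using (StringReader reader = new StringReader(str))
    {
        string row;
        // ReadLine returns null at the end of the input; calling Read() to test
        // for the end would also consume the first character of the next line.
        while ((row = reader.ReadLine()) != null)
        {
            // Trim removes the indentation that the verbatim literal above embeds.
            string[] linedata = row.Trim().Split(',');
        }
    }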
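The same read-a-line-then-split idea can be written against any TextReader, which is a rough sketch of how it might carry over to a live feed. `LineProcessor` and `ProcessLines` are hypothetical names and the two-line feed is shortened test data. As an assumption rather than something shown in the question, a `StreamReader` wrapped around the stream from `TcpClient.GetStream()` could be passed in instead of the `StringReader`, since both derive from `TextReader`.

```csharp
using System;
using System.IO;

public static class LineProcessor
{
    // Reads lines until the reader reports end of input; returns how many were split.
    public static int ProcessLines(TextReader reader)
    {
        int count = 0;
        string row;
        while ((row = reader.ReadLine()) != null)
        {
            if (row.Length == 0) continue; // skip blank lines

            // Split(',') keeps empty fields, so indexes match the columns.
            string[] fields = row.Split(',');
            Console.WriteLine("{0} fields, first = {1}", fields.Length, fields[0]);
            count++;
        }
        return count;
    }

    public static void Main()
    {
        // StringReader stands in for the live feed here.
        string feed = "MSG,1,,\nMSG,3,,8000\n";
        using (var reader = new StringReader(feed))
        {
            ProcessLines(reader);
        }
    }
}
```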
bashkan
  • Thanks, that's an interesting alternative. Actually I am only processing one line at a time. Each line begins with "MSG" and ends with a linefeed. – MikeH Jan 30 '17 at 22:32
0

While you should look into the various ways the String class can help you here, sometimes the quick and dirty "MAKE it fit" option is called for. In this case, that'd be to roll through the strings in advance and ensure you have at least one character between the commas.

    public static string FixIt(string s)
    {
        return s.Replace(",,", ", ,");
    }

You should be able to:

    var lineData = FixIt(line).Split(',');

Edit: In response to the question below, I'm not sure what you meant, but if you mean doing it without creating a helper method, you can do so easily. The code will be harder to read and troubleshoot if you do it in one line though. My personal rule is, if you have to do it a LOT, it should probably be a method. If you only had to do it once, this is particularly clean. I'd actually do it this way and just wrap it in a method that does all the work for you.

    var lineData = line.Replace(",,", ", ,").Split(',');

As a method, it'd be:

    public static string[] GiveMeAnArray(string s)
    {
        return s.Replace(",,", ", ,").Split(',');
    }
CDove