2

Is there a simple way of searching through an array of csv strings and then writing certain bits of the data in one line. Currently It searches through a csv file and pulls out certain data based on two user inputs from TextBoxes, then sorts this and writes it into another csv file.

Essentially, I would like it to write data on the same line if the device name is the same. Lets say a device is called "dev1 id1" and another device is called "dev1 id2". Instead of writing both on separate lines (like it currently does), I would like it to write in on a single combined line. Something like:

dev1,id1,id2

Instead of:

dev1,id1

dev1,id2

I have tried using for loops and if statements but it gets messy quickly. Bellow is my current code (sorry for any typo's due to work reasons i had to retype it).

StreamWriter sw = new StreamWriter(@"c:\test.csv");
StreamReader sr = new StreamReader(@"c:\rtest.csv");

List<string> list = new List<string>();
string line = "Station,Device,Key,AOR";
string sta = textBox1.Text;
string[] devs = richTextBox1.Text.Split(',').Select(dev => dev.Trim()).ToArray();
string[] sort,strs;
bool cont;

sw.WriteLine(line);

while (!sr.EndOfStream)
{
    strs = line.Split(',');
    cont = (devs.Any(s => strs[1].IndexOf(s, StringComparison.OrdinalIgnoreCase) >= 0));
    if (strs[2].ToString() == sta && cont ==true)
    {
        list.Add(line.ToString());
    }
    line = sr.ReadLine();
}
sort = new string[list.Count];
list.CopyTo(sort);
Array.Sort(sort);

foreach (string var in sort)
{
    strs = var.Split(',');
    sw.WriteLine(string.Format("{2},{1},{0},{3}", strs[0], strs[1], strs[2], strs[3]));
}

sw.Close();
if (File.Exists(@"c:\test.csv")
{
    Process.Start(@"c:\test.csv");
}

Hope my question is understandable, thanks.

Mr_Green
  • 40,727
  • 45
  • 159
  • 271
user1852287
  • 97
  • 2
  • 6

3 Answers3

1

You should be using a 3rd party parser to read your CSV file - it would ease your burden.

For example, using KBCsv, your code can look like this (originally written in VB.NET, I apologize for any conversion errors):

Dictionary<string, List<string>> dict = new Dictionary<string, List<string>>();
using (CsvReader reader = new CsvReader("Input.csv")) {
  foreach (string[] record in reader.DataRecordsAsStrings) {
    //assuming each record contains exactly 2 columns, under index 0 and 1
    string key = record[0];
    string value = record[1];
    List<string> targetList = null;
    if (!dict.TryGetValue(key, out targetList))
    {
      targetList = new List<string>();
      dict.Add(key, targetList);
    }
    targetList.Add(value);
  }
}
List<string> output = new List<string>();
foreach (KeyValuePair<string, List<string>> kv in dict)
{
  string outputCsvLine = kv.Key + "," + string.Join(",", kv.Value);
  output.Add(outputCsvLine);
}
System.IO.File.WriteAllLines("output.csv", output);

For writing the output, you might need to consider escaping CSV special characters, such as a comma and a quote, if those are inside your values.

EDIT: if you really wish to avoid any 3rd party components (for security or other reason), you can change lines 2 and 3 in the above code to this:

using (System.IO.StreamReader reader = new System.IO.StreamReader("Input.csv")) {
  foreach (string rawRecord in reader.ReadLine()) {        
    string[] record = rawRecord.Split(",");

Disclaimer: It is a bad practice to work with CSV files like that, but for your particular example it would work. Rest of the code stays the same.

Victor Zakharov
  • 25,801
  • 18
  • 85
  • 151
  • Thanks for the response, though I am unable to use 3rd party software due to restrictions on the computer running the code. Annoying I know. – user1852287 Nov 27 '12 at 02:26
  • @user1852287: you don't need to install anything, so it's hardly a software, just a couple of DLLs. And those can probably be embedded as source in your program - Ms-PL allows this. – Victor Zakharov Nov 27 '12 at 02:28
  • You don't need 3rd party software - there is a CSV parser in the .NET framework; see this question for details http://stackoverflow.com/questions/1898/csv-file-imports-in-net – Paul Keister Nov 27 '12 at 02:38
  • @PaulKeister: it's slow, as far as I remember. KBCsv is the fastest I measured. – Victor Zakharov Nov 27 '12 at 02:40
1

It is not a reliable approach to simply split CSV rows using commas. Fields can contain commas, or even new lines.

As pointed out, there is a library in .NET that can propery read CSV files. Or you can use the code I presented in the article Reading and Writing CSV Files in C#.

Jonathan Wood
  • 65,341
  • 71
  • 269
  • 466
  • In this case the CSV rows are all identical, its just the numbers which change. However thanks for the advice, will be useful in future code. – user1852287 Nov 27 '12 at 02:52
1

Use Linq to clean up your code.

  • GroupBy to get a set of values per key
  • Select to join the values back together.

All you need is something that implements IEnumerable. I have included a very basic version that splits on ",".

The keySelector lambda extracts the key
The valueSelector lambda extracts the value

    [TestMethod]
    public void CsvParser()
    {
        string columns = "Device,Id";
        string source = "Dev1,id1" + Environment.NewLine + "Dev1,id2" + Environment.NewLine + "Dev2,id3" + Environment.NewLine + "Dev2,id4";

        List<string> columnNames = columns.Split(',').ToList();

        int keyIndex = columnNames.IndexOf("Device");
        int valueIndex = columnNames.IndexOf("Id");

        GroupByKey(keyIndex, valueIndex, source);
    }

    private void GroupByKey(int keyIndex, int valueIndex, string source)
    {
        LineReader reader = new LineReader(new StringReader(source));

        Func<string[], string> keySelector = lineItems => lineItems[keyIndex];
        Func<string[], string> valueSelector = lineItems => lineItems[valueIndex];

        List<string> idsByDev = reader
            // .Skip(1)  <-- Uncomment if first row contains headers
            .GroupBy(keySelector, valueSelector, StringComparer.OrdinalIgnoreCase)
            .Select(device => device.Key + "," + string.Join(",", device))
            .ToList()
            ;

        Console.WriteLine( string.Join(Environment.NewLine, idsByDev ));
    }

    public class LineReader : IEnumerable<string[]>
    {
        private readonly TextReader source;

        public LineReader( TextReader source )
        {
            this.source = source;
        }

        public IEnumerator<string[]> GetEnumerator()
        {
            return new LineReaderEnumerator(this.source);
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }

        private class LineReaderEnumerator : IEnumerator<string[]>
        {
            private TextReader source;

            public LineReaderEnumerator(TextReader source)
            {
                this.source = source;
            }

            public void Dispose()
            {
                this.source.Dispose();
            }

            public bool MoveNext()
            {
                // Replace these lines with a good CSV parser
                string line = source.ReadLine();

                if (!String.IsNullOrEmpty(line))
                {
                    this.Current = line.Split(',');
                }
                else
                {
                    this.Current = null;
                }

                return this.Current != null;
            }

            public void Reset()
            {
                throw new NotImplementedException();
            }

            public string[] Current { get; private set; }

            object IEnumerator.Current
            {
                get { return Current; }
            }
        }
    }
Robert Slaney
  • 3,712
  • 1
  • 21
  • 25