I would like to read a CSV file into a dictionary, but most examples I see use the first column as keys. I would like the field names (the header row) to be the keys, with each key mapping to the list of values in that column.

CSV File:

Name,Type,Classification
file1.txt,text,Secondary
file2.txt,text,Primary

Output dictionary (what I would like):

dict = {
     Name: [file1.txt, file2.txt],
     Type: [text, text],
     Classification: [Secondary, Primary]
}

I'm not sure if I can extend this somehow:

    private void LoadCSV()
    {
        var mydict = new Dictionary<string, List<string>>();
        var lines = File.ReadAllLines(@"C:\test.csv");

        string[] columnHeaders = lines[0].Split(',');
        foreach (string columnHeader in columnHeaders)
        {
            mydict[columnHeader] = new List<string>();
        }
    }

Edit:

This may be doing what I want, but is perhaps not very efficient:

private void LoadCSV()
{
    var mydict = new Dictionary<string, List<string>>();
    var lines = File.ReadAllLines(@"C:\test.csv");
    string[] columnHeaders = lines[0].Split(',');
    string[] allLines = lines.Skip(1).ToArray();

    int count = 0;
    foreach (string headerPart in columnHeaders)
    {
        var valuelist = new List<string>();
        foreach (string line in allLines)
        {
            List<string> l = line.Split(',').ToList();
            var element = l.ElementAt(count);
            valuelist.Add(element);
        }
        mydict[headerPart] = valuelist;
        count++;

    }
}
  • Dictionary is a collection of Key-Value pairs. Can you expand what you envision as Values? Your example is in JSON format, which is confusing but may have confused *you* as well. In any event, the value in JSON looks like an array, but your definition in C# code is `string` – Felix Jun 18 '22 at 22:28
  • Sorry I'm not a C# developer (just been tasked with doing something in C#). I've basically shown what a Python dictionary looks like. For example, `Name` is a Key, and `file1.txt` and `file2.txt` are within a list (or array) and make up the Value. –  Jun 18 '22 at 22:30
  • I am not asking a C# question; I am asking a code design question that wouldn't differ between C# and Python. So, the value is an array/list, and you get a new member of this list on every line of the file? Can you reword the question so it is clear to the reader? Also, maybe once you do, the answer will become obvious ;) – Felix Jun 18 '22 at 22:34
  • I only mentioned that I am not a C# developer as I don't know how to convey a dictionary in C#. In the example I've given, `Name`, `Type`, and `Classification` are all Keys. The Values are `[file1.txt, file2.txt]`, `[text, text]`, and `[Secondary, Primary]` which are Lists of items. –  Jun 18 '22 at 22:38
  • Understood. So, first thing - change the value from `string` to `List` and try some code to build the dictionary from the first line in the file where you have all keys, and then add values to the List while going through the rest. You need to show *some* code in order to ask for help. – Felix Jun 18 '22 at 22:45
  • Also, your paragraph after "Edit" significantly changes both **what** you want to do and *how* you would do it. I suggest you decide for yourself what exactly you are trying to do and update the question accordingly. – Felix Jun 18 '22 at 22:54
  • Thanks. I've updated the question and code slightly, and I *think* I've now got a dict with the column headers as keys. However, it's just adding the values that I'm unsure about. –  Jun 18 '22 at 23:28
  • you don't need to read the whole file for each column. take a look at my answer – Felix Jun 19 '22 at 00:41

2 Answers

As an alternative to the imperative style you are using, here is a way to do this with LINQ:

var csvPath = @"C:\test.csv";
var separator = ",";

var lines = File.ReadLines(csvPath);

var indexedKeys = lines
    .First()
    .Split(separator, StringSplitOptions.None)
    .Select((k, i) => 
    (
        Index: i,
        Key: k
    ));

var values = lines
    .Skip(1)
    .Select(l => l.Split(separator, StringSplitOptions.None))
    .ToList();

var result = indexedKeys
    .ToDictionary(
        ik => ik.Key,
        ik => values
            .Select(v => v[ik.Index])
            .ToList());

As for the efficiency of your solution (the code below "Edit" in your post), a basic analysis shows two potential improvements:

`File.ReadAllLines` unnecessarily loads your whole file into memory. Using `File.ReadLines` avoids that, because it reads the file lazily, one line at a time.

You are calling `.Split()` not once per line, but as many times as there are columns. To fix this, your variable `allLines` should already contain the split lines. The solution I propose does that; see the variable `values`.
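For readers who prefer to stay with loops, the two improvements can be combined in a single pass. This is only a minimal sketch: the `CsvColumns` class and its `Load` method are hypothetical names introduced here, and it assumes a simple CSV with no quoted fields or embedded separators.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CsvColumns
{
    // Hypothetical helper (not from the original post): maps each column
    // header to the list of values found in that column.
    // Assumption: simple CSV, no quoted fields or embedded separators.
    public static Dictionary<string, List<string>> Load(string path, char separator = ',')
    {
        var result = new Dictionary<string, List<string>>();
        string[] headers = null;

        // File.ReadLines reads the file lazily, one line at a time.
        foreach (string line in File.ReadLines(path))
        {
            // Each line is split exactly once.
            string[] fields = line.Split(separator);

            if (headers == null)
            {
                // First line: create one empty list per column header.
                headers = fields;
                foreach (string header in headers)
                {
                    result[header] = new List<string>();
                }
                continue;
            }

            // Later lines: append each field to its column's list.
            // Fields beyond the header count are ignored.
            for (int i = 0; i < headers.Length && i < fields.Length; i++)
            {
                result[headers[i]].Add(fields[i]);
            }
        }

        return result;
    }
}
```

With the sample file from the question, `CsvColumns.Load(@"C:\test.csv")["Name"]` would contain `file1.txt` and `file2.txt`.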

Laurent Gabiot
  • Thanks for this. I'm having a couple issues with your code. I had to change from `"` to `'` for the separator. However, having issues with the `.Select` method. Is it working for you? –  Jun 19 '22 at 09:29
  • I added the `StringSplitOptions` argument to the `Split()` call. If you are not using a recent version of .NET, it might be mandatory. In .NET 6 you can remove it, as shown in the method signature: `public string[] Split (string? separator, StringSplitOptions options = System.StringSplitOptions.None);` I also fixed the tuple declaration, as I had written it incorrectly. The `Select()` call should be fine now. I probably should have used an anonymous object instead: `new { Index = i, Key = k}` – Laurent Gabiot Jun 19 '22 at 10:04
  • Ah okay perfect thank you. Will try implementing this tonight. My specifications have slightly changed but it will be good to see if I can get this working regardless. –  Jun 19 '22 at 11:31
  • I liked (and upvoted) this solution; however, for a new C# developer, advanced LINQ might be confusing. For example, `Select` with two parameters and `ToDictionary()`. I think non-LINQ options have better teaching value. But both are valid! – Felix Jun 19 '22 at 20:49

Good start! Now that you have the column headers, you can build the dictionary. You need to keep track of which column number matches which column name:

string[] columnHeaders = lines[0].Split(',');

// do this for each line in the file
string[] columnRows = lines.Split(',');

for (int n = 0; n++; n <= columnRows.Length)
{
    mydict[columnHeaders[n]] = mydict[columnHeaders[n]].Add(columnRows[n]);
}

This way, you add each new value to the List stored in the Dictionary under the matching column header.
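The snippet above is an outline rather than compilable code. One possible way to fill it in is sketched below, assuming the whole file is already in a `lines` array with the header row first and at least one line present. The `ColumnBuilder` class and `Build` method are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ColumnBuilder
{
    // Hypothetical helper: `lines` holds the whole file, header row first.
    // Assumption: `lines` contains at least the header row.
    public static Dictionary<string, List<string>> Build(string[] lines)
    {
        var mydict = new Dictionary<string, List<string>>();
        string[] columnHeaders = lines[0].Split(',');

        // Start every column with an empty list.
        foreach (string columnHeader in columnHeaders)
        {
            mydict[columnHeader] = new List<string>();
        }

        // Skip(1) leaves out the header row itself.
        foreach (string line in lines.Skip(1))
        {
            string[] columnRows = line.Split(',');

            // Strict upper bound, capped at the header count for long rows.
            int columns = Math.Min(columnHeaders.Length, columnRows.Length);
            for (int n = 0; n < columns; n++)
            {
                // List<T>.Add returns void, so nothing is assigned back.
                mydict[columnHeaders[n]].Add(columnRows[n]);
            }
        }

        return mydict;
    }
}
```

It could be called as `ColumnBuilder.Build(File.ReadAllLines(@"C:\test.csv"))`, which needs `using System.IO;` at the call site.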

Felix
  • Beware, there are a few problems here: the first line is not discarded, and the for loop goes one step too far. I don't know where the dictionary is defined, but `mydict[columnHeaders[n]] = mydict[columnHeaders[n]].Add(columnRows[n]);` should rather be `mydict[columnHeaders[n]].Add(columnRows[n]);` – Laurent Gabiot Jun 19 '22 at 05:31
  • Thanks. This looks a lot more efficient than my code. I'll play around with it tonight. –  Jun 19 '22 at 11:32
  • @LaurentGabiot - I left some details as an exercise for OP. – Felix Jun 19 '22 at 20:39