Populating a 2D Matrix of List from csv

Question

I have a matrix of routes stored in a csv file that was exported from a pandas dataframe. The content of this csv looks like:

,hA,hB,hC
hA,[],["hA","hB"],["hA","hB","hC"]
hB,["hB","hA"],[],["hB","hC"]
hC,["hC","hB","hA"],["hC","hB"],[]

From this file I would like to generate a matrix in c#, so I could get the route from hA to hC with something like:

routes["hA"]["hC"]

I can achieve this by manually building a Dictionary<string, Dictionary<string, List<string>>> like:

Dictionary<string, Dictionary<string, List<string>>> routes = new Dictionary<string, Dictionary<string, List<string>>>(){
                {"hA", new Dictionary<string, List<string>>(){ { "hA", new List<string>() }, {"hB", new List<string>() { "hA", "hB" }}, { "hC", new List<string>() { "hA", "hB", "hC" }}}},
                { "hB", new Dictionary<string, List<string>>() { { "hA", new List<string>() { "hB", "hA" }}, { "hB", new List<string>() { } }, { "hC", new List<string>() { "hB", "hC" }}}},
                { "hC", new Dictionary<string, List<string>>() { { "hA", new List<string>() { "hC", "hB", "hA" }}, { "hB", new List<string>() { "hC", "hB" }}, { "hC", new List<string>() { } }}}
            };

But if the size of the matrix increases, or every time a route changes, a lot of rework is involved. That is why I would like to populate the routes matrix from the csv directly.

Is there any way of populating this matrix from a csv? Or is there a better type of collection to store these routes than Dictionary<string, Dictionary<string, List<string>>>?

At even this level of "dictionary of dictionary of list" nesting I'd be looking for a different data storage structure — Caius Jard, Nov 12 '21 at 13:28
@CaiusJard thanks, I also feel there must be something simpler. I was working with this in a pandas dataframe in python and it is really straightforward there, but when trying to get the equivalent in c#, I couldn't come up with anything better. I am open to suggestions from anyone with more experience than me in c# — Ignacio Alorre, Nov 12 '21 at 13:30
*it is really straight forward there* - have you got an example of how you modeled it there, and why it was so simple for the use case? Some of us know pandas.. — Caius Jard, Nov 12 '21 at 13:30
I agree with Caius, i think there looks more like a dictionary of lists arrays... `Dictionary>`from the data snip bit — Seabizkit, Nov 12 '21 at 13:31
well staying that that structure doesn't seem to make sense, and when it doesn't make sense you will have a hard time trying to make sense of it. — Seabizkit, Nov 12 '21 at 13:33
@CaiusJard the csv was provided. It was generated with a graph algorithm. But to read it and generate the pandas dataframe is just pd.read_csv("routes_mtrx.csv", index_col=0) — Ignacio Alorre, Nov 12 '21 at 13:35
in general I think what you want would be not a `Dictionary` which links one key to a value but rather a [Multi-key Dictionary](https://stackoverflow.com/questions/1171812/multi-key-dictionary-in-c/15804355) which then can simply link two keys to a certain value (your list) so you could probably use something like `Dictionary, List>` — derHugo, Nov 12 '21 at 14:11

Caius Jard · Answer 1 · 2021-11-12T14:00:01.233

Oof.. I think I'd read that CSV with a parser library set to use [ and ] as "quote" chars, but this will read it simplistically:

var lines = File.ReadAllLines(path);

var cols = lines[0].Split(',');

var frame = new Dictionary<string, Dictionary<string, string[]>>();

foreach(var line in lines.Skip(1)){

  var bits = line.Replace("]", "").Split(",[");

  var row = bits[0];

  for(int i = 1; i < cols.Length; i++){
    var col = cols[i];
    
    frame.TryAdd(row, new Dictionary<string, string[]>());

    frame[row][col] = bits[i].Split(',').Select(s => s.Trim('"')).ToArray(); 
  }
}

That should deliver you your nested dictionaries so you can address them like you would a dataframe.. By the way, I don't know what happens if you ask a dataframe for something that isn't there, but c# would throw a KeyNotFoundException if you asked for eg frame["hZ"]["hello"] ..
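If a missing key should not throw, one option is to go through TryGetValue at both levels. The following is only a sketch: the helper name TryGetRoute and the sample data are made up for illustration and are not part of the code above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample data standing in for the frame built from the CSV.
var frame = new Dictionary<string, Dictionary<string, string[]>>
{
    ["hA"] = new Dictionary<string, string[]>
    {
        ["hB"] = new[] { "hA", "hB" },
    },
};

// Illustrative helper: returns true and sets route only when both keys exist.
// It never throws KeyNotFoundException; on a miss, route is an empty array.
bool TryGetRoute(string from, string to, out string[] route)
{
    route = Array.Empty<string>();
    if (frame.TryGetValue(from, out var row) && row.TryGetValue(to, out var found))
    {
        route = found;
        return true;
    }
    return false;
}

Console.WriteLine(TryGetRoute("hA", "hB", out var ok) ? string.Join("->", ok) : "no route"); // hA->hB
Console.WriteLine(TryGetRoute("hZ", "hello", out _) ? "found" : "no route");                 // no route
```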

If you want the innermost storage container to be a List you can swap the ToArray to be ToList

You perhaps don't need to nest, by the way:

var frame = new Dictionary<(string, string), string[]>();

foreach(var line in lines.Skip(1)){

  var bits = line.Replace("]", "").Split(",[");

  var row = bits[0];

  for(int i = 1; i < cols.Length; i++){
    var col = cols[i];
    
    frame[(row, col)] = bits[i].Split(',').Select(s => s.Trim('"')).ToArray(); 
  }
}

It could be queried like frame[("hA","hB")]
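As for the idea of treating [ and ] as "quote" chars mentioned at the top of this answer, a hand-rolled version might look like the sketch below. It is not a real CSV parser library; SplitOutsideBrackets and ParseCell are names invented here, and the sample lines are copied from the question rather than read from a file. Unlike the simple Split version, an empty cell [] comes back as an empty array rather than an array holding one empty string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Splits a line on commas, ignoring commas that sit between '[' and ']'.
List<string> SplitOutsideBrackets(string line)
{
    var fields = new List<string>();
    int depth = 0, start = 0;
    for (int i = 0; i < line.Length; i++)
    {
        char c = line[i];
        if (c == '[') depth++;
        else if (c == ']') depth--;
        else if (c == ',' && depth == 0)
        {
            fields.Add(line.Substring(start, i - start));
            start = i + 1;
        }
    }
    fields.Add(line.Substring(start)); // last field has no trailing comma
    return fields;
}

// Turns a cell such as ["hA","hB"] into its elements; [] yields an empty array.
string[] ParseCell(string cell) =>
    cell.Trim().Trim('[', ']')
        .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
        .Select(s => s.Trim().Trim('"'))
        .ToArray();

// Sample lines from the question; a real program would use File.ReadAllLines(path).
var lines = new[]
{
    ",hA,hB,hC",
    "hA,[],[\"hA\",\"hB\"],[\"hA\",\"hB\",\"hC\"]",
    "hB,[\"hB\",\"hA\"],[],[\"hB\",\"hC\"]",
    "hC,[\"hC\",\"hB\",\"hA\"],[\"hC\",\"hB\"],[]",
};

var cols = lines[0].Split(',');
var frame = new Dictionary<(string, string), string[]>();

foreach (var line in lines.Skip(1))
{
    var fields = SplitOutsideBrackets(line);
    for (int i = 1; i < cols.Length; i++)
        frame[(fields[0], cols[i])] = ParseCell(fields[i]); // key is (row, column)
}

Console.WriteLine(string.Join(" -> ", frame[("hA", "hC")])); // hA -> hB -> hC
Console.WriteLine(frame[("hA", "hA")].Length);               // 0
```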

thanks, I will try this one. Still, is there a better way to achieve the same? I was considering as an option an array of array of List, and with the help of a Map translate the index from string to integer. Like "hA" -> 0, "hB"-> 1... would that option be more efficient? — Ignacio Alorre, Nov 12 '21 at 13:57
I'd need to see a more detailed CSV; all the row and column names in the one you provided were unique, so you shouldn't have any issues with repeated keys in the sample.. but if column or rows aren't unique and you want it to be additive (eg duplicate row names cause the cells to get more entries) let me know — Caius Jard, Nov 12 '21 at 15:35
Made it work, wrote a solution based on your approach, just changing a few things. Thanks again for the help! — Ignacio Alorre, Nov 12 '21 at 15:35

score 1 · Answer 2 · answered Nov 12 '21 at 15:15

Turn your node names (ie, hA, hB, hC) into an enum:

enum Indexer {
    hA = 0,
    hB = 1,
    hC = 2
}

Use a two-dimensional array of lists:

List<string>[,] Matrix = new List<string>[3,3];

Access the data out of the Matrix:

List<string> path = Matrix[(int)Indexer.hA, (int)Indexer.hC];

If you need to, you can convert the text-based node names back to an enum:

var n = (Indexer)Enum.Parse(typeof(Indexer), "hA");

This assumes that you are importing a csv of pre-defined node names. Let me know if the node names can't be pre-defined and I'll update the answer.
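If the node names can't be pre-defined, a Dictionary<string, int> built at runtime can play the role of the enum. This is a rough sketch under that assumption: the names array stands in for the CSV header row, and the variable names index and matrix are only illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Node names as they would appear in the CSV header (assumed already split,
// with the empty first cell dropped).
var names = new[] { "hA", "hB", "hC" };

// Map each name to its position: "hA" -> 0, "hB" -> 1, "hC" -> 2.
var index = names
    .Select((name, i) => (name, i))
    .ToDictionary(p => p.name, p => p.i);

var matrix = new List<string>[names.Length, names.Length];

// Fill every cell so lookups never return null; real data would come from the file.
for (int r = 0; r < names.Length; r++)
    for (int c = 0; c < names.Length; c++)
        matrix[r, c] = new List<string>();

// One hand-written cell for the demo: row = origin, column = destination.
matrix[index["hA"], index["hC"]] = new List<string> { "hA", "hB", "hC" };

List<string> path = matrix[index["hA"], index["hC"]];
Console.WriteLine(string.Join(" -> ", path)); // hA -> hB -> hC
```

The array lookup itself is a plain index access, but each name still costs one dictionary lookup, so whether this beats a tuple-keyed dictionary in practice would need measuring.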

Thanks! I believe it can work. I was able to find a solution using Dictionaries, but this one may work as well — Ignacio Alorre, Nov 12 '21 at 15:28

Ignacio Alorre · Answer 3 · 2021-11-12T15:40:50.467

Based on @CaiusJard and @derHugo suggestions

I needed to modify the original csv file a little to make it easier to read: I removed the first column (which contained the index) and used ";" as the column separator: df_routes.to_csv("routes_mtrx_headers.csv", sep = ';', index = False)

The final solution is

var route_dictionary = new Dictionary<(string, string), string[]>();

using (var reader = new StreamReader(@"C:/mypath/routes_mtrx_headers.csv"))
{
    string[] locations = reader.ReadLine().Split(';');
    int rowIdx = 0;
    int colIdx = 0;

    while (!reader.EndOfStream)
    {
        var row = reader.ReadLine();
        var columns = row.Split(';');
        colIdx = 0;

        foreach (var col in columns)
        {
            // Removing the List wrapper
            var path = col.Replace("]", "").Replace("[", "").Split(',');
            // Key is (row, column), i.e. (origin, destination), so ("hA", "hC") is the route from hA to hC
            route_dictionary.Add((locations[rowIdx], locations[colIdx]), path);
            colIdx += 1;
        }
        rowIdx += 1;
    }
}

// Finally to access each element in the matrix
var route = route_dictionary[("hA", "hC")];
