2

Say I have a csv file, example.csv, that looks like this (double quotes added by excel):

Id,Name,requestJson
12345,Albert,"{
    ""latitude"": -43.518703,
    ""longitude"": -71.69634,
    ""tags"": [
      ""aliqua"",
      ""ad"",
      ""dolor"",
      ""culpa"",
      ""sunt"",
      ""consequat"",
      ""irure""
    ],
    ""friends"": [
      {
        ""id"": 0,
        ""name"": ""Bryan Montoya""
      },
      {
        ""id"": 1,
        ""name"": ""Marcella Tillman""
      },
      {
        ""id"": 2,
        ""name"": ""Leola Calderon""
      }
    ],
    ""greeting"": ""Hello, undefined! You have 7 unread messages."",
    ""favoriteFruit"": ""strawberry""
  }"

The RequestJson would deserialize into the below objects:

public class Friend
{
    public int id { get; set; }
    public string name { get; set; }
}

public class Request
{
    public double latitude { get; set; }
    public double longitude { get; set; }
    public List<string> tags { get; set; }
    public List<Friend> friends { get; set; }
    public string greeting { get; set; }
    public string favoriteFruit { get; set; }
}

My attempt starts with reading example.csv, skipping the headers, then passing the string array values to FromCsv to split it into the 3 attributes within the Request object.

public static List<Request> LoadFiles()
{
    List<Request> requests = File.ReadAllLines("./example.csv")
                                    .Skip(1)
                                    .Select(v => FromCsv(v))
                                    .ToList();
    return requests;
}

Here I am using the array indices because I know where the first two elements are. The problem is, when I try to retrieve values[2], the split delimiter has failed as there are escape characters and commas in the json.

public static Request FromCsv(string csvLine)
{
    string[] values = csvLine.Split(',');
    Request request = new Request
    {
        Id = values[0],
        Name = values[1],
        Request = JsonConvert.DeserializeObject<Request>(values[2])
    };
    return request;
}

How can I parse the RequestJson column into my desired Request Object?


  • 1
    Try to split into a maximum number of substrings (in this example `3`: `ID`,`Name`,`JSON`), because you do not want to split the JSON , see: https://learn.microsoft.com/en-us/dotnet/api/system.string.split?view=net-5.0#System_String_Split_System_Char___System_Int32_ – Luuk Sep 09 '21 at 17:49
  • @Luuk Thanks, updated my post with the solution. – supernovascotia Sep 09 '21 at 18:23
  • 1
    @supernovascotia don't add answers to your question. Post an answer instead. – Pranav Hosangadi Sep 09 '21 at 19:47
  • Does this answer your question? [Reading CSV files using C#](https://stackoverflow.com/questions/3507498/reading-csv-files-using-c-sharp) – Charlieface Sep 10 '21 at 00:33
  • 1
    Splitting on commas and trying to parse it yourself is going to end very badly. Use a proper CSV parser, such as `Microsoft.VisualBasic.FileIO.TextFieldParser` – Charlieface Sep 10 '21 at 00:34
  • @supernovascotia Please do not amend your question with the solution. Please leave a post and mark it as the answer. – Peter Csala Sep 10 '21 at 07:43
  • Use csvhelper to read csv properly and then you can deserilize the json object – Rahul Shukla Sep 10 '21 at 16:18

1 Answers1

0

Thanks @luuk. Got a working solution.

The solution was to provide a max number of splits. Then remove all the double quotes that csv's input (CsvColumnToJson)

Solution:

public static string GetJson (string csvLine)
{
    string[] values = csvLine.Split(',', 3);
    return CsvColumnToJson(values[2]);       
}

/* Csv json formatting escapes double quotes with more double quotes and adds double quotes to the beginnning and end of the Json */
public static string CsvColumnToJson(string csvColumn)
{
    var duplicatedDoubleQuotesRemoved = csvColumn.Replace("\"\"", "\"");
    return duplicatedDoubleQuotesRemoved.Substring(0, duplicatedDoubleQuotesRemoved .Length - 1).Substring(1);
}
  • You should mark your answer as THE answer (to let others know that this solves your problem, without the need to actually read it). (If you do not know how to do it, see: https://stackoverflow.com/help/someone-answers ) – Luuk Sep 10 '21 at 15:49
  • While this seems to solve your issue, I'd advocate, as other did here, to use CSV parser, which will handle most edge cases for you. Including escaping, multilining, .... CSV isn't standardized and there can be plenty of case that require custom handling. Good example [here](https://stackoverflow.com/a/3508572/588868) – Steve B Sep 10 '21 at 16:03
  • I'd love to use an out of the box parser, but my issue was with the json text column. I couldn't find a way to parse the json and delimit the csv at the same time. This isn't a perfect solution, but since I reliably know what my input data will be, it serves my purposes. – supernovascotia Sep 13 '21 at 18:25