
I currently have a CSV file along these lines:

"NAME","AGE","SEX"
"FRED, JONES","45","MALE"
"SALLY, SMITH","60","FEMALE"

I use the following code to serialize it to JSON:

var linesCSV = System.IO.File.ReadAllLines(targetFile); //target file is the csv

var csv = linesCSV.Select(l => l.Split(',')).ToList();

var headers = csv[0];
var dicts = csv.Skip(1).Select(row => Enumerable.Zip(headers, row, System.Tuple.Create).ToDictionary(p => p.Item1, p => p.Item2)).ToArray();

string json = new System.Web.Script.Serialization.JavaScriptSerializer().Serialize(dicts);

jsWrtr.WriteLine(json);

The output looks like this:

[{
  "\"NAME\"" : "\"FRED\"",
  "\"AGE\"" : "\"JONES\"",
  "\"SEX\"" : "\"45\""
},
{
  "\"NAME\"" : "\"SALLY\"",
  "\"AGE\"" : "\"SMITH\"",
  "\"SEX\"" : "\"60\""
}]

You can see that the NAME value gets split in two, and the second part (everything after the comma) ends up in the next field, shifting every later value one column to the right.

This is obviously caused by the comma inside the quoted value. My question is: how do I parse the CSV so that it outputs the following instead?

[{
   "NAME" : "FRED, JONES",
   "AGE" : "45",
   "SEX" : "MALE"
 },
 {
   "NAME" : "SALLY, SMITH",
   "AGE" : "60",
   "SEX" : "FEMALE"
 }]
thatOneGuy
  • You are splitting the fields on a comma, but shouldn't you be splitting on the double quotes instead? – Veverke May 25 '16 at 15:03
  • You can use [CsvHelper library](https://joshclose.github.io/CsvHelper/) with custom map. Not the simplest solution but it has some advantages. – Fabjan May 25 '16 at 15:25

2 Answers


As a workaround, you could split on "," (quote, comma, quote) rather than on the comma alone, then trim the remaining double quotes from the first and last fields. This leaves FRED, JONES as a single field. You would have to add the quotes back if they are required in the output, however.
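
A minimal sketch of that workaround, using only the base class library. The names QuotedCsv, SplitLine and ToDictionaries are hypothetical, and the code assumes that every field is quoted, that fields are separated by exactly "," with no surrounding spaces, and that no field contains an escaped quote or a line break. It strips exactly one quote from each end instead of calling Trim, so that an empty trailing field survives.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QuotedCsv
{
    // Splits one fully quoted line such as "A","B, C","D" into its fields.
    // Assumption: every field is quoted and the separator is exactly ",".
    public static string[] SplitLine(string line)
    {
        var s = line.Trim();

        // Remove exactly one leading and one trailing quote, if both are present.
        if (s.Length >= 2 && s[0] == '"' && s[s.Length - 1] == '"')
            s = s.Substring(1, s.Length - 2);

        // Split on the three-character sequence quote-comma-quote.
        return s.Split(new[] { "\",\"" }, StringSplitOptions.None);
    }

    // Treats the first non-blank line as the header row and pairs each
    // later row's values with the header names.
    public static List<Dictionary<string, string>> ToDictionaries(IEnumerable<string> lines)
    {
        var rows = lines.Where(l => !string.IsNullOrWhiteSpace(l))
                        .Select(l => SplitLine(l))
                        .ToList();
        var headers = rows[0];

        return rows.Skip(1)
                   .Select(row => headers.Zip(row, (h, v) => new { Header = h, Value = v })
                                         .ToDictionary(p => p.Header, p => p.Value))
                   .ToList();
    }
}
```

The resulting list can then be passed to the same serializer call used in the question in place of dicts. For CSV that may contain escaped quotes or unquoted fields, a dedicated CSV parser is the safer choice.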

datalife

You can split on the quoted delimiter (quote, comma, quote) instead of on the bare comma, after trimming the leading and trailing double quote from each line.

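    // Sample input. Note that these lines separate fields with ", " (comma plus space),
    // unlike the CSV in the question, which uses "," with no space.
    List<string> lines = new List<string>
    {
        "\"NAME\", \"AGE\", \"SEX\"",
        "\"FRED, JONES\", \"45\", \"MALE\"",
        "\"SALLY, SMITH\", \"60\", \"FEMALE\""
    };

    foreach (var line in lines.Skip(1))
    {
        // Strip the outer quotes, then split on the quote-comma-space-quote delimiter.
        // For input without the space, split on "\",\"" instead.
        var fields = line.Trim(new char[] { '"' }).Split(new string[] { "\", \"" }, StringSplitOptions.None);

        foreach (var field in fields)
            Console.WriteLine(field.Trim());

        Console.WriteLine();
    }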

This extracts the fields properly, and you can then move on to the JSON serialization.


Update:

Here's an update covering the JSON serialization, which gives you output like the one you want:

    // Requires: using Newtonsoft.Json;
    var results = new List<Entry>();

    foreach (var line in lines.Skip(1))
    {
        var fields = line.Trim(new char[] { '"' }).Split(new string[] { "\", \"" }, StringSplitOptions.None);

        // Map the three fields, in order, onto an Entry.
        Entry entry = new Entry { Name = fields.FirstOrDefault(), Age = fields.Skip(1).FirstOrDefault(), Sex = fields.LastOrDefault() };
        results.Add(entry);
    }

    var json = JsonConvert.SerializeObject(results);

Note that, for simplicity, I created a class named Entry that contains three strings, one for each field. You may want to use different types (an integer for the age, for example), in which case you will need to parse the values accordingly.
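
The answer does not show the Entry class itself, so the following is only a sketch of what it might look like, inferred from the property names used in the loop above:

```csharp
// Hypothetical shape of the Entry class: one string property per CSV column.
public class Entry
{
    public string Name { get; set; }
    public string Age { get; set; }
    public string Sex { get; set; }
}
```

With this shape, Json.NET writes the property names as declared (Name, Age, Sex). To get the upper-case keys from the question, each property could be decorated with Json.NET's JsonProperty attribute, for example [JsonProperty("NAME")].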

Also note that I use Newtonsoft's Json.NET NuGet package for serialization, whereas you seem to be using something else. Unless you need to stick with your current library, I recommend the widely used Newtonsoft package.


Veverke