2

Does anyone know how to convert the below nested JSON to CSV via CHOETL (An ETL framework for .NET)? Thank you!

I'm using this code but it will only return the first equipment record.

CODE:

                 {
                     using (var json = new ChoJSONReader("./test.json"))
                     {
                         csv.Write(json.Cast<dynamic>().Select(i => new
                         {
                             EquipmentId = i.GpsLocation.Equipment[0].EquipmentId,
                             InquiryValue = i.GpsLocation.Equipment[0].InquiryValue,
                             Timestamp = i.GpsLocation.Equipment[0].Timestamp

                         }));
                     }
                 }

JSON:

    "GpsLocation": {
        "Equipment": [
            {
                "EquipmentId": "EQ00001",
                "InquiryValue": [
                    "IV00001"
                ],
                "Timestamp": "2020-01-01 01:01:01.01",
            },
            {
                "EquipmentId": "EQ00002",
                "InquiryValue": [
                    "IV00002"
                ],
                "Timestamp": "2020-01-01 01:01:01.01"
            }
        ]
    }
}````
  • 1
    You're only doing something with i.GpsLocation.Equipment[0], which is indeed the first Equipment record. I would suspect you want to loop over _all_ the records. – oerkelens Aug 12 '20 at 16:37

3 Answers3

1

As others suggest, the issue is you are only looking at the first element of the array.

It appears that the easiest way to control what you serialise into CSV is by correctly defining your source objects from JSON. JSON Path expressions come in pretty handy.

What I ended up doing here is query all JSON to return an array of Equipment objects regardless of where they are in the hierarchy (which means you may need to filter it a bit better depending on your full JSON). Then it's pretty easy to define each field based on JSON path and just pass the result to CSVWriter.

Also check out some gotchas that I outlined in the respective comment lines.

void Main()
{
    var jsonString = "{\"GpsLocation\":{\"Equipment\":[{\"EquipmentId\":\"EQ00001\",\"InquiryValue\":[\"IV00001\"],\"Timestamp\":\"2020-01-01 01:01:01.01\"},{\"EquipmentId\":\"EQ00002\",\"InquiryValue\":[\"IV00002\"],\"Timestamp\":\"2020-01-01 01:01:01.01\"}]}}";
    var jsonReader = new StringReader(jsonString);
    var csvWriter = new StringWriter(); // outputs to string, comment out if you want file output
    //var csvWriter = new StreamWriter(".\\your_output.csv"); // writes to a file of your choice
    using (var csv = new ChoCSVWriter(csvWriter))
    using (var json = new ChoJSONReader(jsonReader)
                        .WithJSONPath("$..Equipment[*]", true) // firstly you scope the reader to all Equipment objects. take note of the second parameter. Apparently you need to pass true here as otherwise it just won't return anythig
                        .WithField("EquipmentId", jsonPath: "$.EquipmentId", isArray: false) // then you scope each field in the array to what you want it to be. Since you want scalar values, pass `isArray: false` for better predictability
                        .WithField("InquiryValue", jsonPath: "$.InquiryValue[0]", isArray: false) // since your InquiryValue is actually an array, you want to obtain first element here. if you don't do this, fields names and values would go askew
                        .WithField("Timestamp", jsonPath: "$.Timestamp", fieldType: typeof(DateTime), isArray: false)) // you can also supply field type, otherwise it seems to default to `string`
    {   
        csv.WithFirstLineHeader().Write(json);
    }
    Console.WriteLine(csvWriter.GetStringBuilder().ToString()); // comment this out if writing to file - you won't need it
}

Update summary:

  1. Pivoted to update the code to rely on JSON Path scoping - this seems to allow for field name manipulation with pretty low effort
  2. Looking at your comment, you could probably simplify your file writer a little bit - use StreamWriter instead of StringWriter - see updated code for example
timur
  • 14,239
  • 2
  • 11
  • 32
  • Awesome! Thank you very much. But one more thing, the code works great but it doesn't include the titles for each column (EquipmentId,InquiryValue,Timestamp). I'm saving the output to a CSV file using the code below. `string csvOutput = (csvWriter.GetStringBuilder().ToString()); File.WriteAllText("./foo.csv", csvOutput.ToString());` – Juan De La Cruz Aug 13 '20 at 16:15
1

Here is the working sample of producing CSV from your JSON

string json = @"{
""GpsLocation"": {
        ""Equipment"": [
            {
                ""EquipmentId"": ""EQ00001"",
                ""InquiryValue"": [
                    ""IV00001""
                ],
                ""Timestamp"": ""2020-02-01 01:01:01.01"",
            },
            {
                ""EquipmentId"": ""EQ00002"",
                ""InquiryValue"": [
                    ""IV00002""
                ],
                ""Timestamp"": ""2020-01-01 01:01:01.01""
            }
        ]
    }
}";

StringBuilder csv = new StringBuilder();

using (var r = ChoJSONReader.LoadText(json)
    .WithJSONPath("$.GpsLocation.Equipment")
    .WithField("EquipmentId")
    .WithField("InquiryValue", jsonPath: "InquiryValue[0]", fieldType: typeof(string))
    .WithField("Timestamp", fieldType: typeof(DateTime))
    )
{
    using (var w = new ChoCSVWriter(csv)
        .WithFirstLineHeader())
        w.Write(r);
}
Console.WriteLine(csv.ToString());

Output:

EquipmentId,InquiryValue,Timestamp
EQ00001,IV00001,2/1/2020 1:01:01 AM
EQ00002,IV00002,1/1/2020 1:01:01 AM

Sample fiddle: https://dotnetfiddle.net/hJWtqH

Cinchoo
  • 6,088
  • 2
  • 19
  • 34
0

Your code is sound, but the issue is that you're only writing the first variable in the array by using i.GpsLocation.Equipment[0]. Instead, try looping over everything by putting it into a for loop, and changing the [0] to your iterating variable inside of said loop.