3

Is there a built-in way (or a trick) for parsing only valid objects and ignoring invalid ones?


Not a duplicate

The question "Ignoring an invalid field when deserializing json in Json.Net" does not answer my question because it's about a custom serializer for a very specific field of date-time type. I'm seeking a generic solution that works for any property and any object.

In other words, if anything is invalid, just ignore it and continue to the next entry. As far as JSON is concerned, the file is correct, but the content might not match the expected types in some places. It can be anything.


Background: The file contains an array of many workflows and a single damaged entry should not break the entire configuration and virtually disable them all.


Here's an example demonstrating what I mean. Let's say I have an array of Users, but one entry uses an array instead of a string for the Name (it might be any combination of invalid values, such as an object where an array is expected).

I'd like to deserialize this array and ignore entries that couldn't be deserialized. This means that the expected result should be two users, John & Tom.

I tried to use the Error handler but it does not work this way. It doesn't allow me to skip the errors.

void Main()
{
    var json = @"
    [
        {
            'Name': 'John',
        },
        {
            'Name': [ 'John' ]
        },
        {
            'Name': 'Tom',
        },
    ]   
    ";

    var users = JsonConvert.DeserializeObject<IEnumerable<User>>(json, new JsonSerializerSettings
    {
        Error = (sender, e) =>
        {
            e.Dump();
            e.ErrorContext.Handled = true;
            e.CurrentObject.Dump();
        }
    }).Dump();
}

class User
{
    public string Name { get; set; }
}
t3chb0t

3 Answers

5

I solved it this way. It's not elegant.

var users = JsonConvert.DeserializeObject<IEnumerable<Object>>(json);

var usersList = users
    .Select(x =>
    {
        // Round-trip each element through JSON; entries that fail to deserialize become null.
        try { return JsonConvert.DeserializeObject<User>(JsonConvert.SerializeObject(x)); }
        catch { return null; }
    })
    .Where(x => x != null)
    .ToList();
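
A variation on the same idea, sketched under the assumption that the top-level JSON value is an array: parse it once with JArray.Parse and convert each element with JToken.ToObject<T>(), which avoids the serialize/deserialize round trip. The helper name DeserializeValidItems and its onError callback are hypothetical and not part of Json.NET; the User class mirrors the one in the question.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public class User
{
    public string Name { get; set; }
}

public static class LenientJson
{
    // Hypothetical helper: converts each array element on its own and
    // skips elements that Json.NET cannot convert to T.
    public static List<T> DeserializeValidItems<T>(
        string json,
        Action<JToken, JsonException> onError = null)
    {
        var result = new List<T>();

        // JArray.Parse still throws if the text is not a valid JSON array;
        // only per-element conversion errors are swallowed below.
        foreach (JToken token in JArray.Parse(json))
        {
            try
            {
                T item = token.ToObject<T>();
                if (item != null)
                {
                    result.Add(item);
                }
            }
            catch (JsonException ex)
            {
                // Give the caller a chance to log the damaged entry.
                onError?.Invoke(token, ex);
            }
        }

        return result;
    }
}
```

For input shaped like the array in the question, a call such as LenientJson.DeserializeValidItems<User>(json, (token, ex) => Console.WriteLine(ex.Message)) should return only the entries whose Name is a string and report the damaged one through the callback.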
1

In a similar situation, I used two different JsonSerializerSettings configurations together with a try/catch block.

For example:

JsonSerializerSettings jsonSetting = new JsonSerializerSettings { MissingMemberHandling = MissingMemberHandling.Ignore };

The setting above can be used in the catch section, after the first attempt to parse the JSON has failed.

The try block can use the stricter error behavior instead:

jsonSetting = new JsonSerializerSettings { MissingMemberHandling = MissingMemberHandling.Error };

This makes the first attempt strict; when it throws, execution drops into the catch block, which ignores the missing fields.

However, that depends on whether you're OK with skipping data, or if you want to parse everything. It really is a case-by-case basis dependent on your actual JSON dataset.

Declare jsonSetting above the try/catch blocks and pass it along as needed, depending on your specific dataset.
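
Put together, the strict-then-lenient approach described above might look like the following sketch. The names FallbackJson and DeserializeWithFallback are hypothetical, and the User class mirrors the one in the question. Note that MissingMemberHandling only concerns JSON members that have no matching property, so this fallback alone would not recover from a type mismatch such as an array where a string is expected.

```csharp
using Newtonsoft.Json;

public class User
{
    public string Name { get; set; }
}

public static class FallbackJson
{
    // Hypothetical helper: try a strict parse first, then retry leniently.
    public static T DeserializeWithFallback<T>(string json)
    {
        // Strict: a JSON member without a matching property is an error.
        var strict = new JsonSerializerSettings
        {
            MissingMemberHandling = MissingMemberHandling.Error
        };

        // Lenient: unknown JSON members are silently ignored.
        var lenient = new JsonSerializerSettings
        {
            MissingMemberHandling = MissingMemberHandling.Ignore
        };

        try
        {
            return JsonConvert.DeserializeObject<T>(json, strict);
        }
        catch (JsonSerializationException)
        {
            // The strict pass failed, so fall back to the lenient settings.
            // Errors unrelated to missing members will be thrown again here.
            return JsonConvert.DeserializeObject<T>(json, lenient);
        }
    }
}
```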

EDIT: Just for emphasis, I do want to point out that the sample you provided isn't a perfect route to approach using this method, but it did allow me to skip arrays that were null, or had invalid data in my case. It really depends on your dataset, but this may be a useful route to at least pursue or consider.

gravity
  • I'm not sure I got this correctly. Do you mean I should first tokenize the array and parse each array entry separately? Adding the `MissingMemberHandling` in this case didn't help. – t3chb0t May 10 '19 at 16:48
  • Yeah, that speaks to why I made the edit to the answer to some degree. Your sample expects a string, but the parser is trying to shove in an array instead. Consider `try`/`catch` specifically for your sample, and you'd want a `try`/`catch` but for the error handling instead. Is your sample data a **perfect representation** (array instead of expected string) of the entirety of all of the dataset issues? – gravity May 10 '19 at 16:52
  • Oh, you mean I should put the `try/catch` inside the `Error` handler? This is the most common error that occurs in 8 of 10 times where I use an array instead of a value or an object but since these files are maintained by hand, any mistake is possible so I just wanted to ignore them, I'm OK with skipping anything that is invalid. The most important thing is that a single mistake shouldn't break everything. – t3chb0t May 10 '19 at 16:56
  • Ouch. This is what I feared. Technically speaking this is a *"fix your dataset,"* scenario. `Name` would always need to be an array of strings first, even if there was just a single value. That would be the absolute correct way to do it. Otherwise, take a look [at the third answer](https://stackoverflow.com/a/39600660/2486496) in the previously mentioned possible duplicate. It's a custom deserializer that nearly fits what you're looking for. It's just not a good idea to work around poorly formatted data. – gravity May 10 '19 at 16:58
  • Mhmm... I'll try it out... maybe after all it is a duplicate! ;-] You're absolutely right, the data should be well formatted but sometimes this kind of a _workaround_ can save you from a lot of trouble when only a small piece doesn't work and not the entire application. I would of course log it and raise an alert but at least it would still be doing something ;-) – t3chb0t May 10 '19 at 17:03
1

I know this question is a bit older, but just for reference:

You can use JsonSerializerSettings to handle parse errors. This allows you to log any error found and, in this case, mark such errors as handled, so the parsing continues:

new JsonSerializerSettings
{
    Error = delegate(object sender, ErrorEventArgs args)
    {
        args.ErrorContext.Handled = true; 
    }
}
igece