
I'm coding in C# for the .NET Framework 3.5.

I am trying to parse some Json to a JObject.

The Json is as follows:

{
    "TBox": {
        "Name": "SmallBox",
        "Length": 1,
        "Width": 1,
        "Height": 2 },
    "TBox": {
        "Name": "MedBox",
        "Length": 5,
        "Width": 10,
        "Height": 10 },
    "TBox": {
        "Name": "LargeBox",
        "Length": 20,
        "Width": 20,
        "Height": 10 }
}

When I try to parse this Json to a JObject, the JObject only knows about LargeBox. The information for SmallBox and MedBox is lost. Obviously this is because it is interpreting "TBox" as a property, and that property is being overwritten.
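
To illustrate, here's a minimal repro of what I'm seeing (assuming the Json above is held in a string called json):

using System;
using Newtonsoft.Json.Linq;

// json holds the Json shown above
JObject parsed = JObject.Parse(json);

// Only the last duplicate survives; this prints "LargeBox"
Console.WriteLine(parsed["TBox"]["Name"]);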

I am receiving this Json from a service that's coded in Delphi. I'm trying to create a C# proxy for that service. On the Delphi-side of things, the "TBox" is understood as the type of the object being returned. The inner properties ("Name", "Length", "Width", "Height") are then understood as regular properties.

I can serialize and deserialize a custom 'TBox' object that has Name, Length, Width, and Height properties. That's fine.
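
For reference, the TBox class itself is nothing fancy, something along these lines:

using Newtonsoft.Json;

public class TBox
{
    // Numeric types here are illustrative
    public string Name { get; set; }
    public int Length { get; set; }
    public int Width { get; set; }
    public int Height { get; set; }
}

Round-tripping a single box with JsonConvert.SerializeObject / JsonConvert.DeserializeObject<TBox> works as expected; it's only the duplicated "TBox" names at the top level that cause trouble.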

What I want to do is step through all the TBox sections in such a way as to extract the following three Json strings.

First:

{
    "Name": "SmallBox",
    "Length": 1,
    "Width": 1,
    "Height": 2 }

Second:

{
    "Name": "MedBox"
    "Length": 5,
    "Width": 10,
    "Height": 10 }

Third:

{
    "Name": "LargeBox"
    "Length": 20,
    "Width": 20,
    "Height": 10 }

Once I have these strings, I can serialize and deserialize to my heart's content.

I'm finding Newtonsoft.Json to be very good. I really don't want to go messing about with other frameworks if I can avoid it.

Any help would be greatly appreciated.

I have very limited input as to changes that can be made to the server.

Ubiquitous Che

2 Answers

using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// json is the raw text received from the service
List<JObject> boxes = new List<JObject>();
JsonTextReader jsonReader = new JsonTextReader(new StringReader(json));
jsonReader.Read(); // consume the outer StartObject so the whole document isn't loaded at once
while (jsonReader.Read())
{
    if (jsonReader.TokenType == JsonToken.StartObject)
    {
        // Load each nested "TBox" value individually; duplicate names are preserved
        boxes.Add(JObject.Load(jsonReader));
    }
}
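
If you then need each box back as a plain string, or as a typed object, something along these lines should work (assuming a TBox class with the matching properties):

foreach (JObject tbox in boxes)
{
    string fragment = tbox.ToString();                         // the inner JSON text, e.g. the "SmallBox" object
    TBox box = JsonConvert.DeserializeObject<TBox>(fragment);  // TBox is your own class with matching properties
}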

However, note that the RFC says, "The names within an object SHOULD be unique", so, if you can, I'd recommend getting the format changed.

EDIT: Here's an alternate design that doesn't have duplicate keys:

[
    {
        "TBox": {
            "Width": 1,
            "Length": 1,
            "Name": "SmallBox",
            "Height": 2
        }
    },
    {
        "TBox": {
            "Width": 10,
            "Length": 5,
            "Name": "MedBox",
            "Height": 10
        }
    },
    {
        "TBox": {
            "Width": 20,
            "Length": 20,
            "Name": "LargeBox",
            "Height": 10
        }
    }
]
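
That shape parses cleanly with JArray; a rough sketch (alternateJson here stands in for the text above, and TBox is assumed to have the matching properties):

using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

JArray wrappers = JArray.Parse(alternateJson);   // alternateJson holds the array shown above
foreach (JObject wrapper in wrappers)
{
    // Each array element has a single "TBox" property holding the box itself
    JObject inner = (JObject)wrapper["TBox"];
    TBox box = JsonConvert.DeserializeObject<TBox>(inner.ToString());
}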
Matthew Flaschen
  • Perfect! That was fast. Thanks. – Ubiquitous Che Oct 06 '10 at 23:24
  • Sent through an RFC link and had a chat with one of the senior developers. The word SHOULD has a particular meaning in RFC 2119, which basically means that the policy SHOULD be followed unless there's a strong enough reason to break it. In this case there is - the implementation on the server involves sending around lists of generic types. The specific type to be serialized/deserialized is used as the 'name' of the top-level of the Json. It's annoying for me to code against, but they're still within the RFC. – Ubiquitous Che Oct 07 '10 at 00:01
  • @Ubiquitous, I never said it violated the RFC. But it is unintuitive, and I don't necessarily agree this is a reason to break it. There are other designs that provide the type information without duplicate keys. – Matthew Flaschen Oct 07 '10 at 00:12
  • In this case I would interpret the SHOULD as a direction to JSON-parser-developers that they shouldn't throw an error when they encounter duplicate names. However, since the JS in JSON stands for "JavaScript" and JavaScript data structures CANNOT have duplicate names, it seems clear to me that your Delphi guys are violating the spirit, if not the letter, of the spec. The Newtonsoft behavior is exactly correct, because it's the same thing a JavaScript parser would do. – Joel Mueller Oct 07 '10 at 00:18
  • Ha! I'll pass it on. My argument was that the way they've done it won't work for most parsers in the wild. Their response was that was the fault of the parsers for not honoring the meaning of SHOULD correctly. ^_^ Thanks for the support and follow up, it's much appreciated. – Ubiquitous Che Oct 07 '10 at 00:33
  • As the implementer of the service in question I will say only this: The JSON *specification* allows duplicate names. If you do not allow duplicate names in your structures then what you have is not, according to its own terms, JSON. I would also say that the service under development is embryonic. The duplicated values are currently not even used, but it is envisaged they may be. An alternate approach may be devised, but any alteration will be based on making the right decision for the server implementation, not simply to make lives easier for people wishing to use non-compliant parsers. – Deltics Oct 07 '10 at 00:58
  • @Matthew: you will have noticed that your alternate design introduces additional structure with NIL additional semantic contribution. It was this sort of increase in noise:signal ratio that JSON set out to alleviate us from with the likes of XML! The current, compliant, design is - in those terms - far more in keeping with the spirit of JSON, in addition to being entirely compliant with the letter of the specification! ;) – Deltics Oct 07 '10 at 01:01
  • An object in JSON is a name/value collection and this JSON is misusing the name. Json.NET's JObject in this example is doing exactly what a browser would do when provided with duplicate properties: uses the last value. The best solution is to change the JSON to either have a wrapper object with a property for the type name and a property for the value or add the type as a special property on the value object. – James Newton-King Oct 07 '10 at 01:05
  • The service is not for consumption by a web browser. I see little (no) point in writing it in a way for which it is not intended to be used when doing so is less efficient (not just in representation but in processing effort for the serialisation/deserialisation). Cup holders have no place in formula 1 cars. ;) If it violated the *specification* then you may have a point, but it doesn't. And that is the bottom line imho. I just don't see how "being right" can be "wrong". – Deltics Oct 07 '10 at 01:09
  • I used the web browser as an example because that is the most popular use of JSON and it defines how people expect a JSON object to work: a name/value collection. You aren't breaking the specification but you are breaking user expectations. – James Newton-King Oct 07 '10 at 02:07
  • Thank you for telling us what our users will expect. It is fascinating to me that you know what they expect when you don't know anything about the application (beyond choosing to ignore what I have already told you). Just in case the ACTUAL use is of ANY relevance (it may not be to you, but as the person implementing the thing, it sure is to me), I shall point out that JSON is being used in this case as a lightweight mechanism for passing data between two applications, NEITHER of which will be a web browser and NEITHER of which will be pumping the data through a JavaScript parser or engine. – Deltics Oct 07 '10 at 03:22
  • Calm down - I'm giving my opinion to try to help, not to criticize you. Anyways, a user in this case is the developer. Developers are used to working with JSON objects that are name/value collections. The JSON home page - http://json.org/ - defines them as just that: a name/value collection. Again, you aren't breaking the spec but you are breaking user (developer) expectations. If you have a good reason to structure the JSON like you have then that is fine, just be aware that by doing something non-standard the consequence could be more questions for help like this one by the consumers of the JSON. – James Newton-King Oct 07 '10 at 10:04
  • Deltics - If you can find even one JSON parser not written at your company that supports duplicate names in a JSON name/value collection, I'll eat my hat. And if you tell me that means all JSON parsers don't comply with the spec, you're going to be laughed at... – Joel Mueller Oct 07 '10 at 16:03
  • @Joel - ANY JSON parser that doesn't throw an error when encountering duplicate names by definition SUPPORTS duplicate names. The Newtonsoft parser being used by my colleague is just such another example. Not only does it support duplicate names but it also allows code to WORK with JSON representations CONTAINING duplicate names. Do you want sauce with your hat? ROFL – Deltics Oct 07 '10 at 19:35
  • Deltics, James Newton-King's Newtonsoft parser does the same thing that JavaScript itself does - if it encounters duplicate keys, each successive duplicate key overwrites the previous value for that key. In the end, you have one key with one value. In this, it follows the spec, which does not require that multiple values be retained when duplicate keys are encountered. Matthew's workaround, above, involves working with the JSON token stream directly, specifically because of this issue. I'm afraid you're wrong on all counts. Nice try, though. – Joel Mueller Oct 07 '10 at 19:56
  • @Deltics "Any JSON parser that doesn't throw an error when encountering duplicate names by definition SUPPORTS duplicate names" Oh yeah? Do me a favor. Open up Firebug and run the following statement to parse your JSON and tell us all what happens. console.log(eval({ "TBox": { "Name": "SmallBox", "Length": 1, "Width": 1, "Height": 2 }, "TBox": { "Name": "MedBox", "Length": 5, "Width": 10, "Height": 10 }, "TBox": { "Name": "LargeBox", "Length": 20, "Width": 20, "Height": 10 } } )) – Shawn Grigson Oct 07 '10 at 20:25
  • As I thought. You guys are confusing "JSON Parser" with "JavaScript engine". That's like saying "XML Parser" when you mean ".NET Framework" in connection with an .appdata configuration file. Yes, it is XML, but it is what you are DOING WITH IT that determines whether a particular USE of it is correct. Considered separately from what TYPE of document it is, whether XML is valid or not is defined by the XML specification. The same applies here. The JSON being produced is valid. It's not compatible with JavaScript EXECUTION, but that's OK because that's not how it is being used. – Deltics Oct 08 '10 at 04:01
  • @Shawn: That isn't exercising a JSON parser, it's executing JavaScript code. If you don't understand the difference then it's pointless discussing what you see with you and trying to explain why it has not the slightest relevance to the issue at hand. – Deltics Oct 08 '10 at 04:05
  • @Joel: It does that if you use it for deserialising objects using the auto-deserialisation framework it provides, but if you treat it simply as a JSON PARSER it IS possible to access values with the same names as discrete values. I know you can, because I've seen it done (and I didn't even have to write the code myself). – Deltics Oct 08 '10 at 04:10
  • @Deltics: You have a talent for being aggravatingly obtuse. "That isn't a JSON parser, it's executing JavaScript code." JSON is JavaScript Object Notation! It's impossible to claim that JSON has nothing to do with JavaScript! And how do you 'parse' a string of JSON with JavaScript? With the eval() statement. There are Javascript frameworks, like ExtJS that do something like Ext.JSON.decode() but mostly what they do is call eval() under the covers. This is how JavaScript "parses" JSON. console.log() just outputs it so you can view the results. – Shawn Grigson Nov 04 '10 at 03:06
  • @Deltics: My point (which you are trying to ignore) is that JavaScript itself doesn't support multiple keys in JSON. You also claimed that if it doesn't throw an error that by nature it supports multiple keys, yet JavaScript itself doesn't support multiple keys, yet it doesn't throw an error, either. Claiming that JavaScript isn't really "parsing" the JSON is just semantic trickery. You have to execute code to parse something, and if you pass the JSON structure from a file to JavaScript, you'll need to "parse" it somehow--and when you do, you'll find it doesn't support duplicate keys. Kay? – Shawn Grigson Nov 04 '10 at 03:16
  • @Shawn - you have a talent for not reading the plain english put in front of you. JSON is being used here as a DATA FORMAT, not as executable source code. The data in the JSON is designed and intended to be output by Delphi code and read/parsed by a Delphi JSON parser and/or a C# parser. JavaScript itself is not involved anywhere anyhow. When used as a data interchange format not involving JavaScript, the requirements of JavaScript are utterly and completely IRRELEVANT, which I imagine is why the SPECIFICATION does NOT make it mandatory that JSON data conform entirely to JavaScript syntax. – Deltics Nov 05 '10 at 03:03
  • @Deltics - *JavaScript* Object Notation=JSON - If, used as a data format, it cannot be decoded properly into native Javascript objects, then it is not JSON. Period. Has nothing to do with whether or not JavaScript will ever touch it. I never claimed that your JSON needed to be executable source code. What I'm trying to point out is that your JSON is invalid in a JavaScript context, which means the first two letters of the acronym are broken in your implementation. Maybe you're using "ON" as a data format, or "DON" (Delphi Object Notation). You're not, however, using "JSON". – Shawn Grigson Nov 11 '10 at 19:32
  • @Deltics - Moreover, you said "ANY JSON parser that doesn't throw an error when encountering duplicate names by definition SUPPORTS duplicate names." JavaScript doesn't throw an error when it *parses* (ie., not executes, just parses) JSON with duplicate names. But neither does it support duplicate names. So I'm calling BS on this statement. I'm sure you'll say that because I used 'eval()' in my example that JavaScript is 'executing' the JSON, which is semantics only. json.org explains that you use eval() to convert a JSON text into an object. http://www.json.org/js.html (continued) – Shawn Grigson Nov 11 '10 at 19:39
  • @Deltics - Take special note: "Since JSON is a proper subset of JavaScript, the compiler will correctly parse the text and produce an object structure." Get the part about where it says "the compiler will *correctly* parse the text"? Guess what, it does. It's your parser that doesn't do it *correctly*. Yet you claim that anything that doesn't throw an error when encountering duplicate names supports duplicate names, and this is just not the case. JSON is a proper subset of JavaScript. JavaScript doesn't support duplicate names. You're doing it wrong. – Shawn Grigson Nov 11 '10 at 19:42
  • @Deltics - When writing a parser in .NET or any other language for JSON, you're often going to pass data to/from JavaScript. You'll output data designed for consumption by JavaScript, or you'll get JavaScript to serialize the data so you can read it. You'll want your parser to *exactly* render things in your native language that matches how they will be handled by JavaScript. Which it does in this case. What are you suggesting? That the NewtonSoft parser should behave according to your interpretation of the spec, rather than how it's handled by JavaScript? – Shawn Grigson Nov 11 '10 at 19:46
  • I can't believe this conversation is still going on, but what I think Shawn is trying to say is this: If the spec for JavaScript Object Notation is ambiguous in any way, you resolve the ambiguity by looking at what JavaScript parsers do. Not by being a self-important pedant. – Joel Mueller Nov 11 '10 at 19:51
  • @Joel: The specification for JSON is entirely UNambiguous. It allows for duplicate names. It does NOT require uniqueness and does NOT refer to them as key/value pairs, but NAME value pairs. Shawn is flat out wrong. Other parsers, including .NET ones and including the Newtonsoft one that he references, DO support the retrieval of multiple objects with duplicate names. Our parser is not unique in this respect and is entirely compliant with the specification. Shawn presumably also thinks that MP3 should not contain audio since the "MP" stands for MOVING PICTURE, despite what the spec says. – Deltics Nov 12 '10 at 03:15
  • @Shawn and @Joel: The final word from me: The JSON specification describes the valid content and structure of a JSON text data structure. It specifically PERMITS duplicated value names in a given set of name/value pairs. The JSON specification makes no comment and demands no specific behaviour from a JSON *parser*, though it is reasonable to conclude that since duplicate names are permitted, a parser should accommodate this. .. continued.. – Deltics Nov 12 '10 at 03:24
  • ... At least one alternate parser quoted back to me as examples of how a parser *should* behave, ALSO supports duplicate names, exactly as we do. What is clear therefore is that there is confusion and misunderstanding about what is and is not "correct" in this area (and indeed, about what capabilities other implementations support and offer), but that confusion and misunderstanding is equally clearly NOT on *my* part. – Deltics Nov 12 '10 at 03:26
  • @Deltics - The spec says that names SHOULD be unique. Yes, I am aware of the difference between SHOULD and MUST. The spec does not say what parsers should do with duplicate keys - therein lies the ambiguity I was referring to. What does the canonical JSON parser, JavaScript, do with duplicates? It discards them. So does Newtonsoft, unless you dip down to the level of the token stream, which nobody should be required to do just to read your values. In any case, I'm really pleased that I'll never have to work with you. The OP has my sympathies. – Joel Mueller Nov 15 '10 at 17:38
  • @Deltics - Repetition might work, I guess. "Since JSON is a proper subset of JavaScript, the compiler will *correctly* parse the text and produce an object structure." Your stuff isn't a proper subset of JavaScript, therefore it's not JSON. The JavaScript compiler *correctly* parses the text, yet you claimed that anything that doesn't error out supports duplicate keys, yet JavaScript doesn't do either of these. I can tell that getting you to admit you're wrong is impossible, but hopefully your frequent dodges are obvious enough to anyone coming here wanting to know how to do it correctly. – Shawn Grigson Nov 15 '10 at 17:53
  • @Joel - the OP sits right behind me, and with 10 minutes work managed to get at the values using Newtonsoft without, afaik, having to hack around with the token stream. JavaScript is NOT A JSON Parser. It is able to *execute* JSON as a side effect of the fact that JSON is designed to be executable as JavaScript. JSON allows duplicate names, so my component does too. It doesn't enforce a rule that does not exist, but neither will it FORCE anyone to BREAK any rules. – Deltics Nov 16 '10 at 20:30
  • My own repetition may work: The SPECIFICATION allows duplicate names - you may not like that, but it's a simple FACT. My *parser* therefore supports and allows duplicate names. Whether any application using my component chooses to USE this aspect of the specification is then determined by the intended and expected use of the output. If it were to be executed and interpreted by JavaScript then you would not duplicate names. But that is NOT the intended and expected use in this case, and so my *application* is free to use ALL features of my 100% compliant parser. END. FINAL. – Deltics Nov 16 '10 at 21:02
  • Wow. I guess you showed me with that "END. FINAL." thing. Good for you. I suppose that json.org saying that Javascript *parses* the JSON text is just a figure of speech. if (language=="JavaScript") parser = false? I suppose that JavaScript not throwing errors when duplicate names are encountered doesn't invalidate your statement about parsers, because JavaScript never parses anything. Good job of redefinition there. I bow before the master. However, if you want to shut up people who disagree with you, you forgot an important phrase: END. FINAL. TIMES INFINITY. *fingers in ears* Lalala.... – Shawn Grigson Nov 18 '10 at 14:35

If I'm not mistaken, the correct answer to this is that your input is not actually JSON. So no, getting a JSON parser to parse it probably isn't going to work.

You may not have any control over the source of the input, so I'd use a Regex or something to pre-filter the string. Turn it into something like:

{"TBoxes":
    [
        {
            "Name": "SmallBox",
            "Length": 1,
            "Width": 1,
            "Height": 2 
        },
        {
            "Name": "MedBox",
            "Length": 5,
            "Width": 10,
            "Height": 10 
        },
        {
            "Name": "LargeBox",
            "Length": 20,
            "Width": 20,
            "Height": 10 
        }
    ]
}

And treat it like the array that it is.
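
Deserializing that is then straightforward; a sketch, assuming a simple wrapper class (the names here are placeholders):

using System.Collections.Generic;
using Newtonsoft.Json;

// Hypothetical wrapper matching the rewritten JSON above
public class TBoxList
{
    public List<TBox> TBoxes { get; set; }
}

// ...
TBoxList result = JsonConvert.DeserializeObject<TBoxList>(rewrittenJson);
foreach (TBox box in result.TBoxes)
{
    // work with each box
}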

Mike Ruhlin
  • I had a go at Regex, and it was proving difficult. In some of the real-world scenarios I'm going to have to handle objects within objects within objects. I tried it out just to be sure, and handling the nested curly-braces and the possible character sets quickly became a pain in the ass. I'm normally a huge fan of regex, but in this case I was hoping for something easier. – Ubiquitous Che Oct 06 '10 at 23:37
  • Turns out that they are *technically* still sending valid Json. See my second comment to Matthew above. It's annoying on my end, but I can handle it now. – Ubiquitous Che Oct 07 '10 at 00:02
  • No, it's really not correct JSON. You're misinterpreting the spec, which is "based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition - December 1999." There are no JavaScript parsers that will retain any but the very last value associated with multiple duplicate keys. Go ahead and tell Douglas Crockford, the inventor of JSON, that he's wrong. – Joel Mueller Oct 07 '10 at 16:14
  • @Joel: Wrong. There are at least TWO. The one I wrote and the one provided by Newtonsoft. Just because many parsers make the same invalid assumptions you do does not mean they are wrong. They correctly implement the spec. You don't have to like it, but persisting in the view that something that complies with the *specification* is wrong is just plain stupid. What you get when you "execute" the JS in a JSON object is not relevant to the question of what is a correct JSON structure, especially if the JSON structure is not intended to be executed as JS and is merely a data transport. End. – Deltics Oct 07 '10 at 19:39
  • I am correctly interpreting the spec - you are the one incorrectly interpreting it. The spec states that names SHOULD be unique. If it intended that names HAD TO BE unique then it would say that they MUST be unique. The spec clearly and deliberately uses the term SHOULD, not MUST, and refers to RFC 2119 which defines those terms. An interpretation which reads SHOULD as MUST is an incorrect interpretation. Fact. – Deltics Oct 07 '10 at 19:42
  • @Deltics - a parser that doesn't throw an error when it encounters duplicate keys, and also ends up producing an object that contains only one value per key, is following the spec. I am speaking, in this case, of all JSON parsers save yours, including Newtonsoft. If you want to keep using your own unique definition of the term "key-value pairs" by all means go ahead. But don't pretend that everyone else should rewrite their parsers to cater to you. But don't take my word for it. Here's a Delphi JSON parser. See what it does. http://goo.gl/UoKQ – Joel Mueller Oct 07 '10 at 20:03
  • I'm not pretending anything. I'm following the spec, not some imagined document that I think is a spec. shrug. – Deltics Oct 08 '10 at 03:59
  • I am not using my own definition of "name-value" pairs (NOTE: **NAME**/value, *NOT* **key**/value). The word "key" occurs only twice, both in connection with defining the difference between MUST and SHOULD. You have read the spec, haven't you? Even if you have, clearly you are using your own very unique definition of the word "specification", where what the specification says isn't what the specification means. – Deltics Oct 08 '10 at 04:15