20

When I first saw XML, I thought it was basically a representation of trees. Then I thought: the important thing isn't that it's a particularly good representation of trees, but that it is one that everyone agrees on. Just like ASCII. And once established, it's hard to displace due to network effects. The new alternative would have to be much better (maybe 10 times better) to displace it. Of course, ASCII has been (mostly) replaced by Unicode, for internationalization.

According to Google Trends, XML has a 43x lead but is declining, while JSON grows.

[edited] How and why will JSON replace XML as a data format?

  1. for which tasks?
  2. for which programmers/industries?

NOTES: S-expressions (from Lisp) are another representation of trees, but one that has not gained mainstream adoption. There are many, many other proposals, such as YAML and Protocol Buffers (a binary format).

I can see JSON dominating the space of communicating with client-side AJAX (AJAJ?), and this possibly could back-spread into other systems transitively.

XML, being based on SGML, is better than JSON as a document format. I'm interested in XML as a data format.

XML has an established ecosystem that JSON lacks, especially ways of defining formats (XML Schema) and transforming them (XSLT). XML also has many other standards, esp for web services - but their weight and complexity can arguably count against XML, and make people want a fresh start (similar to "web services" beginning as a fresh start over CORBA).

[edited Mar2010] Like NoSQL, JSON is schemaless.

13ren
  • I think google trends is on leave. All trend links are returning error (as of 12:37am est) – Learning Mar 26 '09 at 04:37
  • +1 for pointing out the difference between document and data formats. – Javier Mar 26 '09 at 05:28
  • @Learning: I think I messed up the URL somehow. I've edited, and it seems to work now. – 13ren Mar 26 '09 at 10:44
  • Looks like the ratio of Google mentions is down a lot, now more like 6:1... kind of high still, but going down. – StaxMan Feb 03 '11 at 23:49
  • thanks for checking up! According to figures in the top left of google trends, it's 24:1. BTW: I checked out JSON schema recently, and it seems to be duplicating the XML Schema/XSLT approach, by doing it in JSON itself (that lispy/self-hosting idea). I think adding meta-layers like this makes it harder to understand; and it's better to use a distinct syntax for grammar (though DTD did this, I don't like their choice of alternative syntax). – 13ren Feb 04 '11 at 12:42
  • I cannot answer this question. But, I am also hoping that someone will mention some COMPELLING reasons to switch from XML to JSON. If there are no significant advantages in terms of ease of development, smaller size etc, then there is no reason to switch. Only legacy code might continue to use it. New systems will probably use JSON without even considering XML. – Apple Grinder Dec 09 '12 at 12:23
  • @AppleGrinder JSON is simpler and more similar to programming languages in both semantics and syntax; but it lacks XML's ecosystem of XSD (static type validation), XSLT (transformation), namespacing etc. Proposals to add them to JSON are rejected, because it would then become as complicated as XML. This creates clear domains for each of them (tool for the job), with JSON favoured for communication to JS when validation/transformation tools aren't needed. Not compelling though. Aside: JS promotes JSON; but now native mobile apps are a driving force (Objective C, Java). – 13ren Dec 09 '12 at 16:13

4 Answers

15

Short answer: yes and no (EDITED as per comments below)

There are fundamental differences and trade-offs. XML is a markup language, particularly suitable for textual documents (XHTML, DocBook, various kinds of office docs), and good enough for many other tasks. Its problems mostly arise from having a hierarchic model (instead of a relational one as in SQL, or an object graph as in OO languages).

JSON is an object notation, meaning it is a somewhat more natural fit for data-oriented use cases: cases where XML sort of works, but where there is more cost in overcoming the impedance between the object and hierarchic models. JSON is not a perfect fit -- it's still data, not objects (no identity, can't do full graphs) -- but it is more natural than XML. As such, it is easier to build tools that do decent and simple data binding.

So: there's plenty of room for both, and I would expect both to be used for a long time to come. Not always in an optimal way, but both can handle plenty of use cases well enough.

For what it is worth, since writing my original answer, I have seen JSON absolutely annihilate XML for data-oriented/data-interchange use cases for companies I have worked for. SOAP (etc) will start significantly shrinking, and "plain old JSON" data interchange (esp. with RESTish frameworks, JAX-RS for Java for example) will take over.

And yet XML is much better for textual markup.

StaxMan
  • Thanks for your answer. Please see buried in the question I say "I'm interested in XML as a data format." I really mean to be asking will JSON replace XML as a data format? You seem to be saying "yes". (title edited) – 13ren Apr 01 '09 at 04:28
  • Sorry I missed that. Yes, I do think JSON is a bit better, and over time it will handle more of those use cases than XML. I don't think it will replace XML, however, for a variety of reasons. But I think it will become more important. – StaxMan Apr 01 '09 at 17:35
  • JSON can be used for markup: http://www.ibm.com/developerworks/library/x-jsonml/ check it out. In my opinion it looks horrible! – Tjaart Jul 27 '12 at 07:43
  • Right, just as xml can be used for data interchange... right tool for the job etc; not surprised at all that it looks horrible. – StaxMan Jul 27 '12 at 17:36
11

My bold thesis is that such replacement is impossible after all, since these data-formats (JSON and XML) are different.

Short version: XML is not equivalent to JSON (or similar formats), since XML nodes (tags) support attribute notation and namespacing. This turns out to be crucial.

So, the best way to answer this question is actually to show how these formats differ, i.e. to complete the comparison. Forgive me for stating the obvious, but I hope this will be interesting or even useful. It will help if we first agree on some simple terminology:

  1. Data-format is actually a formal language, which governs how data can be recognized (in its representation, i.e. how to "read/write" it from memory according to the way it is stored there).
  2. Data-structure is an abstract way of modeling (describing) how this data is organized or linked.

So the two concepts address different aspects of data maintenance (e.g. IO). For example, an indexed array of a particular data type is a (homogeneous) structure, and it can be accessed (read/written) as a serial sequence (a contiguous format).

Wikipedia has a great article about JSON that lists a lot of alternatives, such as the already mentioned Lisp S-expressions, Python nested structures, PHP arrays, YAML, etc. (note that we are not considering dictionaries like .ini files, since they lack multiple levels of nesting). All these formats can be seen as representations of a certain data structure: a tree. We can say that they are isomorphic in that sense. Each representation can be mapped to a tree in such a manner that no extra processing is needed (e.g. the grammar of the formal language is not changed), and a reverse mapping also exists.
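To make the "isomorphic" claim concrete, here is a minimal sketch using Python's standard json module. The sample data is made up for illustration; the point is only that JSON text maps to nested dicts/lists and back without any extra convention:

```python
import json

# A small tree written as JSON text (hypothetical sample data).
text = '{"config": {"name": "demo", "tags": ["a", "b"], "depth": 2}}'

# JSON text -> Python nested structure (dicts and lists).
tree = json.loads(text)

# Nested structure -> JSON text -> nested structure again.
round_tripped = json.loads(json.dumps(tree))

# The structure survives unchanged; no escaping or prefixes were needed.
same = (round_tripped == tree)
```

Here `same` ends up True, which is exactly the mapping-plus-reverse-mapping property described above.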

Well you may say that's "some" theory but what does it mean for practice? Implications are that if we compare XML and JSON by:

  • design purpose and motivation
  • application domain - the set of tasks a format is used to solve
  • syntactical complexity (well, simplicity - to what extent the format is readable/writable/human friendly/etc)
  • maturity (like how many versions the format has gone through)
  • and so on

we will discover further practical differences. The major one is that XML is a MARKUP language (as has been mentioned). To do folding, it is able to mix namespaces and attributes, which results in a higher order of "parallel" nesting.

For the past two years I was busy transforming XML representations into Python nested structures and back. My one bitter conclusion is that they are very poorly compatible. To represent attributes and namespacing, one has to escape this information (e.g. with prefixes) in the tree representation. So once again, XML is definitely not a tree ;-) Without any need to encode, encapsulate or escape, it allows representation of much more sophisticated structures than trees thanks to its "markup" capabilities, i.e. typed trees: trees with specialized types of vertices (again via namespaces and attributes).
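A rough sketch of the kind of escaping I mean, using the standard xml.etree.ElementTree module. The "@", "#tag", "#text" and "#children" keys and the element_to_dict helper are an ad hoc convention invented for this example, not a standard, and the namespace URI is a placeholder:

```python
import xml.etree.ElementTree as ET

def element_to_dict(elem):
    """Map an element to a dict, escaping attributes with an '@' prefix.

    The key names used here are an illustrative convention only.
    """
    node = {"#tag": elem.tag}  # ElementTree expands prefixes to '{uri}local'
    for name, value in elem.attrib.items():
        node["@" + name] = value
    text = (elem.text or "").strip()
    if text:
        node["#text"] = text
    children = [element_to_dict(child) for child in elem]
    if children:
        node["#children"] = children
    return node

xml_text = '<p:item xmlns:p="http://example.com/ns" id="42">hello</p:item>'
result = element_to_dict(ET.fromstring(xml_text))
```

The plain dict now has to carry the namespace inside the tag string and the attribute under a prefixed key. Any consumer must know the convention to get the XML back, which is where the poor compatibility comes from.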

There are other difficulties and dangers like parsing and mapping

<body>The <strong>marked up</strong> text</body>

into a tree without some pre-decided convention (How to break "The .. text"?) or preserving order followed in XML.
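As one example of such a pre-decided convention, Python's xml.etree.ElementTree splits mixed content into `text` (before the first child) and `tail` (after a child's closing tag). A small sketch; the `pieces` list is just one possible ordered encoding, chosen for illustration:

```python
import xml.etree.ElementTree as ET

body = ET.fromstring("<body>The <strong>marked up</strong> text</body>")
strong = body[0]

leading = body.text      # text before the first child element
inner = strong.text      # text inside <strong>
trailing = strong.tail   # text after </strong>, stored on the child itself

# One possible ordered encoding as a nested structure: order must be kept
# in a list, because a plain dict of children would lose it.
pieces = [leading, {"strong": inner}, trailing]

# Flattening back to plain text drops the markup entirely.
plain = "".join(body.itertext())
```

Other libraries make different choices here, which is the danger: the same document yields different "trees" depending on the convention.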

Obviously, things which are not equivalent will naturally have trouble substituting for each other. In that sense XML is more complex than nested structures.

The part of the question regarding industries seems pretty well answered by the prognosis that XML will remain a server-side and document-oriented technology, mainly because of its superior data-typing abilities. A lot of research has also been motivated by XML purely as a markup language.

Excuse me for being far off the topic further, discussing the popularity of JSON but it seems partially relevant ;)

I want to emphasize that JSON (being an object notation) completely fails, by design (it is JavaScript), to capture any custom typing information (it enumerates the type without providing a "runtime" reference or a context), and hence fails to pass highly coupled objectified data. Type information will always be abstracted to JSON's native types. This limits the options for type-oriented development (type checking, constraining, casting, delegation, etc.). But IMHO this crucial problem is shared by most of the modern programming languages I know, which lack sophisticated nested custom data typing of the kind XML has (objects or functions are not documents). It seems that XML itself does this only by accident and not by design.
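A small sketch of that abstraction to native types, using Python's json module. The Point type and the sample values are made up for the example:

```python
import json
from collections import namedtuple
from datetime import date

Point = namedtuple("Point", ["x", "y"])  # hypothetical custom type

original = {"origin": Point(0, 0), "pair": (1, 2)}
restored = json.loads(json.dumps(original))

# Both the named tuple and the plain tuple come back as bare lists.
origin_type = type(restored["origin"])
pair_type = type(restored["pair"])

# A date is not serializable at all unless we flatten it to a string,
# and nothing in the JSON says that the string used to be a date.
dated = json.loads(json.dumps({"when": date(2009, 3, 26)}, default=str))
when_type = type(dated["when"])
```

After the round trip, `origin_type` and `pair_type` are both `list` and `when_type` is `str`: the values survive, the custom types do not.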

As a result, when working with JSON one applies tactics similar to those used for processing "duck"-typed data in popular dynamic languages. So this is another characteristic of JSON: it allows fast coding, but risks getting bulky when the data grows too big (nested and complex).

JSON is more of a swiss-knife than XML since it is simpler.

So, JSON does not help you interoperate with strongly typed languages like Java, but on the other hand it lowers coupling by encouraging abstract decomposition. Since losing type information can sometimes be a good thing (a reduction factor), it allows simpler architectures. ActionScript de facto prefers to communicate in JSON (although they have also proposed their own AMF). Finally, JSON works great with KISS (e.g. RESTful) designs. JSON wins people over with speed and simplicity. But what one usually tends to ignore is the case where KISS is impossible and the domain logic is too complicated: designing DTDs and XSDs, thinking formats through and so on is work that has to be done by someone (often later on, after the cool KISS approach has failed for lack of design competence and experience). The point is that JSON is a great tool which lacks application scale.

Yauhen Yakimovich
  • I suspect that you meant **poorly**, when you wrote _To my only bitter conclusion they are very **purely** compatible._ – R. Schreurs Jan 06 '22 at 11:50
7

I think JSON has already largely replaced XML for client-side communications with a web server, but that will likely be the extent of its dominance. As you stated, XML provides advantages that are appropriate for server-to-server interactions.

Beep beep
  • JSON has already largely replaced XML? Sure? XSLT works with JSON? DOM works with JSON? Do the Java and .NET XML APIs all work with JSON as well as with XML? Did I miss something? ;) – ivan_ivanovich_ivanoff Mar 29 '09 at 12:22
  • Yes, JSON is rapidly replacing XML for RPC-style communication: not just AJAX from the browser, but server to server. Major web 2.0 companies are doing it, and I was personally very surprised to see how fast it has happened at the company I just joined. XSLT is not used (for good reason) for data-oriented use cases; DOM likewise. Data is best consumed as bound objects. Put another way: SOAP-style communication has become a legacy thing, the "CORBA of the 2000s". XML still has its place for markup (DocBook, office etc) of course. – StaxMan Mar 30 '10 at 22:09
  • XML provides advantages that are appropriate for server-to-server interactions - what are these advantages? – Apple Grinder Dec 09 '12 at 12:18
3

Replace XML? Which XML?

There is "XML - the kind of data structuring" and "XML - the textual representation of this structuring".

So, while the textual representation of XML can be replaced by many means (JSON, YAML, ...), it would not replace the structural properties (there's a tree, elements with attributes, sub-elements and text nodes).

There are formats which store and/or process XML-structured data while neglecting the textual form. Examples:

  1. DOM - stores an object tree in memory in a transformation-efficient form.
  2. EXI - future format to store/transmit XML data in binary-optimized form.
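To illustrate the DOM case, here is a minimal sketch with Python's xml.dom.minidom (one DOM implementation among many; the sample document is made up). Once parsed, the data is manipulated purely as a structure of elements, attributes and text nodes, and the textual form only reappears on serialization:

```python
from xml.dom import minidom

# Parse the textual form once; from here on we work with the structure.
doc = minidom.parseString('<note lang="en"><to>Ann</to></note>')
root = doc.documentElement

# Structural access: elements, attributes, text nodes.
root.setAttribute("lang", "fr")
to_text = root.getElementsByTagName("to")[0].firstChild.data

# The textual representation is produced again only when asked for.
serialized = root.toxml()
```

Nothing between `parseString` and `toxml` depends on angle brackets; any other notation that can be read into the same object tree would work equally well.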

So, textual representation of XML can be "replaced" by transforming the standard XML notation to something else and back again. (XML to JSON, and back to XML)

But the structural properties, and all the technologies based on them, cannot be "replaced", because this would just break all the standards. So no one is doing this. There are just alternative textual representations being read into an in-memory DOM or other formats, achieving a higher level of abstraction and thus neglecting the underlying textual form.

ivan_ivanovich_ivanoff