
I have this XML element:

<address>123 main street Chicago IL 60605</address>

Now let's say I want to add a <country> element to it (and I insist on country being an element, not an attribute). What would be the right way to do that?

I can imagine it like this:

<address>
<country>USA</country>
123 main street Chicago IL 60605
</address>

But the actual address above now looks somewhat tagless, although I believe it's still valid XML, right?

I am asking because, while learning XML, I haven't seen any example like the one above, where an element contains a child element and also an inline value.

I know that making country an attribute would make a lot more sense here, but my question is about this format. Is it legal and acceptable, or should the actual address be enclosed in another tag, such as <street_address>, in the latter example?

kjhughes
zar

2 Answers


Can an XML element have both text and element content?

Yes, it's called mixed content.

Now let's say I want to add a <country> element to it (and I insist on country being an element, not an attribute). What would be the right way to do that?

It is up to you, as a designer of an XML format, to decide.

I can imagine it like this:

<address>
<country>USA</country>
123 main street Chicago IL 60605
</address>

But the actual address above now looks somewhat tagless, although I believe it's still valid XML, right?

Terminology clarification: What you imagine is well-formed XML. Whether or not it is valid depends on the schema you design or adopt. For further information on the distinction, see Is there any difference between 'valid xml' and 'well formed xml'?
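To make the distinction concrete, here is a minimal Python sketch, assuming the standard library's xml.etree.ElementTree parser. That parser is non-validating, so it can only tell you whether a document is well-formed; checking validity would additionally require a schema and a validating tool. The helper name is_well_formed is made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML (hypothetical helper).

    This checks well-formedness only; no schema is consulted,
    so it says nothing about validity.
    """
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The mixed-content address from the question.
mixed = "<address><country>USA</country>123 main street Chicago IL 60605</address>"

# The same text with the closing </address> tag missing.
unclosed = "<address><country>USA</country>123 main street Chicago IL 60605"

print(is_well_formed(mixed))
print(is_well_formed(unclosed))
```

The mixed-content version parses without complaint, while the version without its closing tag is rejected by the parser.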

As written above, the address element is said to have mixed content. Such a form is more common for document-oriented designs than data-oriented designs, but the XML is well-formed in either case. If you're designing a schema, you'd probably want to mark up the other components of an address,

<address>
  <street>123 main street</street>
  <city>Chicago</city>
  <state>IL</state>
  <zip>60605</zip>
</address>

rather than using mixed content, which as mentioned is useful for documents:

<p>This is an example of <i>mixed content</i></p>
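One practical reason data-oriented formats tend to avoid mixed content is how APIs expose it. The sketch below assumes Python's xml.etree.ElementTree, where text that follows a child element is stored on that child's tail attribute rather than on the parent. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Mixed content: the street address is loose text after <country>.
mixed = ET.fromstring(
    "<address>\n"
    "<country>USA</country>\n"
    "123 main street Chicago IL 60605\n"
    "</address>"
)
country = mixed.find("country")

# The loose text is reachable only as the "tail" of the preceding child.
loose_text = country.tail.strip()
print(country.text)
print(loose_text)

# Fully marked-up content: every component is addressable by name.
structured = ET.fromstring(
    "<address>"
    "<street>123 main street</street>"
    "<city>Chicago</city>"
    "<state>IL</state>"
    "<zip>60605</zip>"
    "</address>"
)
fields = {child.tag: child.text for child in structured}
print(fields["city"], fields["zip"])
```

With the structured form, each component can be looked up by tag name; with the mixed form, the code has to know that the address happens to sit in the tail of country.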
kjhughes
  • Thanks, that's what I was looking for, especially to know the terminology as well. I used the term inline, but that's probably not the right way to describe it, I guess? – zar Feb 11 '21 at 04:05
  • The term *inline* along with your example were sufficient to convey your question, but right, using standard terminology, once known, is ideal. You're welcome. – kjhughes Feb 11 '21 at 04:11

As a design convention, it's best to use mixed content only where the text still makes sense if you remove the markup:

<p>That is <emph>so</emph> interesting!</p>

or

    <bibref><authors><author format="surname, initials">Kay M. H.</author
></authors>: <title>XSLT Programmer's Reference</title
>. 4th edition, <publisher>Wiley</publisher>, <year>2010</year>.</bibref>

Some XML processing APIs only make sense if you follow this convention, for example XPath: contains(bibref, "Wiley, 2010"). (This query works whether or not the bibref has been "marked up".)
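The reason that query works is that XPath compares against the string value of the element, which is the concatenation of all its descendant text. Here is a rough Python sketch of the same idea, assuming xml.etree.ElementTree. Its XPath subset does not include contains(), so the string value is rebuilt with itertext(); string_value is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The bibref from above, with its components marked up.
marked_up = ET.fromstring(
    '<bibref><authors><author format="surname, initials">Kay M. H.</author>'
    "</authors>: <title>XSLT Programmer's Reference</title>. 4th edition, "
    "<publisher>Wiley</publisher>, <year>2010</year>.</bibref>"
)

# The same citation as plain text, with no internal markup.
plain = ET.fromstring(
    "<bibref>Kay M. H.: XSLT Programmer's Reference. 4th edition, Wiley, 2010.</bibref>"
)


def string_value(element: ET.Element) -> str:
    """Concatenate all text inside element, in document order (hypothetical helper)."""
    return "".join(element.itertext())


# Both forms yield the same text, so a substring test behaves the same on each.
print(string_value(marked_up))
print("Wiley, 2010" in string_value(marked_up))
print("Wiley, 2010" in string_value(plain))
```

Because removing the tags leaves a sensible sentence, the substring test gives the same answer for the marked-up and plain versions.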

But people can and do use XML mixed content in other ways. Mixed content is of course the biggest difference between XML and JSON, and reflects the fact that the M in XML means "markup": it was devised as a system for labelling parts of a textual document.

Michael Kay