107

Is there any standard, de facto or otherwise, for XML documents? For example which is the "best" way to write a tag?

<MyTag />
<myTag />
<mytag />
<my-tag />
<my_tag />

Likewise if I have an enumerated value for an attribute which is better

<myTag attribute="value one"/>
<myTag attribute="ValueOne"/>
<myTag attribute="value-one"/>
Greg Mattes
tpower

13 Answers

52

I suspect the most common values would be camelCased - i.e.

<myTag someAttribute="someValue"/>

In particular, the spaces cause a few glitches if mixed with code-generators (i.e. to [de]serialize xml to objects), since not many languages allow enums with spaces (demanding a mapping between the two).
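To illustrate the mapping issue, here is a minimal Python sketch. The `Mode` enum and the `parse_mode` helper are hypothetical names made up for this example; only `enum` and `xml.etree.ElementTree` from the standard library are used. Because an enum member name cannot contain a space, a value such as "value one" has to be looked up by value rather than by name:

```python
import xml.etree.ElementTree as ET
from enum import Enum


class Mode(Enum):
    # Member names cannot contain spaces, so each member carries an
    # explicit value matching the text used in the XML attribute.
    VALUE_ONE = "value one"
    VALUE_TWO = "value two"


def parse_mode(xml_text):
    """Read the `attribute` attribute and look the enum member up by value."""
    element = ET.fromstring(xml_text)
    return Mode(element.get("attribute"))


print(parse_mode('<myTag attribute="value one"/>'))
```

Had the attribute value been `VALUE_ONE`, a plain lookup by name (`Mode["VALUE_ONE"]`) would have been enough and no separate value table would be needed.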

Marc Gravell
  • 38
    Hm ... best answer ... I think it is a decent answer, but it is just an opinion. Having some sort of reference would be nice. – Hamish Grubijan Jul 18 '10 at 22:53
  • 6
    I don't agree, I'm not used to seeing XML with camel case. – Rafa Mar 06 '13 at 12:31
  • I know this is an old answer, but most of the *newer* Microsoft XML I have seen tends to disagree with this format choice. But then IIS likes dot.naming so .. – user2246674 Apr 29 '13 at 18:43
  • 4
    As everyone mentions it's personal, but I follow your approach as I always define my XML using XMLSchema, and XMLSchema follows this approach. http://www.w3.org/2001/XMLSchema.xsd . For me it has nothing to do with programming languages. We use XML because it's an inter-operable interface standard. Programming languages are just an implementation detail and each language has its own convention. – dan carter Jul 30 '13 at 03:26
  • My 2 cents - I have seen CamelCase, and all lowercase; rarely all upper (old HTML), and I've seen lower-case. I can't recall ever seeing camelBack. I prefer CamelCase or lowercase. Attributes, however, I tend to see all lowercase. – Kit10 Jan 08 '15 at 17:26
  • @Copperpot you mean ProperCase, right? – Marc Gravell Jan 08 '15 at 19:03
  • @MarcGravell I hadn't heard the term "ProperCase". Seems like it's associated with Visual Basic (a language I haven't used in over a decade, besides VBA). This is also called "title case" in Python - interesting! Thanks for mentioning it. Found them here: http://en.wikipedia.org/wiki/Letter_case. My definition of terms falls in line (just by happenstance) with: http://en.wikipedia.org/wiki/CamelCase – Kit10 Jan 28 '15 at 17:57
  • @Hamish : I just added an answer with reference. – Jeson Martajaya Mar 17 '15 at 23:45
39

XML Naming Rules

XML elements must follow these naming rules:

    - Element names are case-sensitive 
    - Element names must start with a letter or underscore
    - Element names cannot start with the letters xml (or XML, or Xml, etc.)
    - Element names can contain letters, digits, hyphens, underscores, and periods 
    - Element names cannot contain spaces

Any name can be used, no words are reserved (except xml).
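The rules listed above can be sketched as a small Python check. This is an ASCII-only simplification: the function name `is_valid_element_name` is hypothetical, and the real XML specification also allows many non-ASCII letters (and the colon, for namespaces), which this sketch deliberately ignores.

```python
import re

# ASCII-only simplification: the first character is a letter or underscore,
# the rest may be letters, digits, hyphens, underscores, or periods.
_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_valid_element_name(name):
    """Return True if `name` satisfies the simplified rules listed above."""
    # Names beginning with "xml" in any letter case are rejected.
    if name.lower().startswith("xml"):
        return False
    # fullmatch rejects empty names, spaces, and any other stray characters.
    return _NAME_PATTERN.fullmatch(name) is not None


for candidate in ["my_tag", "my-tag", "1tag", "my tag", "XmlThing"]:
    print(candidate, is_valid_element_name(candidate))
```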

Best Naming Practices

    - Create descriptive names, like this: <person>, <firstname>, <lastname>.
    - Create short and simple names, like this: <book_title> not like this: <the_title_of_the_book>.
    - Avoid "-". If you name something "first-name", some software may think you want to subtract "name" from "first".
    - Avoid ".". If you name something "first.name", some software may think that "name" is a property of the object "first".
    - Avoid ":". Colons are reserved for namespaces (more later).
    - Non-English letters like éòá are perfectly legal in XML, but watch out for problems if your software doesn't support them.

Naming Styles

There are no naming styles defined for XML elements, but here are some commonly used ones:

    - Lower case: <firstname> (all letters lower case)
    - Upper case: <FIRSTNAME> (all letters upper case)
    - Underscore: <first_name> (underscore separates words)
    - Pascal case: <FirstName> (uppercase first letter in each word)
    - Camel case: <firstName> (uppercase first letter in each word except the first)
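As a rough illustration of how the five styles relate, the following Python sketch builds each form from the same list of words. The `to_styles` function is a hypothetical helper written for this example, and it assumes a non-empty list of plain ASCII words.

```python
def to_styles(words):
    """Build each naming style from a non-empty list of words."""
    lower = [word.lower() for word in words]
    capitalized = [word.capitalize() for word in lower]
    return {
        "lower": "".join(lower),
        "upper": "".join(lower).upper(),
        "underscore": "_".join(lower),
        "pascal": "".join(capitalized),
        # Camel case keeps the first word lowercase and capitalizes the rest.
        "camel": lower[0] + "".join(capitalized[1:]),
    }


for style, name in to_styles(["first", "name"]).items():
    print(f"{style}: <{name}>")
```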

reference http://www.w3schools.com/xml/xml_elements.asp

Farhad Maleki
15

I favour TitleCase for element names, and camelCase for attributes. No spaces for either.

<AnElement anAttribute="Some Value"/>

As an aside, I did a quick search for Best Practices in XML, and came up with this rather interesting link: XML schemas: Best Practices.

Raithlin
14

For me, it is like discussing code style for a programming language: some will argue for one style, others will defend an alternative. The only consensus I have seen is: "Choose one style and be consistent"!

I will just note that a lot of XML dialects use lowercase names (SVG, Ant, XHTML...).

I don't get the "no spaces in attribute values" rule. Somehow, it leads back to the debate "what to put in attributes and what to put as text?".
Maybe these are not the best examples, but there are some well-known XML formats using spaces in attributes:

  • XHTML, particularly class attribute (you can put two or more classes) and of course alt and title attributes.
  • SVG, with for example the d attribute of the path tag.
  • Both with style attribute...
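As a small illustration of the examples listed above, this Python sketch reads a space-separated `class` attribute and a free-text `title` attribute with the standard library parser. The helper names `class_names` and `title_text` are made up for this example.

```python
import xml.etree.ElementTree as ET


def class_names(xml_text):
    """Split a space-separated class attribute into individual names."""
    element = ET.fromstring(xml_text)
    return element.get("class", "").split()


def title_text(xml_text):
    """Return the title attribute unchanged, spaces included."""
    return ET.fromstring(xml_text).get("title")


sample = '<p class="note warning" title="A short title"/>'
print(class_names(sample))
print(title_text(sample))
```

The space acts as a list separator in one attribute and as ordinary text in the other; the parser treats both the same way, and the interpretation is left to the consumer.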

I don't fully understand the arguments against the practice (they seem to apply to some usages only), but it is legal at least, and quite widely used. With drawbacks, apparently.

Oh, and you don't need a space before the auto-closing slash. :-)
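A quick way to check this is to parse both spellings and compare the results. The sketch below uses Python's standard `xml.etree.ElementTree`; `same_element` is a hypothetical helper that only compares the tag name and the attributes.

```python
import xml.etree.ElementTree as ET


def same_element(first_xml, second_xml):
    """Compare two single-element documents by tag name and attributes."""
    first = ET.fromstring(first_xml)
    second = ET.fromstring(second_xml)
    return first.tag == second.tag and first.attrib == second.attrib


# The optional space before "/>" makes no difference to the parsed result.
print(same_element('<myTag attribute="x"/>', '<myTag attribute="x" />'))
```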

PhiLho
  • The argument against spaces is, and this is only because it was asked specifically in the question, if the value is enumerated then to support parsing, not many languages support enumerations with spaces, yet many of us who use XML in C/C++, C#, or Java (languages I use, but not limited to) will often map attribute values to enumerations. We can then simply parse a literal to a map/dictionary (or easier in the case of Java and C#). Ultimately I agree that it appears to be a matter of fervor rather than a standard. I simply follow the "when in Rome" philosophy. – Kit10 Jan 08 '15 at 17:33
8

I would tend to favour lowercase or camelCase tags, and since attributes should typically reflect data values, not content, I would stick to a value which could be used as a variable name in whatever platform/language might be interested, i.e. avoid spaces; the other two forms could be OK.

annakata
8

It's subjective, but if there are two words in an element tag, the readability can be enhanced by adding an underscore between words (e.g. <my_tag>) instead of using no separator. Reference: http://www.w3schools.com/xml/xml_elements.asp. So according to w3schools the answer would be:

<my_tag attribute="some value">

The value needn't use an underscore or separator, since you are allowed spaces in attribute values but not in element tag names.

Rory O'Kane
alistair
  • 2
    +1 because you cited a reference that has a "Best Naming Practices" section (not just opinion) – Fuhrmanator Feb 24 '14 at 02:00
  • 2
    @Fuhrmanator That "reference" is itself an opinion, even though it does provide some justification. It is not a standard by any means - and (even though it is much less terrible than it was) I do *not* recommend or use w3schools as a "reference". There are much more original and comprehensive sources. – user2864740 Dec 18 '14 at 23:11
  • @user2864740 such as? You finished your comment before providing the more original and comprehensive sources. The point of my +1 was that the OP asked for standards but most answers provide opinions. – Fuhrmanator Dec 19 '14 at 01:40
  • This answer ***only* provides opinions**, the link to w3schools is irrelevant and does not remove such. As far as standards, see implementation rules (as in [RSS](http://cyber.law.harvard.edu/rss/rss.html)) or organization rules (as in [OAGi](http://www.oagi.org/oagi/downloads/ResourceDownloads/OAGIS_90_NDR.pdf)) - at some level the "standard" is applied only at a particular application / business level. The w3schools link only provides its *own* opinion / best practice in a very *vague* sense (it provides a few tips and says "here are some ways it's done"). – user2864740 Dec 19 '14 at 03:03
  • That is, just including a link does not make an answer (or the linked resource) authoritative. – user2864740 Dec 19 '14 at 03:09
7

Many document centred XML dialects use lower case basic Latin and dash. I tend to go with that.

Code generators which map XML directly to programming language identifiers are brittle, and (with the exception of naive object serialisation, such as XAML) should be avoided in portable document formats; for best reuse and information longevity the XML should try to match the domain, not the implementation.
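One way to keep domain-oriented names in the XML without a code generator is a small explicit mapping layer. The Python sketch below is only an illustration under assumptions: the `Person` class, the `FIELD_MAP` table, and the `<person>` document shape are all invented for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Explicit mapping from the XML vocabulary (lower case with hyphens)
# to Python identifiers; the XML side stays independent of the code.
FIELD_MAP = {"first-name": "first_name", "last-name": "last_name"}


@dataclass
class Person:
    first_name: str
    last_name: str


def parse_person(xml_text):
    """Build a Person from child elements, ignoring unknown tags."""
    root = ET.fromstring(xml_text)
    values = {
        FIELD_MAP[child.tag]: child.text
        for child in root
        if child.tag in FIELD_MAP
    }
    return Person(**values)


document = "<person><first-name>Ada</first-name><last-name>Lovelace</last-name></person>"
print(parse_person(document))
```

If the code side is later renamed or ported to another language, only the mapping table changes and the document format stays the same.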

Pete Kirkham
3

RSS is probably one of the most consumed XML schemas in the world and it is camelCased.

Spec is here: http://cyber.law.harvard.edu/rss/rss.html

Granted it has no node attributes in the schema, but all the node element names are camelCased. For example:

lastBuildDate managingEditor pubDate

evermeire
2

Microsoft embraces two conventions:

  1. For configuration, Microsoft uses camelCase. Look at the Visual Studio config file. For VS2013, it is stored in:

    C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\devenv.exe.config

Example:

<startup useLegacyV2RuntimeActivationPolicy="true">
  <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.5" />
</startup>

  2. Microsoft also uses UpperCase for their XAML. I guess it is to differentiate from HTML (which uses lowercase).

Example:

<MenuItem Header="Open..." Command="ApplicationCommands.Open">
    <MenuItem.Icon>
        <Image Source="/Images/folder-horizontal-open.png" />
    </MenuItem.Icon>
</MenuItem>
Jeson Martajaya
2

There is no explicit recommendation. Based on another W3C recommendation, the one for XHTML, I've opted for lowercase:

4.2. Element and attribute names must be in lower case

XHTML documents must use lower case for all HTML element and attribute names. This difference is necessary because XML is case-sensitive e.g. <li> and <LI> are different tags.
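The case sensitivity mentioned in the quoted passage is easy to observe with a parser. In this Python sketch, `count_children` is a hypothetical helper and the `<ul>` document is made up for the example; `findall` matches tag names exactly, so `li` and `LI` are counted separately.

```python
import xml.etree.ElementTree as ET


def count_children(xml_text, tag):
    """Count direct children whose tag name matches `tag` exactly."""
    root = ET.fromstring(xml_text)
    return len(root.findall(tag))


document = "<ul><li>a</li><LI>b</LI><li>c</li></ul>"
print(count_children(document, "li"))
print(count_children(document, "LI"))
```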

Diego Menta
2

I normally align the XML naming convention with the naming convention used in other parts of the code. The reason is that when I load the XML into an object, its attribute and element names can be referred to using the same naming convention as the rest of the project.

For example, if your JavaScript uses camelCase, then your XML uses camelCase as well.

micksatana
  • 1
    While useful for intraproject work, this quickly breaks down when XML is used as a language-agnostic interchange format. – user2864740 Dec 18 '14 at 23:14
  • So components in your project are consistent, but how do you design the initial standard the project conforms to? – Gqqnbig Feb 09 '17 at 19:27
0

XML Naming Rules

XML elements must follow these naming rules:

  • Names can contain letters, numbers, and other characters
  • Names cannot start with a number or punctuation character
  • Names cannot start with the letters xml (or XML, or Xml, etc)
  • Names cannot contain spaces

Any name can be used, no words are reserved.

Source: W3 School

petermeissner
  • A vague description of what kind of names are possible gives little guidance as to which of the possible names should be used. – Samuel Edwin Ward Apr 05 '13 at 13:04
  • Although they define the baseline of what is possible - right? – petermeissner May 05 '13 at 10:14
  • 10
    Sure, but this is like if someone asked "what should I name my kid so they don't get picked on at school" and you replied "well, here's a list of sounds humans are capable of producing." – Samuel Edwin Ward May 05 '13 at 14:53
  • Yeah, but that actually was not the question, right? Because the question was: "Is there a standard naming convention for XML elements?" and "Is there any standard, de facto or otherwise, for XML documents?" so this is an answer right? One that answers the question and not only one common stream of interpretation of the question. – petermeissner May 06 '13 at 04:46
  • 3
    It's only an answer if you ignore the rest of the question after those two sentences. You haven't attempted to answer 'which is the "best"' or 'which is better'. – Samuel Edwin Ward May 06 '13 at 14:42
0

I have been searching a lot for a good approach, also reading this thread and some others and I would vote for using hyphens.

They are used broadly in ARIA ( https://developer.mozilla.org/de/docs/Web/Barrierefreiheit/ARIA ) which can be seen in many source codes and are therefore common. As already pointed out here, they are certainly allowed, which is also explained here: Using - in XML element name

Also as a side benefit: When writing HTML in combination with CSS, you often have classes whose names use hyphens as separator by default as well. Now, if you have custom tags that use CSS classes or custom attributes for tags that use CSS classes, then something like:

<custom-tag class="some-css-class">

is more consistent and reads - in my humble opinion - much nicer than:

<customTag class="some-css-class">

Community
IceFire