
I am developing a JS plugin for serving retina images. The images are supposed to be identified by the following attributes:

data-retina@2x, data-retina@1.5x, data-retina@2.5x.

Could you tell me if these attributes are valid? Which characters are allowed (or not allowed) in the names of custom data-* attributes in HTML and XHTML?

El cero

2 Answers


See the definition of the data-* attribute in the W3C HTML5 Recommendation:

  • In HTML5, the name must be XML-compatible (and it gets ASCII-lowercased automatically).

  • In XHTML5, the name must be XML-compatible and must not contain uppercase ASCII letters.

The definition of XML-compatible says that the name

  • must not contain : characters
  • must match the Name production in the XML 1.0 specification

This Name production lists which characters are allowed.


tl;dr: For the part after data-, you may use the following characters:

  • 0-9
  • a-z
  • A-Z (not in XHTML5)
  • - _ . ·
  • and characters from these Unicode ranges:

    • [#x0300-#x036F] (Combining Diacritical Marks)
    • [#x203F-#x2040] (‿ ⁀, UNDERTIE and CHARACTER TIE)
    • [#xC0-#xD6]
    • [#xD8-#xF6]
    • [#xF8-#x2FF]
    • [#x370-#x37D]
    • [#x37F-#x1FFF]
    • [#x200C-#x200D] (ZERO WIDTH NON-JOINER, ZERO WIDTH JOINER)
    • [#x2070-#x218F]
    • [#x2C00-#x2FEF]
    • [#x3001-#xD7FF]
    • [#xF900-#xFDCF]
    • [#xFDF0-#xFFFD]
    • [#x10000-#xEFFFF]

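So the @ (U+0040) is not allowed, which makes data-retina@2x, data-retina@1.5x and data-retina@2.5x invalid. The . (U+002E) is allowed, so a name such as data-retina-1.5x would be valid.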
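
The character list can be turned into a quick check for a plugin. The following is only a minimal sketch: the function name `isValidDataAttributeName` and its `xhtml` option are made up for illustration, and the character class is transcribed from the ranges listed in this answer, not taken from any library.

```javascript
// Characters allowed after "data-": the ranges listed above
// (XML 1.0 NameChar without ":"), written as a regex character class.
const NAME_CHAR = [
  "\\-_.0-9a-zA-Z\\u00B7",
  "\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u02FF",
  "\\u0300-\\u036F\\u0370-\\u037D\\u037F-\\u1FFF",
  "\\u200C-\\u200D\\u203F-\\u2040\\u2070-\\u218F",
  "\\u2C00-\\u2FEF\\u3001-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFFD",
  "\\u{10000}-\\u{EFFFF}",
].join("");

// "data-" followed by at least one allowed character.
// The "u" flag is needed for the \u{...} escapes above U+FFFF.
const DATA_ATTR = new RegExp("^data-[" + NAME_CHAR + "]+$", "u");

// Hypothetical helper: returns true if `name` is a valid data-* attribute name.
// With { xhtml: true }, uppercase ASCII letters are rejected as well.
function isValidDataAttributeName(name, { xhtml = false } = {}) {
  if (!DATA_ATTR.test(name)) return false;
  return !(xhtml && /[A-Z]/.test(name));
}

console.log(isValidDataAttributeName("data-retina@2x"));   // false: "@" is not allowed
console.log(isValidDataAttributeName("data-retina-1.5x")); // true: "-" and "." are allowed
```

Uppercase letters pass by default because an HTML parser lowercases them anyway; pass `{ xhtml: true }` to reject them.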

unor

Please refer to the "Before attribute name state" section of the HTML5 spec, which handles the next input character as follows:

  • U+0009 CHARACTER TABULATION (tab)
  • U+000A LINE FEED (LF)
  • U+000C FORM FEED (FF)
  • U+0020 SPACE
    Ignore the character.
  • U+002F SOLIDUS (/)
    Switch to the self-closing start tag state.
  • U+003E GREATER-THAN SIGN (>)
    Switch to the data state. Emit the current tag token.
  • Uppercase ASCII letter
    Start a new attribute in the current tag token. Set that attribute's name to the lowercase version of the current input character (add 0x0020 to the character's code point), and its value to the empty string. Switch to the attribute name state.
  • U+0000 NULL
    Parse error. Start a new attribute in the current tag token. Set that attribute's name to a U+FFFD REPLACEMENT CHARACTER character, and its value to the empty string. Switch to the attribute name state.
  • U+0022 QUOTATION MARK (")
  • U+0027 APOSTROPHE (')
  • U+003C LESS-THAN SIGN (<)
  • U+003D EQUALS SIGN (=)
    Parse error. Treat it as per the "anything else" entry below.
  • EOF
    Parse error. Switch to the data state. Reconsume the EOF character.
  • Anything else
    Start a new attribute in the current tag token. Set that attribute's name to the current input character, and its value to the empty string. Switch to the attribute name state.

In simple words:

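It says that the tokenizer ignores tab, line feed, form feed and space, and that a solidus or a greater-than sign moves on to closing the tag. Any other character starts an attribute name: uppercase ASCII letters are lowercased, NULL is replaced with U+FFFD, and a quotation mark, apostrophe, less-than sign or equals sign is accepted but raises a parse error. This describes what the parser tolerates, not which names are valid, so personally I wouldn't attempt pushing the edge cases.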
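
To make the quoted state easier to follow, here is a rough sketch of it as a single function. It covers only the "before attribute name" step quoted above, not a full tokenizer; the function name and the shape of the returned object are invented for this example.

```javascript
// Hypothetical sketch of the "before attribute name state" quoted above.
// `ch` is the next input character, or undefined for EOF.
function beforeAttributeNameState(ch) {
  if (ch === undefined) {
    // EOF: parse error, switch to the data state.
    return { action: "eof", parseError: true };
  }
  switch (ch) {
    case "\t": case "\n": case "\f": case " ":
      return { action: "ignore", parseError: false };
    case "/":
      return { action: "self-closing-start-tag", parseError: false };
    case ">":
      return { action: "emit-tag", parseError: false };
    case "\0":
      // NULL becomes U+FFFD REPLACEMENT CHARACTER.
      return { action: "start-attribute", nameStart: "\uFFFD", parseError: true };
    case '"': case "'": case "<": case "=":
      // Parse error, but still treated as per "anything else".
      return { action: "start-attribute", nameStart: ch, parseError: true };
  }
  if (ch >= "A" && ch <= "Z") {
    // Uppercase ASCII letter: add 0x20 to lowercase it.
    const lower = String.fromCharCode(ch.charCodeAt(0) + 0x20);
    return { action: "start-attribute", nameStart: lower, parseError: false };
  }
  // Anything else starts the attribute name as-is.
  return { action: "start-attribute", nameStart: ch, parseError: false };
}

console.log(beforeAttributeNameState("@").action); // "start-attribute"
```

In this sketch "@" starts an attribute name without a parse error, which is why such markup still gets parsed even though the name is not a valid data-* name.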

Inspired by: What characters are allowed in an HTML attribute name?

Praveen Kumar Purushothaman