39

When I convert the XML file to ASCII, the two declarations below differ only in three characters: `utf` versus `UTF`.

<?xml version="1.0" encoding="utf-8"?>


<?xml version="1.0" encoding="UTF-8"?>

I tried creating a new XML file with VS2005, and it generates the lowercase `utf-8` form by default.

Which one is the more standard form? Thanks.

Nano HE
  • 3
    Since lowercase letters are more common, `utf-8` will probably take up very slightly less space when compressed. – Zaz Nov 18 '14 at 15:06
  • @Zaz Yes, lowercase compresses better https://encode.ru/threads/1889-gzthermal-pseudo-thermal-view-of-Gzip-Deflate-compression-efficiency – Volker E. Oct 15 '17 at 02:32

5 Answers

44

The IANA character set registry says:

no distinction is made between use of upper and lower case letters.

But that page, the XML specification, and unicode.org are consistent about capitalizing UTF-8.

dan04
  • @dan04, I would like to mark your reply as the answer. Thanks for the useful links. @All, I need to convert the whole XML file to ASCII format and compare the ASCII body... That's why I care about **upper and lower case letters**. Thank you all. – Nano HE Jul 15 '10 at 02:35
  • 2
    Additionally, Googling `charset utf-8 uppercase|lowercase bug|solved` turns up quite a number of bug reports that were solved or circumvented by using uppercase `UTF-8`, while I found no reports (within one evening of googling this subject) where a problem could be solved by changing uppercase to lowercase. Affected software included Apache Xerces (Mac OS X), JSP, Jetty (breaking AWS S3 signatures, see: https://github.com/golang/go/issues/19430) and numerous others. Based on this, one could make an argument that the uppercase UTF-8 charset name enjoys better compatibility (especially with legacy tools). – GitaarLAB Jan 26 '18 at 07:31
  • I can confirm UTF-8 (uppercase). I get badly encoded results with the lowercase form when using it in MVC CORE 3.1... – Miroslav Siska Nov 16 '20 at 16:36
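
For the comparison use case mentioned in the comments above (diffing the text of two XML files), one option is to normalize the encoding name in the declaration before comparing. The following is only a sketch: the helper name `normalize_declared_encoding` is hypothetical, the choice of uppercase follows the convention the answers recommend, and the regular expression assumes a conventional declaration at the start of the file.

```python
import re

# Matches the encoding pseudo-attribute inside an XML declaration.
# Group 1: everything up to the opening quote, group 2: the quote
# character, group 3: the encoding name itself.
_DECL_ENCODING = re.compile(
    r'''(<\?xml[^>]*?\bencoding\s*=\s*)(["'])([A-Za-z][A-Za-z0-9._-]*)\2'''
)


def normalize_declared_encoding(text: str) -> str:
    """Return text with the declared encoding name upper-cased.

    Only the first declaration is touched; the rest of the document
    is left exactly as it was.
    """
    def _upper(match: re.Match) -> str:
        prefix, quote, name = match.group(1), match.group(2), match.group(3)
        return f"{prefix}{quote}{name.upper()}{quote}"

    return _DECL_ENCODING.sub(_upper, text, count=1)


lower = '<?xml version="1.0" encoding="utf-8"?>\n<a/>'
upper = '<?xml version="1.0" encoding="UTF-8"?>\n<a/>'
print(normalize_declared_encoding(lower) == normalize_declared_encoding(upper))
```

With this in place, two files that differ only in the case of the declared encoding compare as equal.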
17

From the XML specification:

"XML processors SHOULD match character encoding names in a case-insensitive way"

This indicates that you can use upper case or lower case or even mixed case if you wish. However, the specification uses "UTF-8" in all its examples so for consistency I'd go with that.
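
As a quick check of the case-insensitive matching quoted above, here is a minimal sketch using Python's standard `xml.etree.ElementTree` parser. It illustrates one parser only and is not a guarantee about every XML processor; the helper name `parse_with_encoding_name` and the sample document are made up for the example.

```python
import xml.etree.ElementTree as ET

# A small UTF-8 encoded body containing one non-ASCII character, so that
# a wrongly interpreted encoding would show up in the parsed text.
BODY = "<root>caf\u00e9</root>".encode("utf-8")


def parse_with_encoding_name(name: str) -> str:
    """Parse BODY behind a declaration that spells the encoding as `name`."""
    declaration = f'<?xml version="1.0" encoding="{name}"?>'.encode("ascii")
    return ET.fromstring(declaration + BODY).text


results = {
    name: parse_with_encoding_name(name)
    for name in ("utf-8", "UTF-8", "Utf-8")
}
for name, text in results.items():
    print(name, repr(text))
```

All three spellings should yield the same parsed text with this parser.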

Artelius
15

For those interested in the gory details - including links to some of the related standards and precedents - I blogged a couple of years ago about Case-Sensitivity of UTF-8 in XML Declarations.

codingoutloud
7

In my experience (which is primarily with .NET), character set identifiers are treated as case-insensitive, so UTF-8 and utf-8, as well as Utf-8 or any other variation thereof, always mean the same thing. This is also the case for other character sets, such as ISO-8859-1 (Latin 1). The casing should not matter, as case is not a meaningful factor in such an identifier.

I do extensive work with web services across multiple platforms, and I have never really seen a "standard" form used. I've seen every variation of a variety of character sets...often different variations from a single business partner.
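
The same case-insensitive treatment can be seen outside .NET. As an illustration in Python (not the .NET behaviour described above), the standard `codecs.lookup` function resolves differently cased names to the same codec. The helper `same_encoding` is a hypothetical name introduced for this sketch.

```python
import codecs


def same_encoding(a: str, b: str) -> bool:
    """Return True if both names resolve to the same Python codec."""
    # codecs.lookup normalizes the name (case, hyphens vs. underscores)
    # and returns a CodecInfo whose .name is the canonical codec name.
    return codecs.lookup(a).name == codecs.lookup(b).name


print(same_encoding("utf-8", "UTF-8"))
print(same_encoding("Utf-8", "utf-8"))
print(same_encoding("ISO-8859-1", "iso-8859-1"))
print(same_encoding("utf-8", "iso-8859-1"))
```

Comparing resolved codecs rather than raw strings avoids treating `utf-8` and `UTF-8` as different encodings.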

jrista
6

Upper-case is the de-facto standard. It should still work with any combination of case, however.

rspeed