Thanks to the people who answered, I realized I was indeed dealing with files encoded in "MacRoman":

In which encoding is 0xDB a currency symbol?

So I'm wondering: are the charsets contained in lib/charsets.jar guaranteed to exist in all 1.5 JVMs?

http://download.oracle.com/javase/1.5.0/docs/guide/intl/encoding.doc.html

Unlike, say, UTF-8, x-MacRoman is not present in rt.jar but in lib/charsets.jar. I don't understand the difference very well.

Is MacRoman, just like UTF-8, guaranteed to be present?

P.S.: it would be great if someone could create the MacRoman tag.

NoozNooz42
  • 2
    Under Java it appears as either "MacRoman" or "x-MacRoman", but since SO only allows lowercase tags, "MacRoman" becomes "macroman", which reads like "macro man", which is of course silly (thanks SO), so I put "mac-roman" (not good but, hey, what can you do ;) – SyntaxT3rr0r Jan 30 '11 at 17:34
  • 1
    Can you convert the source material to UTF-8 and be done with it? I don't see a very compelling reason to leave data in these legacy character sets these days. – asveikau Jan 30 '11 at 17:47
  • 3
    @asveikau: what makes you think I've got control over the files I'm receiving? If I *knew* it was MacRoman I wouldn't have asked the question linked in my question (http://stackoverflow.com/questions/4844180). As for converting the source to UTF-8: maybe, just maybe, that is precisely why I'm asking in this very question whether the Java specs require MacRoman to be present in every JVM. I, somehow, don't think your comment brings anything to SO. The question is clear and precise, so are the answers. – NoozNooz42 Jan 30 '11 at 18:02
  • Calm down. This is not a personal insult. I was only saying that it might make more sense to convert the files, if that is an option. (You did not specify whether or not that is an option. Even to this very minute it's not clear if it's feasible for you to make a local copy and work against that.) Note also that there are other ways to convert the files than modifying your Java source that will eventually work with the file. For example `iconv` may be suitable. – asveikau Jan 30 '11 at 18:11

1 Answer

It's not mandatory. The only charsets which must be supported by every Java implementation are:

  • US-ASCII
  • ISO-8859-1
  • UTF-8
  • UTF-16LE
  • UTF-16BE
  • UTF-16
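
Since anything outside that list is optional, one way to stay safe is to ask the running JVM before relying on MacRoman. Here is a minimal sketch, assuming the charset is registered under the name "x-MacRoman" as mentioned in the question; the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;

public class MacRomanCheck {
    // The six charsets listed above as mandatory on every Java implementation.
    static final String[] REQUIRED = {
        "US-ASCII", "ISO-8859-1", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-16"
    };

    // Returns true if every mandatory charset can be looked up in this JVM.
    static boolean requiredCharsetsPresent() {
        for (String name : REQUIRED) {
            if (!Charset.isSupported(name)) {
                return false;
            }
        }
        return true;
    }

    // Returns the MacRoman charset if this JVM provides it, otherwise null.
    // "x-MacRoman" is the name taken from the question; it is optional.
    static Charset findMacRoman() {
        String name = "x-MacRoman";
        return Charset.isSupported(name) ? Charset.forName(name) : null;
    }

    public static void main(String[] args) {
        System.out.println("Mandatory charsets present: " + requiredCharsetsPresent());
        Charset macRoman = findMacRoman();
        if (macRoman != null) {
            System.out.println("MacRoman is available as " + macRoman.name());
        } else {
            System.out.println("MacRoman is not available in this JVM");
        }
    }
}
```

If `findMacRoman()` returns null, the caller has to fall back to something else (for example, converting the files outside the JVM first).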

Sun/Oracle also has a list of the charsets supported by their JRE.
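
That list only describes their own JRE. To see what a particular installation actually offers, you can also ask at runtime. A small sketch using `Charset.availableCharsets()`; the class name is hypothetical.

```java
import java.nio.charset.Charset;
import java.util.SortedMap;

public class ListCharsets {
    // Map from canonical charset name to Charset for everything
    // the running JVM can currently provide.
    static SortedMap<String, Charset> installed() {
        return Charset.availableCharsets();
    }

    public static void main(String[] args) {
        // Print each canonical name followed by its aliases.
        for (Charset cs : installed().values()) {
            System.out.println(cs.name() + " " + cs.aliases());
        }
    }
}
```

Running it on the target JVM shows whether a MacRoman entry is among the installed charsets.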

dkarp