
Please note that I'm not asking how but why. And I don't know if it's an RCP-specific problem or something inherent to Java.

My Java source files are encoded in UTF-8.

If I define my string literals like this:

    new Language("fr", "Français"),
    new Language("zh", "中文")

It works as I expect when I use the strings in the application, launching it from Eclipse as an Eclipse application.

But it fails when I launch the .exe built by the "Eclipse Product Export Wizard".

The workaround I use is to escape the characters like this:

    new Language("fr", "Fran\u00e7ais"), // Français
    new Language("zh", "\u4e2d\u6587") // 中文

There is no problem in doing this (all my other strings are in properties files, only the language names are hardcoded), but I'd like to understand.

I thought the compiler had to convert Java string literals when building the bytecode. So why is the Unicode escaping necessary? Is it wrong to use high-range Unicode characters in Java source files? What exactly happens to those characters at compilation, and how is it different from the handling of escaped characters? Is the problem just related to the RCP cache?

Denys Séguret
  • It appears that the Eclipse Product Export Wizard is not interpreting your files as UTF-8. Perhaps you need to run Eclipse's JVM with the encoding set to UTF-8 (`-Dfile.encoding=UTF8` in `eclipse.ini`)? – Matt Ball Jun 27 '12 at 13:11
  • While this does not really explain why it happens, it does suggest an alternative solution and indicates that the export wizard, for whatever reason, doesn't seem to honor the project's encoding properly: http://stackoverflow.com/questions/6891079/eclipse-rcp-wrong-encoding-when-deploying-the-product – Jiddo Jun 27 '12 at 13:15
  • To confirm @Matt Ball's explanation, which I think is correct, try setting the following option in the wizard: "Use class files compiled in the workspace" – bruno conde Jun 27 '12 at 13:19
  • @Jiddo: it does explain why it happens: *"not interpreting your files as UTF-8"*, so it's interpreting them as another encoding incompatible with UTF-8. – m0skit0 Jun 27 '12 at 13:20
  • @MattBall It works. Please build an answer. But I'd like to understand why Eclipse doesn't know which encoding to use when exporting, even though UTF-8 is the encoding defined in `preferences/general/workspace` and it knows how to compile the files. At the very least, an option in the export wizard or the `.plugin` file seems to be needed. – Denys Séguret Jun 27 '12 at 13:24
  • @brunoconde Can you please specify where this option is? – Denys Séguret Jun 27 '12 at 13:25
  • @m0skit0 Indeed. What I meant was that it didn't explain why it is not interpreting your files as UTF-8, which I interpreted as what the question was about. Sorry about the confusion. – Jiddo Jun 27 '12 at 13:26
  • @dystroy it is the "Export wizard" > "Options" tab – bruno conde Jun 27 '12 at 13:31
  • @brunoconde I use the "Eclipse Product Export Wizard" from the .product file. I don't have tabs :\ – Denys Séguret Jun 27 '12 at 13:37
  • @dystroy, sorry, I have a plugin environment, not RCP. It seems the RCP wizard doesn't have this option. – bruno conde Jun 27 '12 at 13:45
  • OK, thanks for your help. Your observation points to a need similar to the one I was referring to. – Denys Séguret Jun 27 '12 at 13:46
  • @Jiddo: it's not interpreting the files as UTF-8 because that's not their encoding when imported into/created in Eclipse. – m0skit0 Jun 27 '12 at 14:07
  • @m0skit0 those files are generally considered by Eclipse as UTF-8, judging by the correct display. This is due to the preference set in `preferences/general/workspace`. – Denys Séguret Jun 27 '12 at 14:09
  • @dystroy it's probably just a bug in Eclipse's Product Export Wizard. Such things are disturbingly common in a lot of tools. Many developers just don't understand or test encoding issues. – bames53 Jun 27 '12 at 16:50

2 Answers


It appears that the Eclipse Product Export Wizard is not interpreting your files as UTF-8. Perhaps you need to run Eclipse's JVM with the encoding set to UTF-8 (`-Dfile.encoding=UTF8` in `eclipse.ini`)?
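
To illustrate what "not interpreting your files as UTF-8" does to a literal, here is a minimal sketch. It is not what the export wizard actually runs: the `EncodingMismatchDemo` class and its `misdecode` helper are hypothetical names, and ISO-8859-1 merely stands in for whatever single-byte default the export happened to use. It encodes text as UTF-8 (how the file was saved) and decodes the bytes with the wrong charset (how the compiler would read them).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingMismatchDemo {
    // Encode with the charset the file was really saved in,
    // then decode with the charset the reader wrongly assumes.
    static String misdecode(String text, Charset actual, Charset assumed) {
        return new String(text.getBytes(actual), assumed);
    }

    public static void main(String[] args) {
        // Assumption: ISO-8859-1 stands in for the unknown default used at export time.
        Charset assumed = StandardCharsets.ISO_8859_1;

        // The raw literal; written with an escape here so this demo file stays pure ASCII.
        String raw = "Fran\u00e7ais";
        System.out.println(misdecode(raw, StandardCharsets.UTF_8, assumed));

        // The escaped form exactly as it sits in the source file:
        // a backslash, the letter u, and four hex digits, all ASCII.
        String escapedSource = "Fran\\u00e7ais";
        System.out.println(misdecode(escapedSource, StandardCharsets.UTF_8, assumed));
    }
}
```

The first call garbles the text, because the two UTF-8 bytes of "ç" (0xC3 0xA7) are read as two separate characters, "Ã§". The second call returns its input unchanged: the escaped form contains only ASCII bytes, which decode identically under both charsets, so the compiler can turn the escape into the right character whatever encoding it assumes for the file.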

(Copypasta'd at OPs request)

Matt Ball

When exporting a plug-in, it gets compiled through a process separate from the normal build process within the IDE. There is a known bug that the build process (PDE.Build) disregards the text encoding used by the IDE.

The export can be made to work properly by specifying the text encoding in the build.properties file of your plugin:

    javacDefaultEncoding.. = UTF-8
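
To see which encoding a build would fall back on without such a setting, a small diagnostic can help. This is only a sketch under an assumption: that the export compiles with the default charset of the JVM running Eclipse when no encoding is given, which is consistent with the `-Dfile.encoding` workaround reported in the comments but is not confirmed here. `DefaultCharsetCheck` and `describeDefault` are hypothetical names.

```java
import java.nio.charset.Charset;

class DefaultCharsetCheck {
    // Report the JVM's default charset and the file.encoding system property.
    static String describeDefault() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describeDefault());
    }
}
```

If this reports something other than UTF-8 on the machine doing the export, non-ASCII literals in UTF-8 source files are at risk unless the encoding is set explicitly.
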
mkdev