Apologies if this is a very amateurish question! I know Eclipse uses Cp1252 as the default for its encoding.
I recently wrote a program that uses hash maps to convert input letters to Braille. To make it work, I had to change the encoding to UTF-8.

I know very little about either encoding, but everything I've read indicates that UTF-8 can represent every character in Unicode, so it covers far more symbols than Cp1252.

Why then is it not the preferred encoding style for Eclipse?

John Kiran
Andrew Martin
  • What do you mean by _Eclipse uses Cp1250 as the default for its encoding._ ? The console in Eclipse uses the default encoding of your OS. Or are you talking about file IO? – jlordo Dec 09 '12 at 21:35
  • Eclipse must be using the default encoding on your computer, which must be a Windows box set to CP1250. That's probably what you want to change. – Diego Basch Dec 09 '12 at 21:36
  • Hey - I am a total beginner with this, but this is what I mean. In Properties -> Resource -> Text file encoding, it is set by default to "Inherited from container (Cp1252)". To make my file work I had to change it to UTF-8. – Andrew Martin Dec 09 '12 at 21:37
  • @DiegoBasch: Do you know how to change this? – Andrew Martin Dec 09 '12 at 21:38
  • http://stackoverflow.com/questions/2707986/eclipse-encoding-macroman-utf8 – Michał Ziober Dec 09 '12 at 21:40
  • @mykhaylo: Thanks for your response. I have already changed the encoding; what I meant was, is there a way to change it at the computer level, outside of Eclipse, as Diego Basch was implying? – Andrew Martin Dec 09 '12 at 21:44
  • He's actually asking a valid question: "Why then is it not the preferred encoding style for Eclipse?" that is, why isn't Eclipse pre-configured to assume UTF-8 for text files? That's a very good question. – Isaac Dec 09 '12 at 21:49
  • I was asking myself exactly the same question, since it is a recurring problem when working in teams. The answer is that it has been an open issue for almost 10 years now: https://bugs.eclipse.org/bugs/show_bug.cgi?id=108668 Vote for it if you do not agree with the "Platform default" (which you usually did not choose) – Didier L Dec 03 '14 at 10:22

1 Answer

When you start Eclipse against a brand new workspace, Eclipse has to decide which encoding to use, by default, when handling certain types of text-based files: text files, Java source files, JSP files, XML and so forth.

By default, then, Eclipse uses the default platform encoding, which is derived from your operating system's settings.
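As a rough way to see what "platform encoding" means in practice, you can ask a JVM which charset it falls back to when none is specified. This is only a sketch: the class name `DefaultEncodingCheck` is made up for illustration, and the result depends on the JDK as well as the OS (Java 18 and later default to UTF-8 unless told otherwise).

```java
import java.nio.charset.Charset;

public class DefaultEncodingCheck {
    // Canonical name of the charset the JVM uses when no encoding is given explicitly.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println("Default charset: " + defaultCharsetName());
        // The system property the default has traditionally been derived from.
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));
    }
}
```

Running this on the same machine as Eclipse gives a hint as to where a value such as Cp1252 comes from.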

As to why UTF-8 isn't the default encoding for text files: there is still, throughout the world, a significant number of plain text files written in encodings that UTF-8 is not backward compatible with. UTF-8 is backward compatible with 7-bit ASCII, so files containing only ASCII characters read the same either way. Any byte above 0x7F in a legacy encoding such as Cp1252, however, means something different in UTF-8 or is not valid UTF-8 at all.
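To make the compatibility point concrete: bytes written in a legacy single-byte encoding are only safe to read back as UTF-8 when they are plain ASCII. The sketch below uses ISO-8859-1 as a stand-in because every JVM is guaranteed to ship it; Cp1252 encodes "é" as the same single byte, 0xE9. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class EncodingCompatDemo {
    // Encode text with a legacy single-byte charset, then decode those bytes as UTF-8,
    // which is what happens when an editor assumes UTF-8 for an older file.
    static String latin1BytesReadAsUtf8(String text) {
        byte[] legacyBytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(legacyBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Pure ASCII: the bytes are identical in both encodings, so the text survives.
        System.out.println(latin1BytesReadAsUtf8("cafe"));
        // "é" (U+00E9) is the single byte 0xE9 in ISO-8859-1. On its own that byte is
        // malformed UTF-8, so the decoder substitutes the replacement character U+FFFD.
        System.out.println(latin1BytesReadAsUtf8("caf\u00E9"));
    }
}
```

The reverse problem is the one from the question: Braille characters have no representation in Cp1252, so a source file containing them cannot be saved in that encoding without loss.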

You can change these default encodings by modifying the workspace's settings. Remember, though, that these settings are stored at the workspace level; if you later create a new workspace, it will start out with the platform default encodings again.

To change the default encodings, go to Window -> Preferences and type "encoding" in the search box at the top left of the dialog. Eclipse will filter the preferences dialog down to the pages that are relevant to encodings; the workspace-wide setting is "Text file encoding" under General -> Workspace.

Isaac
  • I have Windows 8 - does it use Cp1252 by default (apologies - I originally posted it was Cp1250, but it is Cp1252)? – Andrew Martin Dec 09 '12 at 21:45
  • Any version of Windows (including 8), set to English-US (and possibly other types of English, such as English-Canada, and certain other languages) will end up defaulting to Cp1252. – Isaac Dec 09 '12 at 21:45
  • Thanks, that's just what I wanted to know. That's because it's a Microsoft encoding tool, isn't it? – Andrew Martin Dec 09 '12 at 21:47
  • Your question "why UTF-8 isn't the default encoding for text files" is a very good question. I edited my answer to elaborate on that. – Isaac Dec 09 '12 at 21:48
  • Thanks for the info Isaac, except I'm not in total agreement with your answer on UTF-8 defaults. I would imagine (although admittedly from a western-centric point of view) that the lion's share of plain text files are ASCII. Unless you were thinking of another format, this invalidates your response because UTF-8 was designed to be backwards compatible with ASCII. – David Woods Nov 15 '13 at 03:29