I'm trying to load properties from a text file, but the accented characters (saül) come in with an encoding other than UTF-8. How can I avoid it?

My property file has a property with an accented character (saül). However, when I remote debug I find that properties.load(bufferedReader); reads it as saül, so when I write it to another file it gets written as saül. I have UTF-8 encoding everywhere else in the application. I'm not sure what I'm doing wrong while reading the properties from the file.

try {
    final String propertyFilePath = System.getProperty(JVM_ARGUMENT_NAME);
    if (StringUtils.hasText(propertyFilePath)) {
        setLocalOverride(true);
        resource = getApplicationContext().getResource(propertyFilePath);
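        // Decode explicitly as UTF-8 (needs java.nio.charset.StandardCharsets);
        // try-with-resources closes the reader even if load() throws.
        try (BufferedReader bufferedReader = new BufferedReader(
                new InputStreamReader(new FileInputStream(propertyFilePath), StandardCharsets.UTF_8))) {
            properties.load(bufferedReader);
        }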
        externalFilePasswordConfigurer.afterPasswordPropertiesSet(properties);
        LOGGER.info("ExternalFilePropertyConfigurer UTF-8 Reader");
    }
    setProperties(properties);
    logProperties(properties);
} catch (Exception e) {
    LOGGER.error("ExternalFilePropertyConfigurer setter failed to set properties: ", e);
    throw new RuntimeException(e);
}
Runcorn
user2599381
  • possible duplicate of [How to use UTF-8 in resource properties with ResourceBundle](http://stackoverflow.com/questions/4659929/how-to-use-utf-8-in-resource-properties-with-resourcebundle) – ruediste Dec 08 '14 at 17:41
  • I don't think it's a duplicate of that (ResourceBundle vs Properties). – Brett Kail Dec 08 '14 at 17:44
  • That said, I'm not sure I understand the question. If the properties file is using a non-UTF-8 encoding, then use a different parameter to InputStreamReader? – Brett Kail Dec 08 '14 at 17:44
  • Agreed with @bkail. The character encoding parameter to InputStreamReader should be changed. – Andy Dufresne Dec 09 '14 at 06:01
  • Hi guys, thanks for the comments. My property file has a property with an accented character (saül). However, when I remote debug I find that **properties.load(bufferedReader);** reads it as saül, so when I write it to another file it gets written as saül. I have UTF-8 encoding everywhere else in the application. I'm not sure what I'm doing wrong while reading the properties from the file. – user2599381 Dec 09 '14 at 09:30
  • Are you sure the properties file is actually written in UTF-8 format? If you use xxd (or write your own program to dump the raw file bytes), does the file contain 0xc3 0xbc for the ü? – Brett Kail Dec 09 '14 at 14:33
  • I have no control over that file. It seems like it is not UTF-8. – user2599381 Dec 09 '14 at 15:58
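
Following the suggestion in the comments to inspect the raw bytes, here is a minimal Java sketch that dumps data as hex and checks for the pair 0xc3 0xbc. The class name `EncodingProbe`, its helper methods, and the in-memory fallback sample are made up for illustration and are not part of the original code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class EncodingProbe {

    /** Formats raw bytes as space-separated lowercase hex, e.g. "c3 bc". */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** True if the byte pair 0xC3 0xBC (the UTF-8 encoding of u-umlaut) occurs in the data. */
    static boolean containsUtf8UUmlaut(byte[] bytes) {
        for (int i = 0; i + 1 < bytes.length; i++) {
            if ((bytes[i] & 0xff) == 0xc3 && (bytes[i + 1] & 0xff) == 0xbc) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // With a path argument, inspect that file; otherwise use an in-memory sample (assumption).
        byte[] raw = args.length > 0
                ? Files.readAllBytes(Paths.get(args[0]))
                : "name=sa\u00fcl".getBytes(StandardCharsets.UTF_8);
        System.out.println(toHex(raw));
        System.out.println("contains UTF-8 u-umlaut: " + containsUtf8UUmlaut(raw));
    }
}
```

If the ü shows up as a single `fc` byte instead, the file is probably ISO-8859-1 (or windows-1252), and passing that charset to InputStreamReader instead of "UTF8" should decode it correctly.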

2 Answers

Old question, but as far as I know a .properties file is expected to be in the ISO-8859-1 charset (that is what load(InputStream) assumes), or there will be trouble.

When accented characters are required inside such a properties file, each one must be replaced with its Unicode escape. In this particular case "saül" must be changed to "sa\u00FCl", where \u00FC is "ü".
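
A small sketch of how an escaped value is read back, assuming the file content is plain ASCII. The class `UnicodeEscapeDemo` and the key `name` are hypothetical names used only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class UnicodeEscapeDemo {

    /** Loads properties from pure-ASCII text, as an escaped .properties file would contain. */
    static Properties loadAscii(String text) {
        Properties props = new Properties();
        try {
            // load(InputStream) decodes bytes as ISO 8859-1 and resolves Unicode escapes itself.
            props.load(new ByteArrayInputStream(text.getBytes(StandardCharsets.US_ASCII)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // The doubled backslash keeps the escape literal, so the "file" holds six ASCII characters.
        Properties props = loadAscii("name=sa\\u00FCl\n");
        System.out.println(props.getProperty("name"));
    }
}
```

The escape is resolved by Properties while parsing, so the in-memory value holds the real ü regardless of the file's byte encoding.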

Another solution is to change the file type from .properties to .xml, which is read with loadFromXML and declares its own encoding.
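
As a rough illustration of the XML route, the sketch below loads an XML properties document through `Properties.loadFromXML`. The class `XmlPropertiesDemo`, the key `name`, and the inline document are assumptions made for the example; in practice the XML would come from a file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class XmlPropertiesDemo {

    /** Parses an XML properties document whose bytes are encoded as UTF-8. */
    static Properties loadXml(String xml) {
        Properties props = new Properties();
        try {
            props.loadFromXML(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            // InvalidPropertiesFormatException is an IOException, so it lands here too.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Builds a minimal XML properties document with a single entry (illustrative helper). */
    static String sampleXml(String key, String value) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<!DOCTYPE properties SYSTEM \"http://java.sun.com/dtd/properties.dtd\">\n"
                + "<properties>\n"
                + "  <entry key=\"" + key + "\">" + value + "</entry>\n"
                + "</properties>\n";
    }

    public static void main(String[] args) {
        Properties props = loadXml(sampleXml("name", "sa\u00fcl"));
        System.out.println(props.getProperty("name"));
    }
}
```

Because the XML prolog names the encoding, the accented character can stay unescaped in the file.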

See the Java documentation:

The load(Reader) / store(Writer, String) methods load and store properties from and to a character based stream in a simple line-oriented format specified below. The load(InputStream) / store(OutputStream, String) methods work the same way as the load(Reader)/store(Writer, String) pair, except the input/output stream is encoded in ISO 8859-1 character encoding. Characters that cannot be directly represented in this encoding can be written using Unicode escapes as defined in section 3.3 of The Java™ Language Specification; only a single 'u' character is allowed in an escape sequence. The native2ascii tool can be used to convert property files to and from other character encodings.
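
To make the difference described in the documentation concrete, here is a hedged sketch that feeds the same UTF-8 bytes to both `load` overloads. The class `LoadCharsetDemo` and its helper names are invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class LoadCharsetDemo {

    /** Reads a key via load(InputStream), which always decodes bytes as ISO 8859-1. */
    static String viaInputStream(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    /** Reads a key via load(Reader), decoding the bytes with the charset the caller names. */
    static String viaReader(byte[] fileBytes, Charset charset, String key) {
        Properties props = new Properties();
        try {
            props.load(new InputStreamReader(new ByteArrayInputStream(fileBytes), charset));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        byte[] utf8File = "name=sa\u00fcl\n".getBytes(StandardCharsets.UTF_8);
        // Garbled: the two UTF-8 bytes of u-umlaut are read as two Latin-1 characters.
        System.out.println(viaInputStream(utf8File, "name"));
        // Intact: the Reader decodes the bytes with the matching charset.
        System.out.println(viaReader(utf8File, StandardCharsets.UTF_8, "name"));
    }
}
```

The Reader overload only helps when the charset passed in matches the bytes actually on disk; a mismatch in either direction still garbles the value.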

Wilfredo Pomier

I know that this question is old, but I had the same issue and didn't want to change the accented characters to their Unicode-escaped versions.

So I added the following plugin to my pom.xml, which does the conversion at build time:

<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>native2ascii-maven-plugin</artifactId>
    <version>2.0.1</version>
    <executions>
        <execution>
            <goals>
                <goal>resources</goal>
            </goals>
            <phase>process-resources</phase>
            <configuration>
                <srcDir>src/main/resources</srcDir>
                <targetDir>${project.build.outputDirectory}</targetDir>
                <encoding>${project.build.sourceEncoding}</encoding>
                <includes>
                    <include>message.properties</include>
                </includes>
            </configuration>
        </execution>
    </executions>
</plugin>

You can read more about the plugin here: https://github.com/mojohaus/native2ascii-maven-plugin
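
For readers who want to see what this kind of conversion amounts to, below is a minimal Java sketch that replaces every non-ASCII character with a Unicode escape. It is not the plugin's code; the class `AsciiEscaper` and its method are hypothetical, and details such as hex letter case may differ from what native2ascii emits.

```java
public class AsciiEscaper {

    /** Replaces each non-ASCII UTF-16 code unit with a backslash-u escape of four hex digits. */
    static String escapeNonAscii(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                sb.append(c);
            } else {
                // The doubled backslash emits a literal backslash followed by 'u'.
                sb.append(String.format("\\u%04X", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeNonAscii("name=sa\u00fcl"));
    }
}
```

The output is pure ASCII, so it reads back identically whether it is later decoded as ISO-8859-1 or UTF-8.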