505

When I run `mvn install` on my multi-module Maven project, I always get the following warning:

[WARNING] File encoding has not been set, using platform encoding UTF-8, i.e. build is platform dependent!

So, I googled around a bit, but all I can find is that I have to add:

<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>

...to my pom.xml. But it's already there (in the parent pom.xml).

Configuring <encoding> for the maven-resources-plugin or the maven-compiler-plugin also doesn't fix it.

So what's the problem?

isapir
Ethan Leroy
  • 1
    Be careful that UTF-8 encoding is what you actually want to specify as the encoding. You may be better off using a simpler encoding such as ISO-8859-1 (aka Latin-1) or even US-ASCII. – rmp Jan 14 '13 at 18:23
  • 74
    "You may be better off using a simpler encoding such as..." yeah, and bug end-users, as well as other developers... Nowadays it's best to try to use UTF-8 as much as possible and care about other encodings only when a multi-encoding application requirement is thrown to you. Here, we're talking mostly about the encoding of source and configuration files, the encoding of user input is managed differently (with 'java -Dfile.encoding ...' and with a lot of painful programming effort). – zakmck Aug 23 '13 at 09:32
  • I personally decided that the encoding issues were so elusive that I went for encoding ASCII in pom.xml and then took the encoding issues up front. This is naturally prompted by having a non-ASCII character in my name giving issues from day 1:) – Thorbjørn Ravn Andersen May 09 '14 at 15:49
  • What encoding is set in parent pom.xml ? – Ripon Al Wasim May 15 '15 at 12:01

7 Answers

740

OK, I have found the problem.

I use some reporting plugins. In the documentation of the failsafe-maven-plugin I found that the <encoding> configuration (of course) defaults to ${project.reporting.outputEncoding}.

So I added that property to the <properties> element (itself a child of the <project> element), and everything is fine now:

<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
</properties>

See also http://maven.apache.org/general.html#encoding-warning
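
One thing worth checking in a multi-module build: properties are inherited only by modules that name the parent in a <parent> element. Being listed under <modules> in an aggregator POM does not by itself pass properties down. The following is a minimal sketch of such a child module; the coordinates (com.example, my-parent, my-module) are hypothetical placeholders, not taken from the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">

    <modelVersion>4.0.0</modelVersion>

    <!-- Hypothetical coordinates: replace with those of your parent POM -->
    <parent>
        <groupId>com.example</groupId>
        <artifactId>my-parent</artifactId>
        <version>1.0.0-SNAPSHOT</version>
        <!-- ../pom.xml is the default; shown here only for clarity -->
        <relativePath>../pom.xml</relativePath>
    </parent>

    <!-- No encoding properties here: they come from the parent -->
    <artifactId>my-module</artifactId>
</project>
```

Running `mvn help:effective-pom` inside the module prints the merged POM, which should show whether both encoding properties actually reached it.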

Naman
Ethan Leroy
  • 1
    So I had this issue and I added the properties from above like this: true local https://earneventapi.intra1.e1.v2.epaas.aexp.com UTF-8 UTF-8 – Bob Small Mar 02 '17 at 16:20
  • No, the only global setting of coding is to be done by env. variable: https://stackoverflow.com/a/9976788/715269 – Gangnus Mar 09 '20 at 15:35
  • 1
    This works as expected while adding the 2 properties to the properties block of the pom.xml file. Thanks. – jpruiz114 Apr 05 '20 at 19:59
  • _SET MAVEN_OPTS=-Dfile.encoding=utf-8_ or unix like _export MAVEN_OPTS=-Dfile.encoding=utf-8_ is the only correct answer ... ;-) – udoline Apr 16 '21 at 09:26
64

This is an addition to the previous answer, in case someone has a problem with Scandinavian letters that the solution above does not solve.

If the Java source files contain Scandinavian letters (for example, in constants), the Java used for compiling must interpret them correctly.

Even though the files are stored in UTF-8 and Maven is configured to use UTF-8, the system Java used by Maven will still use the system default (e.g. cp1252 on Windows).

This becomes visible only when running the tests via Maven (for example, when tests print the values of these constants; the Scandinavian letters then show up as '< ?>'). If not tested properly, this corrupts the compiled class files and can go unnoticed.

To prevent this, you have to make the Java used for compiling use UTF-8 encoding. It is not enough to have the encoding settings in the Maven pom.xml; you also need to set the environment variable: JAVA_TOOL_OPTIONS = -Dfile.encoding=UTF8

Also, if you use Eclipse on Windows, you may additionally need to set the encoding there (if you run individual tests via Eclipse).
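
If the symptom shows up only when tests run under Maven, a POM-side alternative to the environment variable may be to pass the flag to the JVM that Surefire forks for tests, using the plugin's argLine parameter. This is a sketch under assumptions: it assumes tests run in a forked JVM (Surefire's default), it affects test execution only and not compilation, and the version shown is just an example.

```xml
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-surefire-plugin</artifactId>
            <!-- Example version only: use whatever your build already pins -->
            <version>3.2.5</version>
            <configuration>
                <!-- Passed to the JVM forked for tests; does not affect javac -->
                <argLine>-Dfile.encoding=UTF-8</argLine>
            </configuration>
        </plugin>
    </plugins>
</build>
```

If another plugin (a coverage agent, for instance) also sets argLine, the two values need to be combined rather than overwritten.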

Ville Myrskyneva
  • Not sure if there's a maven way to do this, since this is a JVM setting, not Maven. – Ville Myrskyneva Apr 02 '15 at 04:45
  • 4
    I think you are mixing things up. You only need to set `-Dfile.encoding` if you use I/O in Java without explicitly specifying an encoding (which is not recommended). I don't see what this has to do with scandic letters in Java source files. Non-ASCII in Java source files works with Maven when `project.build.sourceEncoding` is set correctly, as described in Ethan Leroy's answer. – sleske Jul 07 '15 at 12:12
  • @sleske I would assume the same would be enough, but when I first ended here and did the pom.xml changes, it did not fix my problem. After more search and after trial and error the solution described worked. I think that the reason for what happens is because the maven calls the javac of the installed/referred JDK which in turn uses the O/S encoding as default. If someone knows a way to specify the encoding for the javac call in pom.xml would solve this issue in "maven way". – Ville Myrskyneva Oct 01 '15 at 12:35
  • 5
    @VilleMyrskyneva: When Maven invokes `javac`, it will pass along the encoding set by `project.build.sourceEncoding` (you can check using `mvn -X`), so I don't see how what you describe is necessary. If you still get encoding problems in your project, consider asking that as a separate question - it seems you are running into a different problem. Ideally, post a reproducible test case. – sleske Oct 01 '15 at 12:49
  • 1
    @sleske I have project.build.sourceEncoding in pom.xml, but mvn test still has an encoding problem, while -Dfile.encoding=UTF8 solves it. I don't understand why. http://stackoverflow.com/questions/42990644/maven-test-fails-at-a-eastern-language-character-while-idea-success – Tiina Mar 24 '17 at 03:45
  • IDEA on Windows does not need this setting when running tests. No idea why. – Tiina Mar 24 '17 at 03:49
58

Combining the answers above, a pom.xml configured for UTF-8 should look like this:

pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">

    <modelVersion>4.0.0</modelVersion>

    <groupId>YOUR_COMPANY</groupId>
    <artifactId>YOUR_APP</artifactId>
    <version>1.0.0-SNAPSHOT</version>

    <properties>
        <project.java.version>1.8</project.java.version>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
    </properties>

    <dependencies>
        <!-- Your dependencies -->
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.7.0</version>
                <configuration>
                    <source>${project.java.version}</source>
                    <target>${project.java.version}</target>
                    <encoding>${project.build.sourceEncoding}</encoding>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-resources-plugin</artifactId>
                <version>3.0.2</version>
                <configuration>
                    <encoding>${project.build.sourceEncoding}</encoding>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
bhdrk
  • 1
    the default seems to be ${project.build.sourceEncoding}, so you shouldn't need to define it explicitly for the maven-resources-plugin (see https://maven.apache.org/plugins/maven-resources-plugin/examples/encoding.html, https://maven.apache.org/plugins/maven-resources-plugin/resources-mojo.html#encoding, https://maven.apache.org/general.html#encoding-warning) – George Birbilis May 29 '18 at 23:40
  • No, the only global setting of coding is to be done by env. variable: https://stackoverflow.com/a/9976788/715269 – Gangnus Mar 09 '20 at 15:35
9

It seems people confuse content encoding with the encoding of built files/resources. Having only the Maven properties is not enough, and relying on -Dfile.encoding=UTF8 is not effective either. To avoid encoding issues, follow these simple rules:

  1. Set the Maven encoding, as described above:
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
  2. Always set the encoding explicitly when working with files, strings, and I/O in your code. If you do not follow this rule, your application depends on the environment. -Dfile.encoding=UTF8 configures exactly that run-time environment, but we should not depend on it. If you have thousands of clients, configuring every system and tracking down the issues it causes takes far more effort. It is just an additional dependency that you can avoid by setting the encoding explicitly. Several Java methods that use a default encoding (for example, URLEncoder.encode(String)) are marked as deprecated because of it.

  3. Make sure the content you are working with is actually in the encoding you expect. If it is not, the previous steps do not matter! For instance, a file will not be processed correctly if you expect UTF-8 but it is stored in another encoding. To check a file's encoding on Linux:

$ file --mime F_PRDAUFT.dsv

  4. Make clients and servers set the encoding explicitly in requests and responses, for example:
@Produces("application/json; charset=UTF-8")
@Consumes("application/json; charset=UTF-8")

Hope this will be useful to someone.

Alexandr
  • No, the only global setting of coding is to be done by env. variable: https://stackoverflow.com/a/9976788/715269 – Gangnus Mar 09 '20 at 15:34
8

Try this:

<project>
  ...
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-resources-plugin</artifactId>
        <version>2.7</version>
        <configuration>
          ...
          <encoding>UTF-8</encoding>
          ...
        </configuration>
      </plugin>
    </plugins>
    ...
  </build>
  ...
</project>
fsimon
  • Particularly important, we shouldn't forget that not only the sources, but also the resources need this encoding setting. – peterh Mar 23 '17 at 15:37
1

In my case I was using the maven-dependency-plugin so in order to resolve the issue I had to add the following property:

  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>

See Apache Maven Resources Plugin / Specifying a character encoding scheme

isapir
0

(As of 2023, though it has actually always been this way)

If you use Spring Boot, you do not need to do anything.

Its parent POM already sets these properties:

<properties>
    ...
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
    ...
</properties>

In the general case, the two lines above are enough.
You should not add encoding settings in any other places or plugins unless you know what you are doing.

If you see advice to do more, it is most likely outdated.
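
For illustration, inheriting those properties only requires declaring the Spring Boot parent in your pom.xml. The fragment below is a sketch; the version is an example only and should match whatever release your project targets.

```xml
<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <!-- Example version only: use the release your project targets -->
    <version>3.2.0</version>
    <!-- Empty relativePath: resolve the parent from the repository -->
    <relativePath/>
</parent>
```

A POM with this parent needs no encoding entries of its own; any value you do declare locally overrides the inherited one.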

Paul Verest