Maven encoding for sources and resources is controlled by the standard project.build.sourceEncoding property,
which should be present and set to UTF-8 as a good practice.
From the official documentation of the maven-resources-plugin:
"The best practice is to define encoding for copying filtered resources via the property ${project.build.sourceEncoding} which should be defined in the pom properties section."
This property is picked up as the default value of the encoding property of the maven-compiler-plugin
and of the encoding property of the maven-resources-plugin.
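As a minimal sketch, the property could be declared as follows in the properties section of the pom.xml (the rest of the project element is omitted here):

```xml
<properties>
    <!-- Picked up by default by the compiler and resources plugins -->
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
```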
To further enforce its presence, you could then use the maven-enforcer-plugin and its requireProperty rule
to require that the project.build.sourceEncoding property exists and has the value UTF-8.
That is, the build would fail if the property was not set or did not have this exact value.
Below is an example of such a configuration, to add to the build/plugins section of your pom.xml file:
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-enforcer-plugin</artifactId>
    <version>1.4.1</version>
    <executions>
        <execution>
            <id>enforce-property</id>
            <goals>
                <goal>enforce</goal>
            </goals>
            <configuration>
                <rules>
                    <requireProperty>
                        <property>project.build.sourceEncoding</property>
                        <message>Encoding must be set and at UTF-8!</message>
                        <regex>UTF-8</regex>
                        <regexMessage>Encoding must be set and at UTF-8</regexMessage>
                    </requireProperty>
                </rules>
                <fail>true</fail>
            </configuration>
        </execution>
    </executions>
</plugin>
Note, the same could be done for the project.reporting.outputEncoding
property.
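As a sketch of that variant, a second requireProperty rule could be placed inside the same rules element of the enforcer configuration. The UTF-8 value and the message texts are assumptions carried over from the source-encoding rule; adjust them to your project:

```xml
<rules>
    <!-- Additional rule: same check, applied to the reporting output encoding -->
    <requireProperty>
        <property>project.reporting.outputEncoding</property>
        <message>Reporting encoding must be set and at UTF-8!</message>
        <regex>UTF-8</regex>
        <regexMessage>Reporting encoding must be set and at UTF-8</regexMessage>
    </requireProperty>
</rules>
```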
Bonus: since we are on Stack Overflow, the CEO would probably be happy to see his old article back again: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets
Test
Given the following Java code:
package com.sample;

public class Main {

    public void 漢字() {
    }
}
and setting the following in Maven:
<properties>
    <project.build.sourceEncoding>US-ASCII</project.build.sourceEncoding>
</properties>
the build would actually fail, since US-ASCII is a 7-bit encoding that cannot represent these characters, which results in illegal character errors.
The same would not happen with UTF-8, a variable-width encoding built on 8-bit units that can represent every Unicode character.