After a few tests and some searching, I found a solution.
To change the charset encoding of a file I need to read and write the file, applying the new target charset. But to build something generic that can accept any source charset, I first need to identify that charset.
To achieve that, I added a dependency on juniversalchardet, which provides UniversalDetector:
<dependency>
    <groupId>com.github.albfernandez</groupId>
    <artifactId>juniversalchardet</artifactId>
    <version>2.3.1</version>
</dependency>
Using it I could do this:
String encoding = UniversalDetector.detectCharset(file.getInputStream());
if (encoding == null) {
    // throw an exception: the charset could not be detected
}
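If adding a library is not an option, a much cruder JDK-only heuristic (my own sketch, not part of the original solution, and far less capable than UniversalDetector) is to attempt a strict UTF-8 decode and fall back to ISO-8859-1, since every byte sequence is valid Latin-1:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class DetectSketch {
    // Returns "UTF-8" if the bytes decode strictly as UTF-8,
    // otherwise falls back to "ISO-8859-1".
    static String guessCharset(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return "UTF-8";
        } catch (CharacterCodingException e) {
            return "ISO-8859-1";
        }
    }
}
```

This only distinguishes the two charsets I care about here; UniversalDetector covers many more.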
And the method that transforms the file:
private static void encodeFileInLatinAlphabet(InputStream source, String fromEncoding, File target) throws IOException {
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(source, fromEncoding));
         BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
                 new FileOutputStream(target), StandardCharsets.ISO_8859_1))) {
        char[] buffer = new char[16384];
        int read;
        while ((read = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, read);
        }
    }
}
This way I can accept a file in any charset and encode it in the desired one.
Note: in my case I always need the file in ISO_8859_1, which is why it is hard-coded in the method, but you could receive the target charset as a parameter.
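As the note says, the target charset can easily be made a parameter. A minimal JDK-only sketch of that generalized version (the name `transcode` and the round-trip demo are my own, not from the original answer):

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Transcode {
    // Reads characters from source decoded with fromEncoding and
    // writes them to target encoded with toEncoding.
    static void transcode(InputStream source, Charset fromEncoding,
                          File target, Charset toEncoding) throws IOException {
        try (Reader reader = new BufferedReader(new InputStreamReader(source, fromEncoding));
             Writer writer = new BufferedWriter(new OutputStreamWriter(
                     new FileOutputStream(target), toEncoding))) {
            char[] buffer = new char[16384];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, read);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Round-trip demo: "café" is 5 bytes in UTF-8 but 4 in ISO-8859-1.
        Path in = Files.createTempFile("in", ".txt");
        Path out = Files.createTempFile("out", ".txt");
        Files.write(in, "café".getBytes(StandardCharsets.UTF_8));
        transcode(Files.newInputStream(in), StandardCharsets.UTF_8,
                  out.toFile(), StandardCharsets.ISO_8859_1);
        System.out.println(Files.readAllBytes(out).length); // prints 4
    }
}
```

Taking `Charset` objects rather than `String` names also pushes the "unknown charset" failure to the caller, where `Charset.forName` throws before any I/O starts.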