EBCDIC, like ASCII or Latin-1, is a text encoding. You can try one of "Cp037", "Cp500", or "Cp1047".
As there is more than one EBCDIC variant, check Wikipedia or a similar reference to find the one that matches your data. Unfortunately, not every Charset is provided by Java SE. See Convert String from ASCII to EBCDIC in Java?
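Since the set of available charsets depends on the Java runtime, it may help to check which candidates can be loaded before trying them. The following is a minimal sketch; the class name EbcdicCharsets and the method supported are made up for illustration, and only Charset.isSupported and Charset.forName come from the JDK.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the JDK.
class EbcdicCharsets {
    /** Returns those of the given charset names that this JVM can actually load. */
    static List<Charset> supported(String... names) {
        List<Charset> result = new ArrayList<>();
        for (String name : names) {
            // isSupported returns false for unknown (but syntactically legal) names
            if (Charset.isSupported(name)) {
                result.add(Charset.forName(name));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Candidate names taken from the answer; which ones load depends on the runtime.
        for (Charset cs : supported("Cp037", "Cp500", "Cp1047")) {
            System.out.println(cs.name() + " aliases=" + cs.aliases());
        }
    }
}
```

If the list comes back empty, the runtime probably lacks the extended charsets, and a different JDK distribution or a third-party charset provider would be needed.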
Since Java 11 you can use Files.readString/Files.writeString; on older versions you need Files.readAllBytes instead.
Path ebcdicPath = Paths.get("...");
Path utf8Path = ebcdicPath.resolveSibling("utf8.txt");
Charset ebcdic = Charset.forName("Cp1047");
String content = Files.readString(ebcdicPath, ebcdic);
Files.writeString(utf8Path, content, StandardCharsets.UTF_8);
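For Java versions before 11, the same conversion could be sketched with Files.readAllBytes and the String(byte[], Charset) constructor. The class and method names (LegacyConvert, toUtf8) are hypothetical, and the whole file is read into memory, so this assumes the input is reasonably small.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for Java 8 to 10, where Files.readString is not available.
class LegacyConvert {
    /** Decodes the source file with the given charset and writes it back out as UTF-8. */
    static void toUtf8(Path source, Path target, Charset sourceCharset) throws IOException {
        byte[] raw = Files.readAllBytes(source);          // raw bytes, e.g. EBCDIC
        String text = new String(raw, sourceCharset);     // decode to a Java String
        Files.write(target, text.getBytes(StandardCharsets.UTF_8)); // re-encode as UTF-8
    }
}
```

A call would look like LegacyConvert.toUtf8(ebcdicPath, utf8Path, Charset.forName("Cp1047")), with the charset name adjusted to the variant actually in use.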
You might run into problems with line endings, because the EBCDIC-originated NEL (U+0085) is a legal line terminator in Unicode. Using Files.lines
would strip the line endings.
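If NEL characters do show up in the decoded text, one possible way to deal with them is to normalize every line break to "\n" before writing the UTF-8 file. This sketch relies on the regex linebreak matcher \R (Java 8 and later), which also matches U+0085; the class name LineEndings is hypothetical.

```java
// Hypothetical helper, not part of the JDK.
class LineEndings {
    /** Replaces every Unicode line break (CR LF, CR, LF, NEL, ...) with a single LF. */
    static String normalize(String text) {
        // The regex \R matches CR LF as one unit, so it does not turn into two line feeds.
        return text.replaceAll("\\R", "\n");
    }
}
```

Applied to the decoded content, this would be Files.writeString(utf8Path, LineEndings.normalize(content), StandardCharsets.UTF_8).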
Code for a hex dump of some bytes:
Path path = Paths.get("...");
byte[] content = Files.readAllBytes(path);
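// Dump at most the first 16 bytes; a shorter file would otherwise throw ArrayIndexOutOfBoundsException
for (int i = 0; i < Math.min(16, content.length); ++i) {
    System.out.printf(" %02x", content[i] & 0xFF);
}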
System.out.println();
byte[] c = {(byte)0xf0, (byte)0xf0, (byte)0xf0, (byte)0xf0, (byte)0xf0, (byte)0xf9, (byte)0xf7, (byte)0xf7,
(byte)0xf1, (byte)0xf2, (byte)0xf2, (byte)0xf0, (byte)0xf3, (byte)0xf2, (byte)0xf1, (byte)0xf0};
Charset ebcdic = Charset.forName("Cp1047");
System.out.println(new String(c, ebcdic));
Output: 0000097712203210