A legacy application I'm rewriting in Java uses a custom encoding (similar to Windows-1252) for its data storage. For the new system I'm building, I'd like to replace this with UTF-8.
So I need to convert those files to UTF-8 to feed my database. I know the character map used, but it's not any of the widely known ones. E.g. "A" is at position 0x0041 (as in Windows-1252), but at 0x0042 there is a character whose Unicode code point is U+0102, and so on. Is there an easy way to decode and convert those files with Java?
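To make the mapping concrete, here is a minimal sketch of the naive table lookup I have in mind. It assumes the legacy encoding is strictly one byte per character. The class name LegacyTableDecoder and the identity (Latin-1) filler for the table are placeholders of mine; only the 0x42 to U+0102 entry comes from the real character map.

```java
import java.nio.charset.StandardCharsets;

public class LegacyTableDecoder {
    // Hypothetical 256-entry table: index = legacy byte value, value = Unicode character.
    static final char[] TABLE = buildTable();

    static char[] buildTable() {
        char[] t = new char[256];
        // Placeholder filler (identity mapping); the real table would be
        // filled in from the known character map.
        for (int i = 0; i < 256; i++) {
            t[i] = (char) i;
        }
        t[0x42] = '\u0102'; // the one override taken from my example
        return t;
    }

    // Decode legacy bytes into a Java String, one table lookup per byte.
    static String decode(byte[] legacy) {
        StringBuilder sb = new StringBuilder(legacy.length);
        for (byte b : legacy) {
            sb.append(TABLE[b & 0xFF]); // mask so bytes >= 0x80 index correctly
        }
        return sb.toString();
    }

    // Re-encode the decoded text as UTF-8 bytes.
    static byte[] toUtf8(byte[] legacy) {
        return decode(legacy).getBytes(StandardCharsets.UTF_8);
    }
}
```

This works on whole byte arrays, which is probably fine for small files but does not plug into the standard Reader classes.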
I've read many posts already, but they all dealt with industry-standard encodings of some kind, not with custom ones. I expect it's possible to create a custom java.nio.charset.CharsetDecoder or java.nio.charset.Charset and pass it to java.io.InputStreamReader, as described in the first answer here?
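Roughly, this is the kind of thing I imagine: a decode-only sketch, assuming a single-byte encoding. The class name LegacyCharset, the charset name "X-Legacy-Custom", the readAll helper and the identity filler in the table are all made up for illustration; only the 0x42 to U+0102 entry is from my actual map.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

public class LegacyCharset extends Charset {
    // Hypothetical lookup table: index = legacy byte, value = Unicode character.
    private static final char[] TABLE = new char[256];
    static {
        for (int i = 0; i < 256; i++) {
            TABLE[i] = (char) i;   // placeholder filler, not the real map
        }
        TABLE[0x42] = '\u0102';    // the one entry known from my example
    }

    public LegacyCharset() {
        super("X-Legacy-Custom", null); // made-up canonical name, no aliases
    }

    @Override
    public boolean contains(Charset cs) {
        return false;
    }

    @Override
    public CharsetDecoder newDecoder() {
        // One byte always yields exactly one char, hence 1.0f / 1.0f.
        return new CharsetDecoder(this, 1.0f, 1.0f) {
            @Override
            protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
                while (in.hasRemaining()) {
                    if (!out.hasRemaining()) {
                        return CoderResult.OVERFLOW;
                    }
                    out.put(TABLE[in.get() & 0xFF]);
                }
                return CoderResult.UNDERFLOW;
            }
        };
    }

    @Override
    public boolean canEncode() {
        return false; // decoding only in this sketch
    }

    @Override
    public CharsetEncoder newEncoder() {
        throw new UnsupportedOperationException("decode-only sketch");
    }

    // Hypothetical helper: read a whole stream through InputStreamReader.
    static String readAll(InputStream in) {
        try (Reader r = new InputStreamReader(in, new LegacyCharset())) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The resulting text could then be written back out with an OutputStreamWriter using StandardCharsets.UTF_8. I have not registered the charset through a CharsetProvider here, so Charset.forName would not find it; the instance is passed around directly.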
Any suggestions welcome.