I am building a web application in Node.js 12 using data from an old MySQL database. Several fields contain characters that display incorrectly because of an encoding issue in the old database. Similar questions already exist, but none of them has solved my issue. After some experimenting I am a little closer to a solution, but I still need help.
Current value in database to convert:
Rikuchi SokuryoÌ„bu [cartographer], 陸地測é‡éƒ¨
Desired new database value:
Rikuchi Sokuryōbu [cartographer], 陸地測量部
The issue is the same as described in this similar question. However, the accepted answer does not solve my issue. I need to write a Node.js script that converts the data in the database into a readable string.
I also tried following the answer in this similar question. I understand that the value needs to be first converted to binary and then to the desired encoding. However, it does not return the desired result. I tried this with the iconv and iconv-lite packages.
ATTEMPT 1:
const iconv = require('iconv');
const buf = Buffer.from(body, 'binary'); // Buffer.from replaces the deprecated new Buffer()
const conv = new iconv.Iconv('windows-1252', 'utf8');
const str = conv.convert(buf).toString();
console.log(`original: ${body} output: ${str}`);
// original: Rikuchi SokuryoÌ„bu [cartographer], 陸地測é‡éƒ¨
// output: Rikuchi SokuryoМbu [cartographer], й"ёеS°жё¬й!Џй’Ё
ATTEMPT 2: iconv-lite
const iconvlite = require('iconv-lite');
const buf = Buffer.from(body, 'binary'); // Buffer.from replaces the deprecated new Buffer()
const str = iconvlite.decode(buf, 'windows-1252');
console.log(`original: ${body} output: ${str}`);
// original: Rikuchi SokuryoÌ„bu [cartographer], 陸地測é‡éƒ¨
// output: Rikuchi SokuryoМbu [cartographer], й"ёеS°жё¬й!Џй’Ё
ATTEMPT 3: iconv-lite
// This one *almost* works, but a few characters still come out as undefined
const iconvlite = require('iconv-lite');
const buf = Buffer.from(body, 'utf-8'); // Buffer.from replaces the deprecated new Buffer()
const win = iconvlite.encode(buf, 'windows-1252');
console.log(`original: ${body} output: ${win.toString()}`);
// original: Rikuchi SokuryoÌ„bu [cartographer], 陸地測é‡éƒ¨
// output: Rikuchi Sokuryōbu [cartographer], 陸地測�?部
UPDATE:
This website string-functions.com can encode and decode strings.
The entire problematic string is decoded correctly with the settings: "Encode with: Windows-1252" and "Decode with: utf-8"
It also works perfectly for larger examples of this problem. I just need to replicate exactly how this site is doing the conversion. My code in attempt #3 is very close, but there must be a step missing.
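A possible explanation for the remaining gap, as a minimal sketch rather than a confirmed fix: Windows-1252 leaves bytes 0x81, 0x8D, 0x8F, 0x90 and 0x9D undefined, and the UTF-8 encoding of 量 is E9 87 8F, which ends in one of them. Assuming the stored value holds those bytes as the control characters U+0081, U+008D, U+008F, U+0090 and U+009D (an assumption about this data, not verified against the database), the encode step has to pass them through as raw bytes instead of replacing them with '?'. The helper name fixMojibake and the hand-written table are hypothetical and use no library; whether string-functions.com works this way is unknown.

```javascript
// Code points that Windows-1252 assigns to bytes 0x80-0x9F.
// null marks the five bytes the code page leaves undefined.
const CP1252_HIGH = [
  0x20ac, null,   0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021,
  0x02c6, 0x2030, 0x0160, 0x2039, 0x0152, null,   0x017d, null,
  null,   0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
  0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, null,   0x017e, 0x0178,
];

// Reverse lookup: Unicode code point -> Windows-1252 byte.
const UNICODE_TO_BYTE = new Map();
CP1252_HIGH.forEach((codePoint, i) => {
  if (codePoint !== null) UNICODE_TO_BYTE.set(codePoint, 0x80 + i);
});

// Hypothetical helper: re-encode the garbled text as Windows-1252 bytes,
// then decode those bytes as UTF-8.
function fixMojibake(garbled) {
  const bytes = [];
  for (const ch of garbled) {
    const codePoint = ch.codePointAt(0);
    if (UNICODE_TO_BYTE.has(codePoint)) {
      bytes.push(UNICODE_TO_BYTE.get(codePoint));
    } else if (codePoint <= 0xff) {
      // ASCII and Latin-1 map to the same byte value. Assumption: this also
      // covers control characters standing in for the undefined bytes.
      bytes.push(codePoint);
    } else {
      // Not representable in Windows-1252, so this is not the expected
      // kind of damage; return the input unchanged.
      return garbled;
    }
  }
  // Note: invalid UTF-8 sequences decode to U+FFFD rather than throwing.
  return Buffer.from(bytes).toString('utf8');
}

// "量" is E9 87 8F in UTF-8, which was misread as U+00E9, U+2021, U+008F.
console.log(fixMojibake('\u00e9\u2021\u008f')); // 量
```

With a helper like this, fixMojibake(body) would be called on each affected field before the value is written back. It is worth checking a few rows by hand first, since already-correct text that happens to contain only Latin-1 characters would also be re-decoded.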