
I'm using the Firebase Realtime Database for my app, with daily backups enabled. The database contains words with accented characters, such as "Manutenção".

  • If I check this text in the Firebase console it is shown as "Manutenção".
  • If I export the data from the Firebase console it is shown as "Manutenção".
  • But if I download the backup file (.gzip) and extract it, the text is shown as "Manuten√ß√£o". Notice how the accents are encoded: this is what the UTF-8 bytes look like when decoded as Mac OS Roman, according to https://string-functions.com/encodingtable.aspx?encoding=65001&decoding=10000
  1. Why does the .gzip backup file encode the accents?
  2. How can I decode these encoded accents programmatically? I tried the node module iconv but was not able to convert the text:
var Iconv  = require('iconv').Iconv;

var iconv = new Iconv('macintosh', 'UTF-8');
var buffer = iconv.convert('Manutenção');
console.log(buffer.toString()); // Manutenção
  3. How can I get back "Manutenção" from "Manuten√ß√£o"?

Thanks!

Deepak Goyal

1 Answer


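Based on related threads, this appears to be a macOS issue: the backup data is apparently UTF-8, but its bytes end up being interpreted as Mac OS Roman (MacRoman), which turns "ç" into "√ß" and "ã" into "√£".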
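
The garbling can be reproduced directly. The sketch below takes the UTF-8 bytes of "Manutenção" and decodes them as Mac OS Roman. It assumes a Node.js build with full ICU, where TextDecoder accepts the `macintosh` label; the helper name `decodeUtf8AsMacRoman` is made up for illustration.

```javascript
// Hypothetical helper: shows how UTF-8 text turns into mojibake when
// its bytes are read as Mac OS Roman. Assumes Node.js with full ICU.
function decodeUtf8AsMacRoman(text) {
  const utf8Bytes = new TextEncoder().encode(text); // "ç" becomes 0xC3 0xA7
  return new TextDecoder('macintosh').decode(utf8Bytes);
}

console.log(decodeUtf8AsMacRoman('Manutenção')); // Manuten√ß√£o
```

In MacRoman, byte 0xC3 is "√", 0xA7 is "ß" and 0xA3 is "£", which is why every accented letter shows up as a two-character pair starting with "√".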

Solution

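  const iconv = require('iconv-lite');

  // Repairs UTF-8 text that was mis-decoded as MacRoman (e.g. "Manuten√ß√£o").
  function fixMacRomanEncoding(data) {
    // '¬' and '√' are how the UTF-8 lead bytes 0xC2 and 0xC3 appear in MacRoman
    const isMacRomanEncoded = data.indexOf('¬') > -1 || data.indexOf('√') > -1;
    if (isMacRomanEncoded) {
      // MacRoman encoded: recover the original bytes, then decode them as UTF-8
      const buffer = iconv.encode(data, 'MacRoman');
      return iconv.decode(buffer, 'utf-8');
    }
    // Not MacRoman encoded, return the original
    return data;
  }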
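
If adding a dependency is not an option, the same conversion can be sketched with Node's built-in TextDecoder. This is only an illustration under assumptions: it requires a Node.js build with full ICU (so that the `macintosh` label is available), and the names `charToByte` and `repairMojibake` are hypothetical.

```javascript
// Hypothetical dependency-free variant of the conversion above.
// Assumes Node.js with full ICU so that TextDecoder knows 'macintosh'.
const macRoman = new TextDecoder('macintosh');

// Build a reverse table: MacRoman character -> byte value (0..255).
const charToByte = new Map();
for (let byte = 0; byte < 256; byte++) {
  charToByte.set(macRoman.decode(Uint8Array.of(byte)), byte);
}

function repairMojibake(garbled) {
  const bytes = [];
  for (const ch of garbled) {
    const byte = charToByte.get(ch);
    if (byte === undefined) return garbled; // character not in MacRoman, leave text untouched
    bytes.push(byte);
  }
  try {
    // fatal: true makes invalid UTF-8 throw instead of producing U+FFFD
    return new TextDecoder('utf-8', { fatal: true }).decode(Uint8Array.from(bytes));
  } catch {
    return garbled; // bytes are not valid UTF-8, so the text was not garbled this way
  }
}

console.log(repairMojibake('Manuten√ß√£o')); // Manutenção
console.log(repairMojibake('Manutenção'));   // Manutenção (unchanged)
```

Instead of looking for "¬" or "√", this sketch only accepts the result when the recovered bytes form valid UTF-8, so text that legitimately contains those characters is more likely to be left alone.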
Deepak Goyal