
I am trying to write a file using TextEncoder and TextDecoder. I also need to add 65 to each character's ASCII code, except for line breaks, which must not be shifted. I have adapted the solution proposed here to read a file with the File API. However, I am facing some problems with encoding.

// writer comes from fileEntry.createWriter

var result='0'+String.fromCharCode(124)+'1234'+String.fromCharCode(10); // 0|1234

var asciiArray=[];
var stringArray=[];
var fileContent='';
var tpmBuffer;
var uint8array=new TextEncoder().encode(result); // returns a Uint8Array containing the text given in parameters encoded
uint8array=uint8array.map((byte)=>byte+65); // shift :)

for(var i=0;i<uint8array.length;i++) {
    if(uint8array[i]!==75) {
       asciiArray.push(uint8array[i]);
    } else {
        // I can't shift the line break!
        asciiArray.push(10);

        tpmBuffer= new TextEncoder().encode(String.fromCharCode.apply(null,asciiArray));
        stringArray.push(new TextDecoder("utf-8").decode(tpmBuffer));
        console.log(stringArray); // ["q½rstu\n"]
        asciiArray=[];
    }
}

var encodedBlob= new Blob(stringArray, {
    encoding:'UTF-8',
    type: 'text/plain;charset=UTF-8'
});

// writer.write(encodedBlob);

When I try to read the content generated, I get the following:

    // Now we read the generated file content with:
    // fileContent = "q½rstu\n"
      var buf= new Uint8Array(fileContent);
      buf=buf.map((byte)=>byte-65);
      var fileAsString= new TextDecoder("ascii").decode(buf);

/*  
The output below is given by console.log(fileAsString[i], fileAsString.charCodeAt(i));

0 48
 129 -> Why does this one appear?
| 124
1 49
2 50
3 51
4 52
*/

Why does this 129 element appear when I read fileContent, if it does not appear when I build the string?

angellica.araujo

1 Answer


That 129 element comes from buf.map((byte)=>byte-65).
If I understand this notation correctly, it subtracts 65 from each byte in buf.

It could work well for // fileContent = "qrstu\n", but it won't work as expected if fileContent contains non-ASCII characters (code points that need more than 7 bits), e.g. for // fileContent = "q½rstu\n", because ½ (Vulgar Fraction One Half, code point U+00BD) is UTF-8 encoded as the byte sequence 0xC2, 0xBD.

Elementary arithmetic in cmd, set /a 0xc2 - 65, gives the result 129.
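The same arithmetic can be reproduced in JavaScript. This is only a small sketch to illustrate the point; the variable names are made up for the example and do not come from the question's code.

```javascript
// Sketch: why one shifted character becomes two bytes, and where 129 comes from.
const encoder = new TextEncoder();

// '|' has code 124; shifted by 65 it becomes 189, which is U+00BD (½).
const shifted = String.fromCharCode(124 + 65);

// UTF-8 needs two bytes to represent U+00BD.
const utf8Bytes = Array.from(encoder.encode(shifted));

// Subtracting 65 from each byte separately does not undo the shift.
const unshifted = utf8Bytes.map((byte) => byte - 65);

console.log(shifted);   // ½
console.log(utf8Bytes); // [ 194, 189 ]
console.log(unshifted); // [ 129, 124 ]
```

The second value, 124, is the expected '|', and the extra 129 is what is left of the UTF-8 lead byte 0xC2.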

BTW, buf.map((byte)=>byte-65) will also misbehave if a byte in fileContent is less than 65 (the line break, 10, for instance). The elements of a Uint8Array are unsigned, so no error is raised, but the result wraps around modulo 256 instead of going negative.
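One possible way around both problems is to keep the shifted data as bytes from start to finish, instead of turning the shifted bytes back into a string before writing. The following is a minimal sketch under the assumption that the input is pure ASCII; shiftBytes, unshiftBytes, SHIFT and LINE_BREAK are hypothetical names introduced for this example, not part of the original code.

```javascript
// Hypothetical helpers for illustration; assumes ASCII-only input text.
const SHIFT = 65;
const LINE_BREAK = 10;

// Encode text to bytes and add SHIFT to every byte except line breaks.
function shiftBytes(text) {
  return new TextEncoder()
    .encode(text)
    .map((byte) => (byte === LINE_BREAK ? byte : byte + SHIFT));
}

// Undo the shift on the raw bytes, then decode the result as text.
function unshiftBytes(bytes) {
  const restored = bytes.map((byte) =>
    byte === LINE_BREAK ? byte : byte - SHIFT
  );
  return new TextDecoder().decode(restored);
}

// Unsigned bytes wrap instead of throwing: 10 - 65 is stored as 201.
const wrapped = Array.from(new Uint8Array([10]).map((byte) => byte - SHIFT));

const stored = shiftBytes("0|1234\n");
const roundTrip = unshiftBytes(stored);

console.log(wrapped);                   // [ 201 ]
console.log(Array.from(stored));        // [ 113, 189, 114, 115, 116, 117, 10 ]
console.log(JSON.stringify(roundTrip)); // "0|1234\n"
```

With this approach the Uint8Array itself would be handed to the Blob constructor, e.g. new Blob([stored]), and the file would be read back as an ArrayBuffer rather than as text, so the value 189 is never re-encoded as the two-byte UTF-8 sequence.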

JosefZ