I am converting a byte[] into a String. Every time I convert the byte array to a string, the result has an extra prefix character in front of it. I have tried different input characters, uppercase letters, and so on, and the prefix is still there.

When I write the byte array to standard output, the character is still there.

System.out.write(theByteArray);

System.out.println(new String(theByteArray, "UTF-8"));

When I write the bytes to a file, the byte array seems to be written flawlessly, but when I then scan the file I end up with the weird prefix symbol again.

Text to be encrypted >

"aaaa"

Text when decrypted and converted to a string >

"aaaa"

The character seems to disappear in the text above, so here is an image of it.

[Image of characters]

I want to compare the resulting string to another string, much like decrypting a password and comparing it to a database entry. If they match, access is granted.

Here is the code that generates this byte array.

Keep in mind that the byte array I am looking at is decData, and this is NOT my code.

        byte[] encData;
        byte[] decData;
        File inFile = new File(fileName+ ".encrypted");

        //Generate the cipher using pass:
        Cipher cipher = FileEncryptor.makeCipher(pass, false);

        //Read in the file:
        FileInputStream inStream = new FileInputStream(inFile);

        encData = new byte[(int)inFile.length()];
        inStream.read(encData);
        inStream.close();
        //Decrypt the file data:
        decData = cipher.doFinal(encData);
        //Figure out how much padding to remove

        int padCount = (int)decData[decData.length - 1];

        //Naive check, will fail if plaintext file actually contained
        //this at the end
        //For robust check, check that padCount bytes at the end have same value
        if( padCount >= 1 && padCount <= 8 ) {
            decData = Arrays.copyOfRange( decData , 0, decData.length - padCount);
        }
        FileOutputStream target = new FileOutputStream(new File(fileName + ".decrypted.txt"));
        target.write(decData);
        target.close();

1 Answer

It looks like the decrypted data (decData) starts with a byte order mark (BOM). When Java decodes bytes that begin with a UTF-8 BOM, it does not strip the BOM; it decodes it as an ordinary character (U+FEFF), which shows up as the "prefix". You can try the solution suggested here: Reading UTF-8 - BOM marker.
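
As a rough sketch of that idea, assuming the prefix really is the UTF-8 BOM (the three bytes EF BB BF), you could drop those bytes before building the string. The class and method names here (BomStripper, stripUtf8Bom, decodeUtf8) are made up for illustration and are not part of the code in the question:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BomStripper {
    // Returns the input without a leading UTF-8 BOM (EF BB BF), if one is present.
    // Assumption: the unwanted prefix is a UTF-8 BOM and nothing else.
    static byte[] stripUtf8Bom(byte[] data) {
        if (data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF) {
            return Arrays.copyOfRange(data, 3, data.length);
        }
        return data;
    }

    // Decodes the bytes as UTF-8 after removing a leading BOM.
    static String decodeUtf8(byte[] data) {
        return new String(stripUtf8Bom(data), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for decData: a BOM followed by "aaaa".
        byte[] decData = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'a', 'a', 'a', 'a'};

        // Decoding directly keeps the BOM as the character U+FEFF.
        String raw = new String(decData, StandardCharsets.UTF_8);
        System.out.println(raw.equals("aaaa"));                 // false

        // Decoding after stripping the BOM gives the expected text.
        System.out.println(decodeUtf8(decData).equals("aaaa")); // true
    }
}
```

With something like this in place, the comparison against the stored string would be done on the result of decodeUtf8(decData) rather than on the raw decoded string.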

On the other hand, a byte order mark is optional and not recommended for UTF-8 encoding. So there are two questions to ask:

  1. Is the original data encoded using UTF-8?
  2. If it is, it might be worthwhile to find out how the BOM got into the original data in the first place.
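
For the first question, one possible check is a strict decode that reports malformed input instead of silently replacing it. This is only a sketch, and Utf8Check and isValidUtf8 are hypothetical names. Note that a successful decode does not prove the data was meant as UTF-8 (plain ASCII passes too); only a failure tells you it is definitely not valid UTF-8:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8Check {
    // Returns true if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            // The decoder hit a byte sequence that is not legal UTF-8.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = "aaaa".getBytes(StandardCharsets.UTF_8);
        // 0xFF never occurs in well-formed UTF-8 (FF FE is a UTF-16 BOM).
        byte[] notUtf8 = {(byte) 0xFF, (byte) 0xFE, 'a', 0};

        System.out.println(isValidUtf8(utf8));    // true
        System.out.println(isValidUtf8(notUtf8)); // false
    }
}
```
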
Alvin