I have a byte array that contains text padded with zero bytes (value 0) to fill 16 bytes.

When I try to convert it to a String, I cannot get the right length/String: it always returns all 16 characters.

I have tried:

import java.nio.charset.StandardCharsets;

public class HelloWorld {

    public static void main(String[] args) {

        // SIMULATED BYTE[] CONTAINS "ABC" PLUS CHAR(0) UNTIL FILL 16 BYTES
        byte[] name = new byte[16];
        for (int i = 0; i < 16; i++) {
            name[i] = 0;
        }
        name[0] = 'A';
        name[1] = 'B';
        name[2] = 'C';

        // DESIRED OUTPUT IS A STRING = "ABC".
        // I.E. REMOVAL OF PADDING WITH CHAR(0)s
        String nameStr = new String(name, StandardCharsets.US_ASCII);

        System.out.println("#" + nameStr + "#");
        System.out.println(nameStr.length());
    }
}

This is the output:

[Screenshot of the NetBeans output window]

The desired length is 3, not 16. The NetBeans output also shows that the String contains the padding 0 values.

I am using OpenJDK 8 on FreeBSD with NetBeans 11.

M.E.
  • Why not use `replaceAll`? Also, I think the for-loop is unnecessary, since the byte array will initially contain zeroes anyways – user May 19 '20 at 15:12
  • The byte[] is read from a file; this is a simplified example so that I can post something reproducible. What I intend to do is read a variable-length string (up to 16 chars), but I always read 16 chars, including the 0 values stored in the unused byte[] positions. – M.E. May 19 '20 at 15:14
  • This looks like an [XY problem](https://meta.stackexchange.com/q/66377). Why do you build the string via a byte array in the first place? – Pshemo May 19 '20 at 15:14
  • Related: https://stackoverflow.com/questions/17003164/byte-array-with-padding-of-null-bytes-at-the-end-how-to-efficiently-copy-to-sma – Iłya Bursov May 19 '20 at 15:15
  • If you are reading text from a file, then don't read it as bytes. Use a proper Reader or Scanner to handle converting bytes to chars for you. – Pshemo May 19 '20 at 15:15
  • I am reading data from a mapped binary file. The input data is a byte[] of length 16 bytes, padded with zeros. This is a requirement. – M.E. May 19 '20 at 15:17

3 Answers

The straightforward approach is to find where the trailing zeros begin and pass that range to the String constructor:

int end = name.length;
while(end > 0 && name[end - 1] == 0) end--;
String nameStr = new String(name, 0, end, StandardCharsets.US_ASCII);
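For reference, here is a minimal, self-contained sketch of the same idea wrapped in a helper. The class name `ZeroPaddedField` and the method name `decode` are hypothetical, chosen only for illustration, and the sketch assumes the field is ASCII text followed only by zero bytes, as in the question.

```java
import java.nio.charset.StandardCharsets;

final class ZeroPaddedField {

    // Hypothetical helper: decodes a fixed-size field, ignoring trailing zero bytes only.
    // Leading zeros and zeros between characters are kept as they are.
    static String decode(byte[] field) {
        int end = field.length;
        while (end > 0 && field[end - 1] == 0) {
            end--;
        }
        return new String(field, 0, end, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] name = new byte[16]; // a new array is already zero-filled
        name[0] = 'A';
        name[1] = 'B';
        name[2] = 'C';

        String nameStr = decode(name);
        System.out.println("#" + nameStr + "#"); // prints #ABC#
        System.out.println(nameStr.length());    // prints 3
    }
}
```

Because the padding is excluded before the String is created, no intermediate 16-character String is built and nothing other than trailing zero bytes is dropped.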
Holger
  • As an algorithmic approach, it is correct. – silentsudo May 19 '20 at 15:40
  • Try these: `name[1] = 'A';` `name[2] = 'B';` `name[9] = 'C';` – 0xh3xa May 19 '20 at 15:46
  • @sc0der There is no requirement to remove zero bytes at the beginning; in fact, that would contradict the OP’s intention. They are obviously parsing a file format which has zero-byte padding at the end of a fixed-size field. That’s not an unusual requirement. – Holger May 19 '20 at 15:48
  • I have marked the answer that suggests using trim as the solution. However, this one is also a perfectly valid solution, and I guess that if you have to repeat this operation many times it makes much more sense to use this one. I just feel that for my specific case trim is easier to read and use. – M.E. May 19 '20 at 16:56

replaceAll

Create the string, then remove the unwanted zero characters:

String nameStr = new String(name, StandardCharsets.US_ASCII).replaceAll("\0", "");

Output:

#ABC#
0xh3xa
  • Note: this will not work if somehow there are symbols after the first `0` in the array, which is probably not a big deal for this particular question. – Iłya Bursov May 19 '20 at 15:38
  • @IłyaBursov it’s also a heavy operation, considering that it will use the regex pattern matching engine behind the scenes. – Holger May 19 '20 at 15:46
  • I think regex might be overkill for my specific case; it is easier to use trim as suggested below. It is good to know anyway. – M.E. May 19 '20 at 16:57
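To illustrate the comment about symbols after the first `0`, here is a small sketch that compares `replaceAll("\0", "")` with cutting the field at the first zero byte. The class `FirstZeroCut` and the method `decodeUntilFirstZero` are hypothetical names, and treating the first zero as a terminator (C-string style) is an assumption; the question only says that the padding is at the end of the field.

```java
import java.nio.charset.StandardCharsets;

final class FirstZeroCut {

    // Hypothetical helper: treats the first zero byte as a terminator (C-string style).
    static String decodeUntilFirstZero(byte[] field) {
        int end = 0;
        while (end < field.length && field[end] != 0) {
            end++;
        }
        return new String(field, 0, end, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] field = new byte[16];
        field[0] = 'A';
        field[1] = 'B';
        field[9] = 'C'; // stray data after the first zero byte

        // replaceAll removes every zero character, so the stray 'C' is glued onto "AB".
        String removedAll = new String(field, StandardCharsets.US_ASCII).replaceAll("\0", "");
        System.out.println(removedAll);                  // prints ABC

        // Cutting at the first zero byte ignores everything after it.
        System.out.println(decodeUntilFirstZero(field)); // prints AB
    }
}
```

Which result is correct depends on the file format; if the field can only ever contain text followed by zeros, both variants give the same String.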

You can trim the string:

String nameStr = new String(name, StandardCharsets.US_ASCII).trim();

System.out.println("#" + nameStr + "#");
System.out.println(nameStr.length());

Output:

#ABC#
3
Butiri Dan
  • This removes more than just the zero padding bytes, e.g. it also removes space characters at the beginning. – Holger May 19 '20 at 15:44
  • This option is probably the simplest. As the operation takes place just once at the beginning of the reading, I do not care if it is computationally more expensive than the one that suggests identifying the 0s and then specifying the length in the String constructor. In my specific case there are no spaces at all. – M.E. May 19 '20 at 16:55
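The difference pointed out in the comments can be seen in a short sketch. `String.trim()` removes every leading and trailing character whose code is U+0020 (space) or lower, which includes the zero character but also spaces and other control characters. The class `TrimVersusZeroStrip` and the helper `stripTrailingZeros` are hypothetical names, and the field content `" AB "` is made-up sample data, not something from the question.

```java
import java.nio.charset.StandardCharsets;

final class TrimVersusZeroStrip {

    // Hypothetical helper: drops trailing zero bytes only, keeping spaces intact.
    static String stripTrailingZeros(byte[] field) {
        int end = field.length;
        while (end > 0 && field[end - 1] == 0) {
            end--;
        }
        return new String(field, 0, end, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Sample field " AB " (with a leading and a trailing space), zero-padded to 16 bytes.
        byte[] field = new byte[16];
        field[0] = ' ';
        field[1] = 'A';
        field[2] = 'B';
        field[3] = ' ';

        String trimmed = new String(field, StandardCharsets.US_ASCII).trim();
        String stripped = stripTrailingZeros(field);

        System.out.println("#" + trimmed + "#");  // prints #AB#
        System.out.println("#" + stripped + "#"); // prints # AB #
    }
}
```

If the field never contains leading or trailing spaces or control characters, as the asker states for their data, both approaches produce the same result.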