
The goal is to read a file name from a file. The name field is at most 100 bytes long, and the actual name is padded to that length with null bytes.

Here is what it looks like in GNU nano:

[Screenshot: the file opened in GNU nano, showing .PKGINFO followed by a run of ^@ characters]

Here .PKGINFO is the valid file name, and each ^@ represents a null byte.

I tried this with StringBuilder:

package falken;

import java.io.*;

public class Testing {

    public Testing() {
        try {
            FileInputStream tarIn = new FileInputStream("/home/gala/falken_test/test.tar");

            final int byteOffset = 0;
            final int readBytesLength = 100;

            StringBuilder stringBuilder = new StringBuilder();

            for ( int bytesRead = 1, n, total = 0 ; (n = tarIn.read()) != -1 && total < readBytesLength ; bytesRead++ ) {
                if (bytesRead > byteOffset) {
                    stringBuilder.append((char) n);
                    total++;
                }
            }

            String out = stringBuilder.toString();

            System.out.println(">" + out + "<");
            System.out.println(out.length());
        } catch (Exception e) {
            /*
            This is a pokemon catch not used in final code
            */
            e.printStackTrace();
        }
    }
}

But it reports a String length of 100, which is wrong, while the output in IntelliJ shows the correct string between the >< signs.

>.PKGINFO<
100

Process finished with exit code 0

But when I paste it here on Stack Overflow, I get the correct string followed by unknown "null characters", and its size is actually 100.

>.PKGINFO                                                                                            <

What regex can I use to get rid of the characters after the valid file name?

The file I am reading is ASCII encoded.

I also tried ByteArrayOutputStream, with the same result:

package falken;

import java.io.*;
import java.nio.charset.StandardCharsets;

public class Testing {

    public Testing() {
        try {
            FileInputStream tarIn = new FileInputStream("/home/gala/falken_test/test.tar");

            final int byteOffset = 0;
            final int readBytesLength = 100;

            ByteArrayOutputStream byteArrayOutputStream =  new ByteArrayOutputStream();

            for ( int bytesRead = 1, n, total = 0 ; (n = tarIn.read()) != -1 && total < readBytesLength ; bytesRead++ ) {
                if (bytesRead > byteOffset) {
                    byteArrayOutputStream.write(n);
                    total++;
                }
            }

            String out = byteArrayOutputStream.toString();

            System.out.println(">" + out + "<");
            System.out.println(out.length());
        } catch (Exception e) {
            /*
            This is a pokemon catch not used in final code
            */
            e.printStackTrace();
        }
    }
}

What could be the issue here?

Gala
  • Possible duplicate of [How do I convert a byte array with null terminating character to a String in Java?](http://stackoverflow.com/questions/8843219/how-do-i-convert-a-byte-array-with-null-terminating-character-to-a-string-in-jav) – ivan_pozdeev Apr 09 '16 at 21:53

2 Answers


You need to stop appending to the string buffer once you read the first null character from the file.
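As a minimal sketch of that idea, the question's byte-by-byte loop could stop as soon as it reads a null byte. The class name `NullTerminatedName` and the method `readName` are made up for illustration, and the in-memory stream stands in for the real file; the sketch assumes single-byte ASCII content, as the question states.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class NullTerminatedName {

    // Hypothetical helper: reads at most maxLength bytes and stops at the
    // first null byte or at end of stream. Assumes one byte per character.
    static String readName(InputStream in, int maxLength) throws IOException {
        StringBuilder sb = new StringBuilder();
        int n;
        // read() returns -1 at end of stream and 0 for a null byte;
        // both end the loop
        while (sb.length() < maxLength && (n = in.read()) > 0) {
            sb.append((char) n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Simulated 100-byte name field: the name followed by null padding
        byte[] field = new byte[100];
        byte[] name = ".PKGINFO".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(name, 0, field, 0, name.length);

        String out = readName(new ByteArrayInputStream(field), 100);
        System.out.println(">" + out + "<");
        System.out.println(out.length());
    }
}
```

Note that this leaves the stream positioned just after the first null byte, not at the end of the 100-byte field, so any code that goes on to read the rest of the header would have to skip the remaining padding.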

You seem to want to read a tar archive. Have a look at the following code, which should get you started.

import java.io.FileInputStream;
import java.nio.charset.StandardCharsets;

// (place the following inside a method that handles IOException)
byte[] buffer = new byte[512]; // a POSIX (ustar) tar header block is 512 bytes
// try-with-resources closes the stream even if reading fails
try (FileInputStream is = new FileInputStream("test.tar")) {
    int read = is.read(buffer);
    // check number of bytes read; don't bother if not at least the whole
    // header has been read
    if (read == buffer.length) {
        // search for first null byte; this is the end of the name
        // (the name field occupies the first 100 bytes of the header)
        int offset = 0;
        while (offset < 100 && buffer[offset] != 0) {
            offset++;
        }
        // create string from byte buffer using ASCII as the encoding (the
        // encoding the question states for this file)
        String name = new String(buffer, 0, offset,
                StandardCharsets.US_ASCII);
        System.out.println("'" + name + "'");
    }
}

You really shouldn't use trim() on the filename; it will break whenever you encounter a filename with leading or trailing blanks.
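To make the difference concrete, here is a small sketch comparing trim() with cutting at the first null character. The class `TrimVersusNullCut`, the helper `cutAtNull`, and the sample name are invented for illustration.

```java
class TrimVersusNullCut {

    // Hypothetical helper: cuts the string at the first null character,
    // or returns it unchanged if it contains none.
    static String cutAtNull(String raw) {
        int end = raw.indexOf('\0');
        return end < 0 ? raw : raw.substring(0, end);
    }

    public static void main(String[] args) {
        // Made-up name with one leading and one trailing blank, null padded
        String raw = " notes \0\0\0\0";

        // trim() strips every leading and trailing char <= U+0020,
        // so the blanks are lost along with the null padding
        System.out.println(">" + raw.trim() + "<");

        // cutting at the first null character keeps the blanks
        System.out.println(">" + cutAtNull(raw) + "<");
    }
}
```

The first line printed is `>notes<`, while the second keeps the blanks on both sides of the name.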

Markus

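Well, it seems to be reading the null characters as actual characters; they only look like spaces when printed. If possible, read the filename first and then cut off the null characters. In your case, data = data.trim() does that, because trim() strips every leading and trailing character with a code point of U+0020 or below, which includes the null character. A further data2 = data.substring(0, (data.length() - 1)) would only be needed if one unwanted character still remained at the end.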

Rocket6488
  • That is it! .trim on the string was the magic trick! It now returns correctly 8 characters. – Gala Apr 09 '16 at 21:38
  • Looks like a rather suboptimal solution though. First read garbage (lots of garbage) one char at a time, then trim it off. Not to mention lots of copying around. – ivan_pozdeev Apr 09 '16 at 21:53
  • And it of course fails if the filename ends with blank characters. – Markus Apr 09 '16 at 21:56
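The concerns in these comments (reading one character at a time, then trimming) could be addressed by reading the whole 100-byte field in one call and converting only the bytes before the first null byte. The following is a rough sketch under those assumptions; the class `NameFieldReader` and the method `readNameField` are hypothetical names, and an in-memory stream stands in for the tar file.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class NameFieldReader {

    static final int NAME_FIELD_LENGTH = 100; // field size given in the question

    // Hypothetical helper: reads the whole name field at once and decodes
    // only the bytes that come before the first null byte.
    static String readNameField(InputStream in) throws IOException {
        byte[] field = new byte[NAME_FIELD_LENGTH];
        // readFully throws EOFException if fewer than 100 bytes are available
        new DataInputStream(in).readFully(field);

        int end = 0;
        while (end < field.length && field[end] != 0) {
            end++;
        }
        return new String(field, 0, end, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        // Simulated name field: a name with a trailing blank, null padded
        byte[] field = new byte[NAME_FIELD_LENGTH];
        byte[] name = "my file ".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(name, 0, field, 0, name.length);

        String out = readNameField(new ByteArrayInputStream(field));
        System.out.println(">" + out + "<");
        System.out.println(out.length());
    }
}
```

Because the cut happens at the first null byte and not at whitespace, the trailing blank in the sample name survives, and the stream ends up positioned exactly after the name field.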