3

I've seen lots of somewhat related questions and answers on this forum, but haven't found anything that addresses my problem.

The idea is simple enough:

(This is using the Java programming language - I am currently restricted to using Java 7)

An array of bytes (representing anything: some wire format, some form of encoded data, binary data with embedded "text", etc.) is received. I'd like to be able to print the array in the following forms:

  1. as a hexadecimal string
  2. as "printable" text

The first case is partly for debugging reasons, but also may have use in a non-debugging mode. The second case is purely for debugging reasons, and would allow for human comparison with other sources of information.

I think I have a solution for the first case, but am struggling with the second case.

Obviously an array of bytes may contain non-printable characters or characters that are rendered in different ways. How can I "print" an array so that all bytes are represented? A good analogue I can provide is the UNIX od command, where od -ah displays the binary data in a hexadecimal form as well as ASCII characters. In this case dots (.) are typically used in place of non-printable or control characters.

I don't need it to specifically look like what od puts out, but would like to be able to show the data so that at least the printable characters can be seen and the rest represented with some sort of placeholder. Also, I don't want to remove the non-printable characters, as that would give a misleading representation of the data.

If anyone has information on how to achieve this, I would really appreciate it.

Dave Newton
Joseph Gagnon
  • 1) Just print the bytes as hexadecimal. 2) Create a String with new String(byte[]); define the non-printable characters and replace them beforehand. – Antoniossss May 07 '18 at 18:08
  • @Antoniossss But `new String(byte[])` uses the user's default character encoding, so to do the replacement step, you'd have to figure out which it is and know which of that character set's characters are non-printable. Seems a bit complex for a speculative debug dump. – Tom Blodget May 07 '18 at 18:21
  • I would use UTF-8, since you don't care about the encoding anyway. – Antoniossss May 07 '18 at 18:35

3 Answers

3

Try this for #2:

String original = new String(bytes);
String printable = original.replaceAll("\\P{Print}", "."); // or any other placeholder character you want instead of "."

It uses the POSIX printable character class.
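One caveat: new String(bytes) decodes with the platform default charset, so the result can differ between machines. If plain ASCII is enough, a byte-wise loop sidesteps the charset question and keeps exactly one output character per input byte. This is only a minimal sketch; the class name PrintableDump and the helper toPrintableAscii are made up for illustration, and it compiles on Java 7.

```java
class PrintableDump {

    // Hypothetical helper: keeps printable ASCII (0x20..0x7E) and
    // substitutes '.' for every other byte, one char per input byte.
    static String toPrintableAscii(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            int v = b & 0xFF; // treat the byte as unsigned
            sb.append(v >= 0x20 && v <= 0x7E ? (char) v : '.');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = { 'h', 'i', 0, 10, (byte) 0xFF, '!' };
        System.out.println(toPrintableAscii(data)); // prints hi...!
    }
}
```

Because nothing is removed, the output length always equals the array length, which makes it easy to line up against a hex dump of the same data.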

If you need Unicode support, create the String with the utf-8 character set argument, and construct a Pattern with the UNICODE_CHARACTER_CLASS flag and use the same regex as above.
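A rough sketch of that Unicode variant, assuming the bytes are meant to be UTF-8 (the class and method names here are hypothetical). Note that the output is no longer one character per byte: a multi-byte sequence decodes to a single character, and malformed input is decoded to the replacement character U+FFFD rather than to the placeholder.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

class UnicodePrintable {

    // Same regex as above, but compiled with UNICODE_CHARACTER_CLASS so that
    // \p{Print} follows the Unicode definition instead of being ASCII-only.
    private static final Pattern NON_PRINTABLE =
            Pattern.compile("\\P{Print}", Pattern.UNICODE_CHARACTER_CLASS);

    // Hypothetical helper: decode as UTF-8, then replace non-printable characters.
    static String toPrintable(byte[] bytes) {
        String decoded = new String(bytes, StandardCharsets.UTF_8);
        return NON_PRINTABLE.matcher(decoded).replaceAll(".");
    }

    public static void main(String[] args) {
        // 'h', the two-byte UTF-8 encoding of U+00E9, the control character U+0001, 'y'
        byte[] data = { 'h', (byte) 0xC3, (byte) 0xA9, 1, 'y' };
        System.out.println(toPrintable(data));
    }
}
```

Without the flag, the accented letter in this sample would also be replaced by the placeholder, because the default \p{Print} class only covers ASCII.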

Vasan
  • I would think that ASCII representation should be good enough for what I'm doing. That being said, I'm open to using whatever encoding makes the most sense. I'll admit I know very little about character encoding. I'll also need to research your RE example, as I don't know what that does. – Joseph Gagnon May 07 '18 at 18:28
  • @JosephGagnon If ASCII will do, you can use my code snippet as-is. – Vasan May 07 '18 at 18:29
  • Yes, that did the trick. Thanks. I did try with UTF-8 as well, but I must have done something wrong - it displayed pretty much the same output, but behaved strangely. – Joseph Gagnon May 07 '18 at 18:50
0

Hmm, does Arrays.toString() work for you?

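import java.util.Arrays;

public class PrintBytes {
   public static void main( String[] args ) {
      // Sample data, including bytes outside the printable ASCII range
      byte[] test = { 1, 2, 3, 0, (byte)0xFF, (byte)0xFE };
      // Arrays.toString renders each byte as a signed decimal value
      String s = Arrays.toString( test );
      System.out.println( s );
   }
}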

Output:

run:
[1, 2, 3, 0, -1, -2]
BUILD SUCCESSFUL (total time: 0 seconds)

It's not hex but it is a usable wire format.
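If hex is what you want, a small lookup-table loop would do it without any extra libraries. This is just a sketch that works on Java 7; the class name HexString and the method toHex are made up for the example, and it uses the same test array as above.

```java
class HexString {

    private static final char[] HEX_DIGITS = "0123456789abcdef".toCharArray();

    // Hypothetical helper: two lowercase hex digits per byte, no separators.
    static String toHex(byte[] bytes) {
        char[] out = new char[bytes.length * 2];
        for (int i = 0; i < bytes.length; i++) {
            int v = bytes[i] & 0xFF;             // treat the byte as unsigned
            out[2 * i] = HEX_DIGITS[v >>> 4];    // high nibble
            out[2 * i + 1] = HEX_DIGITS[v & 0x0F]; // low nibble
        }
        return new String(out);
    }

    public static void main(String[] args) {
        byte[] test = { 1, 2, 3, 0, (byte) 0xFF, (byte) 0xFE };
        System.out.println(toHex(test)); // prints 01020300fffe
    }
}
```

Masking with 0xFF matters here: without it, negative bytes such as (byte)0xFF would be sign-extended and index outside the table.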

markspace
0

Why not use DatatypeConverter? Here is an example with two arrays: the first comes from the word 'hello', and the second is made up.

import java.nio.charset.StandardCharsets;
import javax.xml.bind.DatatypeConverter;

public class MainTest {

    public static void main(String[] args) {
        // Something printable
        String hello = "hello";
        byte[] printable = hello.getBytes(StandardCharsets.UTF_8); 
        System.out.println("Printable: " + new String(printable, StandardCharsets.UTF_8));

        // Something not printable
        byte[] notPrintable = { (byte)-156, (byte)190, (byte)56, (byte)-29, (byte)1};

        System.out.println("NotPrintable: " + new String(notPrintable, StandardCharsets.UTF_8));

        // DatatypeConverter.printBase64Binary gives you a printable String from any byte[]
        final String printableString = DatatypeConverter.printBase64Binary(printable);
        final String notPrintableString = DatatypeConverter.printBase64Binary(notPrintable);

        System.out.println("Printable string with DatatypeConverter: " + printableString);
        System.out.println("Not printable string with DatatypeConverter: " + notPrintableString);

        byte[] otherPrintable = DatatypeConverter.parseBase64Binary(printableString);
        byte[] otherNotPrintable = DatatypeConverter.parseBase64Binary(notPrintableString);

        // And the best part is that the conversion can be reversed
        final String otherHello = new String(otherPrintable, StandardCharsets.UTF_8);
        final String strangeString = new String(otherNotPrintable, StandardCharsets.UTF_8);

        System.out.println("Other Hello: " + otherHello);
        System.out.println("Strange String: " + strangeString);
    }

}

Here is the output

Printable: hello
NotPrintable: d�8�
Printable string with DatatypeConverter: aGVsbG8=
Not printable string with DatatypeConverter: ZL444wE=
Other Hello: hello
Strange String: d�8�

EDIT: DatatypeConverter is available in Java 7 and Java 8. For newer versions, take a look at the question "Java 11 package javax.xml.bind does not exist".

Juan