Given a command-line parameter that is a hex string, I use org.apache.commons.codec.binary.Hex.decodeHex to get a byte[].

But of course Java bytes are signed. I'd like to treat this as a bunch of unsigned bytes, so where I have a byte value of, say, -128 I want it to be 0x80 = 128, and for -103 I want it to be 0x99 = 153.

It's not going to fit in a byte[] anymore, so let's make it a char[] (short[] would also work).

What is the way to do this in Java? I can write the obvious loop (or better: a stream pipeline), but is there something better and built-in, e.g. some library method that I'm unaware of?

  • This isn't something java.nio.ByteBuffer does
  • java.nio.charset.CharsetDecoder has an API with the right signature but I don't know if there is (or how to get) an "identity" decoder.

(No work to show: internet searches turned up nothing like this.)

davidbak
  • Does this help? https://stackoverflow.com/a/46949852/1553851 – shmosel Nov 13 '22 at 07:02
  • This is generally unnecessary; most of the time you can interpret the bytes as unsigned at the point of usage (i.e. use `bytes[i] & 0xFF`). Pre-converting them like this is more complex and costs more space and time. Of course I don't know *why* you're doing this so perhaps you have a good reason, but very often when this is done it's a mistake. – harold Nov 13 '22 at 07:24
  • You must mean `decodeHex`. The `encodeHexString` converts bytes to a `String` ... – Stephen C Nov 13 '22 at 08:23
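
On the "identity" decoder idea raised in the question: ISO-8859-1 maps each byte value 0x00..0xFF to the Unicode code point with the same number, so decoding with that charset yields a char[] holding the unsigned values. The following is only a minimal sketch of that idea; the class and method names (`Latin1Identity`, `toUnsignedChars`) are made up for illustration, and it is not claimed to be what any linked answer proposes.

```java
import java.nio.charset.StandardCharsets;

public class Latin1Identity {
    // Hypothetical helper: decodes each byte to the char with the same
    // unsigned value (0..255) by treating the input as ISO-8859-1 text.
    static char[] toUnsignedChars(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1).toCharArray();
    }

    public static void main(String[] args) {
        byte[] bytes = {0, 1, -128, -103};
        for (char c : toUnsignedChars(bytes)) {
            System.out.print((int) c + " "); // prints: 0 1 128 153
        }
        System.out.println();
    }
}
```

This goes through an intermediate String, so it is probably not cheaper than a plain loop; it mainly shows that a one-to-one decoder does exist in the standard library.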

5 Answers

2

I can write the obvious loop (or better: stream pipeline) but is there something better, built-in, e.g. some library method that I'm unaware of?

AFAIK, no.

There is no builtin "library method or something better" to do this. Just do it the clunky way. (FWIW, using a loop will most likely be more efficient than using streams.)

Better still, figure out a way to avoid doing it at all; e.g. keep the byte[] and use Byte.toUnsignedInt whenever you want to use the individual bytes in the array as unsigned.
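
As a rough illustration of the "convert at the point of use" suggestion above, here is a small sketch that keeps the byte[] and widens each element only when it is read. The class and method names (`UnsignedAtUse`, `unsignedSum`) are invented for this example, and summing is just a stand-in for whatever the real code does with the values.

```java
public class UnsignedAtUse {
    // Hypothetical consumer: treats each byte as 0..255 without copying the array.
    static int unsignedSum(byte[] bytes) {
        int sum = 0;
        for (byte b : bytes) {
            sum += Byte.toUnsignedInt(b); // equivalent to (b & 0xFF)
        }
        return sum;
    }

    public static void main(String[] args) {
        byte[] bytes = {0, 1, -128, -103};
        System.out.println(unsignedSum(bytes)); // 0 + 1 + 128 + 153 = 282
    }
}
```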

Stephen C
  • TY. W.r.t. "a loop will most likely be more efficient than using streams": obviously compiler/JIT writers have had a much longer time to work on optimizing loops than streams, but out of curiosity: is there any hope of streams getting comparable treatment? The func prog community knows how to do it ... – davidbak Nov 13 '22 at 17:26
  • Ummm ... I can't predict the future. To achieve the performance of an optimized Java loop, they would essentially need to transform the stream code into an optimized loop. That seems a stretch. And I'm yet to see a functional language implementation that actually performs on a par with well written (classic) Java for a simple task like this. The real goal of Java streams is to express complicated transformations succinctly with acceptable (rather than optimal) performance. – Stephen C Nov 14 '22 at 00:02
  • The JIT compiler *can* optimize a stream pipeline to the same degree as a loop (after inlining all involved methods). The problem is that it only does this if the code is executed often enough and, more often than not, you don't execute the same code that often. And a not-fully-optimized loop will beat a not-fully-optimized stream pipeline. It also doesn't help that in the case of the HotSpot JVM, the `-XX:MaxInlineLevel` option has a ridiculously low default (`9` prior to JDK 14) inherited from ancient versions. You can easily use `-XX:MaxInlineLevel=20`, which will help pipelines a lot. – Holger Nov 21 '22 at 09:01
0

Try this.

byte[] byteArray = {0, 1, -128, -103};
int[] intArray = IntStream.range(0, byteArray.length)
    .map(i -> byteArray[i] & 0xff)
    .toArray();
System.out.println(Arrays.toString(intArray));

output:

[0, 1, 128, 153]
0

You're starting from a hex String so you could take this approach, quite an 'old school' one since Java is not particularly good at streaming byte[]:

import java.io.IOException;
import java.util.Arrays;
import java.io.StringReader;

public class HexInts {
    public static void main(String[] args) throws IOException {
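        String s = args[0];
        // An odd number of digits would leave a stale char in buf and overrun ints
        if (s.length() % 2 != 0) {
            throw new IllegalArgumentException("Hex string must have an even number of digits");
        }
        int[] ints = new int[s.length() / 2];
        char[] buf = new char[2];
        int ix = 0;
        StringReader in = new StringReader(s);
        // Read two hex digits at a time; each pair parses to a value in 0..255
        while (in.read(buf) > -1) {
            ints[ix++] = Integer.parseInt(String.valueOf(buf), 16);
        }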
        System.out.println(Arrays.toString(ints));
    }
}
g00se
0

In general, to convert a single value to unsigned it is enough to mask it with & 0xFF. You can do it inline, like b & 0xFF, or use the method Byte.toUnsignedInt(b).

public static int[] convert(byte[] byteArray) {
    int[] intArray = new int[byteArray.length];

    for (int i = 0; i < byteArray.length; i++) {
        intArray[i] = byteArray[i] & 0xFF;
        // or this
        // intArray[i] = Byte.toUnsignedInt(byteArray[i]);
    }

    return intArray;
}

I don't think it's a good idea to use a Stream here, because a plain loop uses less memory. I think plain old Java is much better in this case.

Oleg Cherednik
0

I ended up with this loop - just FYI:

    private @NotNull short[] toUnsignedBuffer(@NotNull byte[] signedBuffer) {
        short[] r = new short[signedBuffer.length];
        int i = 0;
        // Byte.toUnsignedInt is qualified here so no static import is needed
        for (byte b : signedBuffer) r[i++] = (short) Byte.toUnsignedInt(b);
        return r;

        // Q: Why doesn't `Arrays` have a `static void SetAll(byte[] array, IntToShortFunction
        // generator)`?
        // A: Because `short` is the bastard stepchild of Java's framework libraries.  P.S., there's
        // no `IntToShortFunction` interface either ... or `ShortStream` class,  or
        // `Streams::toArray` overload that'll give you a `short[]`, etc. etc. etc.
    }
davidbak
  • I'd be interested to see the missing code (the origin being a hex string) – g00se Nov 14 '22 at 04:55
  • @g00se - It's `org.apache.commons.codec.binary.Hex.decodeHex` if I understand your question. (Where I get the byte array from a hex string?) – davidbak Nov 14 '22 at 05:42
  • You can remove that dependency by using Java 17 onwards which has `HexFormat`. That might reduce more than one dependency from commons. – g00se Nov 14 '22 at 08:29
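
To make the `HexFormat` suggestion in the last comment concrete, here is a hedged sketch of going from the hex string to unsigned values with only the JDK (Java 17 or later). The class and method names (`HexFormatDemo`, `hexToUnsigned`) are illustrative, and the short[] result simply mirrors the loop in the answer above.

```java
import java.util.Arrays;
import java.util.HexFormat;

public class HexFormatDemo {
    // Hypothetical helper: parses a hex string and widens each byte to 0..255.
    static short[] hexToUnsigned(String hex) {
        // parseHex throws IllegalArgumentException for odd length or non-hex characters
        byte[] bytes = HexFormat.of().parseHex(hex);
        short[] out = new short[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            out[i] = (short) Byte.toUnsignedInt(bytes[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(hexToUnsigned("00018099"))); // [0, 1, 128, 153]
    }
}
```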