A long (in Java, say) is 8 bytes and can store quite a large number. I want to convert longs to a string without wasting space. Basically: take 100 numbers (800 bytes), convert them to a string that is close to 800 bytes, and later, when I need them, convert the string back into an array of numbers.

The reason is that I want to store quite a few numbers in my JWT. If I store them as plain decimal strings, they will take up much more space than they ideally would. Any ideas how to achieve this?

Erik A
eddyP23

1 Answer

That's basically serializing. Dump the long values into an array of bytes and then encode it into a compatible representation, such as Base64:

import java.util.Base64;

public String encodeLongs(long[] numbers) {
    byte[] bytes = new byte[8 * numbers.length];
    for (int i = 0; i < numbers.length; i++) {
        // Taken from https://stackoverflow.com/questions/18687772/java-converting-long-to-bytes-which-approach-is-more-efficient
        long v = numbers[i];
        int idx = i * 8;
        bytes[idx + 0] = (byte)(v >>> 56);
        bytes[idx + 1] = (byte)(v >>> 48);
        bytes[idx + 2] = (byte)(v >>> 40);
        bytes[idx + 3] = (byte)(v >>> 32);
        bytes[idx + 4] = (byte)(v >>> 24);
        bytes[idx + 5] = (byte)(v >>> 16);
        bytes[idx + 6] = (byte)(v >>>  8);
        bytes[idx + 7] = (byte)(v >>>  0);
    }
    return Base64.getEncoder().encodeToString(bytes);
}
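The question also asks for the way back. A minimal sketch of the reverse direction, assuming the same layout as above (8 big-endian bytes per long, standard Base64). The class name `LongCodec` and the method `decodeLongs` are illustrative names, not part of the original answer. `ByteBuffer` defaults to big-endian order, so it reads and writes the same byte layout as the manual shifts:

```java
import java.nio.ByteBuffer;
import java.util.Base64;

final class LongCodec {
    // Equivalent to the shift-based encoder: 8 big-endian bytes per long.
    static String encodeLongs(long[] numbers) {
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES * numbers.length);
        for (long v : numbers) {
            buf.putLong(v);
        }
        return Base64.getEncoder().encodeToString(buf.array());
    }

    // Decodes the Base64 text and reads the longs back in the same order.
    static long[] decodeLongs(String encoded) {
        byte[] bytes = Base64.getDecoder().decode(encoded);
        if (bytes.length % Long.BYTES != 0) {
            throw new IllegalArgumentException("Length is not a multiple of 8 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long[] numbers = new long[bytes.length / Long.BYTES];
        for (int i = 0; i < numbers.length; i++) {
            numbers[i] = buf.getLong();
        }
        return numbers;
    }
}
```

Calling `LongCodec.decodeLongs(LongCodec.encodeLongs(values))` should give back an array equal to `values`, including negative numbers.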

You can also return a byte array instead of a String if that is more convenient for you. Base64 encoding adds an overhead of about one third of the original size (assuming you use UTF-8 or a similar encoding, where each Base64 character takes one byte). Note that zero overhead is not possible in general with a text-based format. You may investigate denser encodings such as Base-122, but Base64 has the advantage of being ubiquitous and already implemented in most languages.
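To make the overhead concrete for the 100-number case: Base64 emits 4 characters for every 3 input bytes, so 800 bytes become 1068 characters with padding, or 1067 without. The sketch below computes these lengths and also shows the URL-safe, unpadded variant from `java.util.Base64`, which may be the more natural fit for a JWT. The class and method names are hypothetical. Keep in mind that the JWT payload as a whole is Base64url-encoded again when the token is built, so a claim value stored this way grows by roughly another third inside the final token.

```java
import java.util.Base64;

final class Base64Size {
    // Length of standard Base64 output with '=' padding: 4 chars per 3-byte group.
    static int paddedLength(int byteCount) {
        return 4 * ((byteCount + 2) / 3);
    }

    // Length without padding: ceil(4 * byteCount / 3).
    static int unpaddedLength(int byteCount) {
        return (4 * byteCount + 2) / 3;
    }

    // URL-safe alphabet ('-' and '_' instead of '+' and '/'), no padding.
    static String urlSafe(byte[] bytes) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }
}
```

For comparison, a long written as decimal text can take up to 20 characters (19 digits plus a sign), before counting any separators.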

Another option would be to compress the byte array first (for example with GZIP) and encode it in Base64 afterwards. Depending on the size of the input, the nature of your numbers (e.g. whether they tend to be in a certain range or not) and the compression algorithm you may have more or less success, but if the numbers are randomly distributed across the whole range of long numbers you will probably not be able to compress a lot.
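A rough sketch of the compress-then-encode idea, using `GZIPOutputStream` and `GZIPInputStream` from the standard library. The class and method names are made up for illustration, and `readAllBytes` assumes Java 9 or later. GZIP adds a fixed header and trailer of about 18 bytes, so for small or random inputs the result can end up larger than plain Base64; it is worth measuring on your real data before committing to it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipBase64 {
    // Compresses the raw bytes with GZIP, then encodes the result as Base64 text.
    static String compressAndEncode(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The GZIP stream is closed here, so the trailer has been written.
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }

    // Reverses compressAndEncode: Base64-decode, then decompress.
    static byte[] decodeAndDecompress(String encoded) {
        byte[] compressed = Base64.getDecoder().decode(encoded);
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gzip.readAllBytes(); // Java 9+
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The byte array passed in would be the 8-bytes-per-long array built in `encodeLongs`, before the Base64 step.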

jdehesa
  • +1 for the really concise answer AND the credit/reference to the existing SO question from which your solution was drawn. – David W Aug 31 '17 at 13:43
  • @jdehesa, I assume you didn't suggest just serialising it as a whole object (an array of longs) because that is not as compact? Or am I missing something here? – eddyP23 Aug 31 '17 at 14:17
  • @eddyP23 Well, I was assuming you wanted a straightforward and interoperable format. I cannot say how much overhead standard Java serialization would introduce, although it is not optimized for size (and it will have to store additional information, such as the size of the array). You can also look into [other serialization libs](https://stackoverflow.com/questions/239280/which-is-the-best-alternative-for-java-serialization) like [Kryo](https://github.com/EsotericSoftware/kryo). In any case, if you use binary serialization you will still need a text-compatible encoding afterwards. – jdehesa Aug 31 '17 at 14:24