
I have a string which represents a long, like "12345678901" (11 chars long).

I convert it into a long using Long.parseLong(), that's fine.

Now, I want to send this long as a short string, like "eR%s" over the wire.

The goal is to make this final string as short as possible. Any idea what's the best way to do that? I can use more characters than URL encoding allows (for example /, %, :, etc.).

Prashant Kumar
jmspaggi
  • How does `12345678901` convert to `eR%s`? – Sotirios Delimanolis Dec 04 '13 at 20:37
  • Also, people who marked this as a duplicate of "How do I convert Long to byte[] and back in java" might not have really read this question or that one ;) I'm not asking how to convert a long to a byte array. That is very easy. I'm asking how to encode it... Read better next time. – jmspaggi Dec 05 '13 at 01:28
  • Probably [Base64](http://en.wikipedia.org/wiki/Base64) is the best you can do with readily available tools. You could in theory go to Base95 or so in the extreme case, but it wouldn't transmit over the network very well. – Hot Licks Dec 05 '13 at 01:40
  • The claimed "duplicate" isn't. – Hot Licks Dec 05 '13 at 01:41

4 Answers


Java can handle a radix as high as 36 using the digits 0 - 9 and lower case letters a - z.

> Long.toString(12345678901L, 36)
"5o6aqt1"

> Long.parseLong("5o6aqt1", 36)
12345678901

You could create your own encoding using 65 of the 66 unreserved URI characters (so your URI would not need escaping). The remaining one, '-', is kept for the sign of negative numbers:

> Long65.toString(12345678901L)
"aFDIbA"

> Long65.parseLong65("aFDIbA")
12345678901

Here is the code for Long65:

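import java.math.BigInteger;

public class Long65 {
    private static final int base = 65;
    // 65 of the 66 unreserved URI characters; '-' is left out so it can mark negative numbers.
    private static final String URIchars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_.~";

    // Encodes aNumber in base 65, most significant digit first.
    // Long.MIN_VALUE is not supported: negating it overflows and stays negative.
    public static String toString(Long aNumber) {
        StringBuilder result = new StringBuilder();
        if (aNumber < 0) {
            result.append('-');
            aNumber = -aNumber;
        }
        int r = (int)(aNumber % base); // least significant digit
        if (aNumber - r == 0) 
            result.append(URIchars.charAt(r));
        else 
            // Encode the higher-order digits recursively, then append this one.
            result.append(Long65.toString((aNumber - r) / base) + URIchars.charAt(r));
        return result.toString();
    }

    // Decodes a string produced by toString(); an optional leading '-' marks a negative value.
    public static long parseLong65(String aNumber) {
        char[] digits;
        int sign = 1;
        if (aNumber.charAt(0) == '-') {
            sign = -1;
            digits = aNumber.substring(1).toCharArray();
        } else {
            digits = aNumber.toCharArray();
        }
        // BigInteger is used because base^length can exceed the range of a long.
        BigInteger bigBase = BigInteger.valueOf(base);
        BigInteger power = bigBase.pow(digits.length);
        BigInteger total = BigInteger.valueOf(0);
        for (char digit : digits){
            power = power.divide(bigBase); // place value of the current digit
            total = total.add(power.multiply(BigInteger.valueOf(URIchars.indexOf(digit))));
        }
        return sign * total.longValue();
    }
}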
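Since the asker mentions wanting to avoid garbage, here is a minimal sketch of an iterative variant of the same base-65 idea. It is not part of the answer's code: the class name `Long65Iterative` and the methods `appendTo` and `parse` are hypothetical. It appends into a caller-supplied `StringBuilder` instead of recursing and concatenating strings, and it works on non-positive values internally so that `Long.MIN_VALUE` needs no special case. Over-long input is not checked for overflow.

```java
final class Long65Iterative {
    // Same alphabet as Long65: the unreserved URI characters except '-'.
    private static final String DIGITS =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_.~";
    private static final int BASE = DIGITS.length();

    /** Appends the base-65 form of value to out and returns out. */
    static StringBuilder appendTo(StringBuilder out, long value) {
        if (value < 0) out.append('-');
        int start = out.length();
        // Work with a non-positive number: -Long.MIN_VALUE would overflow.
        long n = value > 0 ? -value : value;
        do {
            out.append(DIGITS.charAt((int) (-(n % BASE))));
            n /= BASE;
        } while (n != 0);
        // Digits were produced least significant first, so reverse them in place.
        for (int i = start, j = out.length() - 1; i < j; i++, j--) {
            char c = out.charAt(i);
            out.setCharAt(i, out.charAt(j));
            out.setCharAt(j, c);
        }
        return out;
    }

    /** Parses a string written by appendTo (Horner's scheme, no BigInteger). */
    static long parse(CharSequence s) {
        boolean negative = s.charAt(0) == '-';
        long n = 0;
        for (int i = negative ? 1 : 0; i < s.length(); i++) {
            int d = DIGITS.indexOf(s.charAt(i));
            if (d < 0) throw new NumberFormatException("bad digit: " + s.charAt(i));
            n = n * BASE - d; // accumulate as a negative number
        }
        return negative ? n : -n;
    }
}
```

Reusing one `StringBuilder` across calls (with `setLength(0)` between lines) should keep per-number allocation close to zero, though whether that matters in practice depends on the workload.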
dansalmo
  • "Long.toString(12345678901L, 36)" will most probably meet my requirements! Thanks. I will take a deeper look into Long65 and see whether it's more memory-intensive than Long.toString(). I need to call this method to send hundreds of billions of lines, so I want to avoid GC as much as possible. – jmspaggi Dec 05 '13 at 01:35
  • I made a change to Long65 since I noticed it required `BigInteger` in order to support the full range of a `long` int. The savings in string is 0 to 2 characters per number. – dansalmo Dec 05 '13 at 01:42
  • Base64 is the standard way to do this. – Hot Licks Dec 05 '13 at 01:43
  • Base64 is not an answer. It is just a phrase. Please provide an example on how to use it to solve the problem. Two of the characters used in Base64 must be escaped. Is there a Base64 encoder in Java? – dansalmo Dec 05 '13 at 02:03
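On the question of a Base64 encoder in Java: since Java 8 the standard library ships `java.util.Base64`, and its URL-safe variant replaces `+` and `/` with `-` and `_`, so nothing needs escaping. The following is a rough sketch only; the class `LongBase64` and its method names are made up for illustration. It encodes the big-endian bytes of the long after dropping leading zero bytes, so small non-negative values stay short.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Base64;

final class LongBase64 {
    /** Encodes value as unpadded URL-safe Base64 of its significant bytes. */
    static String encode(long value) {
        byte[] bytes = ByteBuffer.allocate(Long.BYTES).putLong(value).array();
        // Skip leading zero bytes, but always keep at least one byte.
        int first = 0;
        while (first < bytes.length - 1 && bytes[first] == 0) first++;
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(Arrays.copyOfRange(bytes, first, bytes.length));
    }

    /** Rebuilds the long from the bytes, treating them as big-endian. */
    static long decode(String text) {
        byte[] bytes = Base64.getUrlDecoder().decode(text);
        long value = 0;
        for (byte b : bytes) value = (value << 8) | (b & 0xFF);
        return value;
    }
}
```

With this scheme `12345678901L` occupies 5 bytes and therefore 7 characters, the same length as the base-36 form; negative values always use all 8 bytes and come out at 11 characters. It also allocates a few temporary arrays per call, which may matter for very high volumes.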

Use a different base for your number if you want a shorter representation.

Blub

The bigger the numeric base you use, the shorter the representation. You can try base 16 for example:

Long.toString(num, 16)

For non-negative values this returns a string of at most 16 characters; negative values get a leading '-', so Long.MIN_VALUE takes 17.

If that's not short enough, you can build a representation in a bigger base. However, if the resulting string needs to be URL escaped, this may not be useful. In base 256 for example, 8 chars would be enough for any long, but many of the 256 chars need to be escaped, making the resulting text longer. So you have to choose your alphabet carefully if you decide to implement such an encoding/decoding scheme yourself.
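To get a feel for how much a bigger base actually buys, the worst-case length for a given base can be computed directly. The helper below is only an illustration (the class `DigitCount` and method `maxDigits` are hypothetical names); it counts how many digits `Long.MAX_VALUE` needs in a base, ignoring the sign character.

```java
final class DigitCount {
    /** Number of digits needed to write Long.MAX_VALUE in the given base (base >= 2). */
    static int maxDigits(int base) {
        int count = 0;
        for (long n = Long.MAX_VALUE; n > 0; n /= base) count++;
        return count;
    }
}
```

This gives 19 digits for base 10, 16 for base 16, 13 for base 36, 11 for base 64 or 65, and 10 for base 85, so beyond base 36 each extra step in alphabet size saves at most a character or two.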

Take a look at http://en.wikipedia.org/wiki/Base64 for example. You can use this Java implementation. You may also be interested in Base85 and its implementations.

Community
Eyal Schneider
  • Thanks for pointing to Base85... I will probably give it a try and see if the result is working for me. – jmspaggi Dec 05 '13 at 01:32

To reply to the comments about Base64 versus other options: it all depends on the constraints you have on the character set. I'm not talking about transmitting over a URL, but over a char stream which needs to be a text stream. I cannot simply send a byte array, since non-printable chars might cause issues.

So I built the class below, which converts a long to base 92 (almost). It uses all the printable chars except minus and pipe, which I use as delimiters.

It's almost a cut and paste of Long65, except that the list of valid digits is built dynamically. It can be reused for any base or any list of valid digits.

<!-- language: java -->
public class LongConverter {

  // Digit alphabet: printable ASCII (codes 32..126) except '-' (45) and '|' (124), 93 digits.
  private static final String URIchars;

  static {
    StringBuilder result = new StringBuilder();
    for (int i = 32; i < 127; i++) {
      if ((i != 45) && (i != 124))
        result.append((char)i);
    }
    URIchars = result.toString();
  }

  // Encodes aNumber with the alphabet above, most significant digit first.
  // Long.MIN_VALUE is not supported: negating it overflows and stays negative.
  public static String toString(Long aNumber) {
    int base = URIchars.length();
    StringBuilder result = new StringBuilder();
    if (aNumber < 0) {
      result.append('-');
      aNumber = -aNumber;
    }
    int r = (int) (aNumber % base); // least significant digit
    if (aNumber - r == 0) result.append(URIchars.charAt(r));
    else result.append(LongConverter.toString((aNumber - r) / base) + URIchars.charAt(r));
    return result.toString();
  }

  // Decodes a string produced by toString(); an optional leading '-' marks a negative value.
  public static long parseLong(String aNumber) {
    int base = URIchars.length();
    char[] digits;
    int sign = 1;
    if (aNumber.charAt(0) == '-') {
      sign = -1;
      digits = aNumber.substring(1).toCharArray();
    } else {
      digits = aNumber.toCharArray();
    }
    // Horner's scheme: never computes base^length, which would overflow a long
    // for the longest inputs.
    long total = 0;
    for (char digit : digits) {
      total = total * base + URIchars.indexOf(digit);
    }
    return sign * total;
  }
}
jmspaggi