8

Currently I am working on an algorithm to encode an arbitrary string, which may contain any possible character, as a Base36 string.

I have tried the following but it doesn't work.

public static String encode(String str) {
    return new BigInteger(str, 16).toString(36);
}

I guess it's because the input is not a hex string. If I pass the string "Hello22334!", I get a NumberFormatException.

My approach would be to convert each character to a number, convert those numbers to their hexadecimal representation, and then convert the resulting hex string to Base36.

Is my approach okay or is there a simpler or better way?
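For illustration, the approach described above could be sketched roughly as follows. This is only a sketch under assumptions: the class and method names (`HexToBase36Sketch`, `encodeViaHex`) are hypothetical, and UTF-8 is assumed as the way to turn characters into numbers.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

final class HexToBase36Sketch {
    // Hypothetical helper: characters -> bytes -> hex string -> Base36.
    static String encodeViaHex(String str) {
        // Assumption: UTF-8 is used to map characters to numbers.
        byte[] bytes = str.getBytes(StandardCharsets.UTF_8);
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            // Two hex digits per byte, treating the byte as unsigned.
            hex.append(String.format("%02x", b & 0xff));
        }
        // The input now really is a hex string, so radix 16 can parse it.
        return new BigInteger(hex.toString(), 16).toString(36);
    }
}
```

With this sketch, `encodeViaHex("A")` yields `"1t"` (0x41 is 65, which is 1·36 + 29). An empty input would still throw a NumberFormatException, because `BigInteger` rejects a zero-length string.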

Patrick Vogt
  • I don't get how "each possible character" and using `BigInteger` with base 16 should fit together. You'll probably want to convert the string to bytes first and convert those. Just keep in mind that the byte representation of a string depends on the encoding that is used and that if you don't provide an encoding the system default will be used (and this can change when running on a different system). – Thomas Jan 13 '17 at 11:43
  • I just tried it and it doesn't work. The problem is, I know of no possible solution. – Patrick Vogt Jan 13 '17 at 11:45
  • You could have a look at how `java.util.Base64` is implemented and adapt that to base 36. – Thomas Jan 13 '17 at 11:48
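To illustrate the point about encodings made in the comments, here is a small sketch showing that the same string produces different bytes under different charsets. The class and method names are hypothetical, and the character is written as a Unicode escape so the example does not depend on the source file's encoding.

```java
import java.nio.charset.StandardCharsets;

final class CharsetBytesDemo {
    // Bytes of the string when encoded as UTF-8.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Bytes of the same string when encoded as ISO-8859-1 (Latin-1).
    static byte[] latin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }
}
```

For `"\u00e9"` (é), `utf8` returns two bytes (0xC3, 0xA9) while `latin1` returns a single byte (0xE9), so any Base36 value derived from the bytes depends on the charset that was chosen.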

2 Answers

15

First you need to convert your string to a number, represented by a sequence of bytes. That is what an encoding is for; I highly recommend UTF-8.

Then you need to convert that number (the byte sequence) to a string in base 36.

byte[] bytes = string.getBytes(StandardCharsets.UTF_8); 
String base36 = new BigInteger(1, bytes).toString(36);

To decode:

byte[] bytes = new BigInteger(base36, 36).toByteArray();
// Thanks to @Alok for pointing out the need to remove leading zeroes.
// toByteArray() may prepend a 0x00 sign byte, so skip it before decoding.
int zeroPrefixLength = zeroPrefixLength(bytes);
String string = new String(bytes, zeroPrefixLength, bytes.length - zeroPrefixLength, StandardCharsets.UTF_8);

private int zeroPrefixLength(final byte[] bytes) {
    for (int i = 0; i < bytes.length; i++) {
        if (bytes[i] != 0) {
            return i;
        }
    }
    return bytes.length;
}
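Put together, the encode and decode steps above might look like the following self-contained sketch. The class name `Base36Codec` and its method names are hypothetical, chosen only for illustration. One limitation to be aware of: because the bytes are treated as a single number, leading NUL characters (`\u0000`) in the input are not preserved.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

final class Base36Codec {
    // Encode: string -> UTF-8 bytes -> non-negative number -> Base36 digits.
    static String encode(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        // Signum 1 forces a non-negative value even if the first byte has its high bit set.
        return new BigInteger(1, bytes).toString(36);
    }

    // Decode: Base36 digits -> number -> bytes (minus any sign byte) -> string.
    static String decode(String base36) {
        byte[] bytes = new BigInteger(base36, 36).toByteArray();
        int start = 0;
        // Skip leading zero bytes, e.g. the 0x00 sign byte toByteArray() may add.
        while (start < bytes.length && bytes[start] == 0) {
            start++;
        }
        return new String(bytes, start, bytes.length - start, StandardCharsets.UTF_8);
    }
}
```

For example, `Base36Codec.decode(Base36Codec.encode("Hello22334!"))` gives back `"Hello22334!"`, and a string starting with a non-ASCII character such as `"\u00e9"` also round-trips because the sign byte is skipped.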
Christoffer Hammarström
  • While this picks up using `BigInteger`, I consider that a weird approach to begin with. – greybeard Jan 13 '17 at 11:51
  • No problem. :) I did edit the answer slightly. You can pass `1` as the first parameter to the BigInteger constructor to make it always a positive number. Please mark the answer accepted if it answered your question. :) – Christoffer Hammarström Jan 13 '17 at 11:51
  • @alalamin: Something like `new String(new BigInteger(base36, 36).toByteArray(), StandardCharsets.UTF_8)` should do it, but I haven't tested it. – Christoffer Hammarström Oct 18 '17 at 11:23
  • @ChristofferHammarström thank you very much. It worked perfectly. – Al-Alamin Oct 19 '17 at 02:58
  • The decoding might not always give you the same result, as BigInteger prefixes the byte array with 0x00 in some cases. – Alok Mar 23 '18 at 02:25
  • @Alok Thanks, good to know. I didn't actually test decoding myself. – Christoffer Hammarström Mar 26 '18 at 07:34
  • @ChristofferHammarström in the UTF_8 case, the leading 0 case might not get triggered. But in general, you most definitely shouldn't use BigInteger to convert from/to arbitrary bases. – Alok Mar 26 '18 at 17:30
  • @Alok "But in general, you most definitely shouldn't use BigInteger to convert from/to arbitrary bases." -- Why not, and what should be used instead? Just converting bases with `BigInteger` does not involve a byte array. – Christoffer Hammarström Mar 26 '18 at 18:40
  • @ChristofferHammarström When dealing with different bases, you often end up with byte arrays/strings. A simple example with the leading 0 can be seen with: `BigInteger.valueOf(0x90).toByteArray()` results in `{0, -112}`. Your answer fails with: `new String(new BigInteger(new BigInteger(1, "é".getBytes(StandardCharsets.UTF_8)).toString(36), 36).toByteArray(), StandardCharsets.UTF_8)` – Alok Mar 27 '18 at 18:28
  • @Alok: You can convert bases without using a byte array. Converting bases isn't the problem, the byte array is. `new BigInteger("123", 10).toString(36)` works fine. – Christoffer Hammarström Mar 28 '18 at 07:07
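The leading-zero behaviour discussed in these comments can be reproduced with a short sketch. The class and method names are hypothetical; `BigInteger.valueOf` is used because `BigInteger` has no public constructor that takes an `int`.

```java
import java.math.BigInteger;

final class SignByteDemo {
    // Returns the two's-complement bytes BigInteger produces for a value.
    static byte[] twosComplement(long value) {
        return BigInteger.valueOf(value).toByteArray();
    }
}
```

`twosComplement(0x90)` returns `{0, -112}`: the value needs a leading 0x00 so that the high bit of 0x90 is not read as a sign bit. `twosComplement(0x7f)` returns `{127}` with no extra byte. This is why decoding the byte array directly can yield a stray `\u0000` at the start of the string.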
2

From Base10 to Base36

public static String toBase36(String str) {
    try {
        // Parses a decimal string; only values that fit in a long are supported.
        return Long.toString(Long.parseLong(str), 36).toUpperCase();
    } catch (NumberFormatException | NullPointerException ex) {
        ex.printStackTrace();
    }
    return null;
}

From Base36 to Base10

public static String fromBase36(String b36) {
    try {
        BigInteger base = new BigInteger(b36, 36);
        return base.toString(10);
    } catch (Exception e) {
        e.printStackTrace();
    }
    return null;
}
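As a variation on the two methods above, both directions could use `BigInteger`, so that decimal strings larger than a `long` also work. This is only a sketch: the class name `DecimalBase36` is hypothetical, exceptions are left to propagate instead of being caught, and `Locale.ROOT` is an added assumption to keep upper-casing independent of the default locale.

```java
import java.math.BigInteger;
import java.util.Locale;

final class DecimalBase36 {
    // Decimal digits -> upper-case Base36 digits.
    static String toBase36(String decimal) {
        return new BigInteger(decimal, 10).toString(36).toUpperCase(Locale.ROOT);
    }

    // Base36 digits (either case) -> decimal digits.
    static String fromBase36(String b36) {
        return new BigInteger(b36, 36).toString(10);
    }
}
```

For example, `toBase36("123")` gives `"3F"` and `fromBase36("3F")` gives `"123"`. Like the original methods, this only handles strings of decimal digits, not arbitrary text.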