5

I have been trying to reduce the length of the way. I represent some integer ID's in my program. For Example

2
3
15
26
63
...
151564852

I would like them to be represented as such (0-9A-Za-z only)

2
3
F
Q
z
...
vDF25a   //For example

The approach I thought of is to have 63 if statements which each of the mappings from 0-63 to 0-z respectively and for anything above 64 do a recursion on the value minus 63.

Needless to say, I think my approach is very flawed and impractical. What would be a more appropriate way of doing it?


Update:

Following fge's suggestion I've got the encoder to work correctly, however my decode function only works for up-to length 2 strings, in cases where the string is larger the sum becomes erroneous. For example for 3840 to 3845 this is the output

// Encoded 
zw
x
zy
zz
100

// Decoded
3840
3841
3842
3843
124         //Invalid decoding

Here is my code for the decode function

public static int decode(String value)
{
    String revStr = new StringBuilder(value).reverse().toString();

    int sum = 0;



    for (int i=1; i < revStr.length(); i++)
    {
        for (int j=0; j < ALPHABET.length; j++)
        {
            if (ALPHABET[j] == revStr.charAt(i))
            {
                sum += (ALPHABET.length * j) * i;
                break;
            }
        }
    }

    for (int j=0; j < ALPHABET.length; j++)
    {
        if (ALPHABET[j] == revStr.charAt(0))
        {
            sum += j;
            break;
        }
    }

    return sum;
}
pondigi
  • 856
  • 2
  • 8
  • 14
  • Well hex is something similar, but just using 0-9A-F ? Can you convert it to hex? – dkasipovic Feb 26 '14 at 13:57
  • @D.Kasipovic, valid point, but i would prefer my suggested encoding as it will reduce the length of an ID further. Thanks. – pondigi Feb 27 '14 at 12:25
  • Yes, but HEX is just base 16 numerical system where it the numbers greater than 9 are represented with letter. Similar, you could use base convert to 63 or 64, and represent numbers greater than 9 by letters. There is actually an example (would you believe), matching your question, at http://korn19.ch/coding/base_converter.php and I quote: `Base-2 to base-62 are accepted. "A" stands for 10, "Z" for 35, "a" (lower-case) for 36 and "z" (lower-case) for 61. Decimals are supported` – dkasipovic Feb 27 '14 at 12:32

4 Answers4

5

This is not base64; base64 encodes binary data.

Anyway, you don't need a s*load of if statements; use an array:

public final class AlphabetEncoder
{
    private static final char[] ALPHABET = { '0', '1', '2', ...., 'z' };
    private static final int ENCODE_LENGTH = ALPHABET.length;

    public static String encode(int victim)
    {
        final List<Character> list = new ArrayList<>();

        do {
            list.add(ALPHABET[victim % ENCODE_LENGTH]);
            victim /= ENCODE_LENGTH;
        } while (victim > 0);

        Collections.reverse(list);
        return new String(list.toArray(new char[list.size()],
            StandardCharsets.UTF_8);
    }

    public int decode(final String encoded)
    {
        int ret = 0;
        char c;
        for (int index = 0; index < encoded.length(); index++) {
            c = encoded.charAt(index);
            ret *= ENCODE_LENGTH;
            ret += Arrays.binarySearch(ALPHABET, c);
       }
       return ret;
    }
}

NOTE ABOUT THE DECODE FUNCTION: it is possible to use Arrays.binarySearch() here since the alphabet has the nice property of being naturally sorted (0 < 1 < 2 < ... < z). However, a test should probably be added that its return code not be negative!

Tim Hovius
  • 721
  • 6
  • 16
fge
  • 119,121
  • 33
  • 254
  • 329
  • thanks! Any chance you could see the update on my question, and possibly guide me as to where is my function wrong? :) Thanks – pondigi Feb 27 '14 at 12:26
  • 1
    Note: could be better. We could use a `List` instead of an array. The code in the decode function would then be MUCH simpler... – fge Feb 27 '14 at 12:43
  • 1
    @pondigi Look at the edit again, there is in fact a MUCH more simple way – fge Feb 27 '14 at 13:18
  • 1
    @pondigi before you thank me, check that the decode actually works... I wrote it "out of thin air", remember – fge Feb 27 '14 at 13:30
  • Very helpful! Here's a PHP version: https://gist.github.com/mattparlane/6b2b63aec32470b3b5ed27696e398f77 – Matt Parlane Jan 31 '18 at 13:42
4

You can refer to already existing logic for converting from Decimal [0-9] to Hexadecimal conversion present in Integer class and extend the logic for your Base 64 converison. Refer

Integer.toHexString(int i)

This maybe the efficient implementation for conversion.

Uday Shankar
  • 844
  • 7
  • 20
2

Depending on the language you use, there should already be a module which converts a string from/to base64. Check this other post: Base64 Encoding in Java

Community
  • 1
  • 1
ctutte
  • 131
  • 5
1

My thanks to the @fge answer. Using it with some changes I've get it to support much larger integers with BigInteger, added support of negative integers and changed Arrays.binarySearch() with HashMap.

BTW, it should be called base62 encoding because [0-9A-Za-z] contains just 62 chars.

public class Base62{

    private static final char[] ALPHABET = new char[ 62 ];
    private static final Map<Character, Integer> ALPHABET_MAPPING = new HashMap<>();
    private static final BigInteger ENCODE_LENGTH = BigInteger.valueOf( ALPHABET.length );

    static{
        int position = 0;
        // numbers
        for( int i = 48; i <= 57; i++ ){
            ALPHABET[ position++ ] = (char)i;
        }
        // uppercase letters
        for( int i = 65; i <= 90; i++ ){
            ALPHABET[ position++ ] = (char)i;
        }
        // lowercase letters
        for( int i = 97; i <= 122; i++ ){
            ALPHABET[ position++ ] = (char)i;
        }
        for( int i = 0; i < ALPHABET.length; i++ ){
            ALPHABET_MAPPING.put( ALPHABET[ i ], i );
        }
    }

    public static String encode( final BigInteger in ){
        final List<Character> list = new ArrayList<>();
        boolean negative = in.signum() == -1;
        BigInteger use;
        if( negative ){
            use = in.negate();
        } else {
            use = in;
        }

        do{
            BigInteger[] divisionResultAndReminder = use.divideAndRemainder( ENCODE_LENGTH );
            list.add( ALPHABET[ divisionResultAndReminder[ 1 ].intValue() ] );
            use = divisionResultAndReminder[ 0 ];
        } while( use.equals( BigInteger.ZERO ) == false );

        Collections.reverse( list );

        char[] res = new char[ list.size() ];
        for( int i = 0; i < list.size(); i++ ){
            res[ i ] = list.get( i );
        }

        return ( negative ? "-" : "" ) + new String( res );
    }

    public static BigInteger decode( final String encoded ){
        BigInteger res = BigInteger.ZERO;
        char c;
        boolean negative;
        String use;
        if( '-' == encoded.charAt( 0 ) ){
            negative = true;
            use = encoded.substring( 1 );
        } else {
            negative = false;
            use = encoded;
        }

        for( int index = 0; index < use.length(); index++ ){
            c = use.charAt( index );
            res = res.multiply( ENCODE_LENGTH );
            res = res.add( BigInteger.valueOf( ALPHABET_MAPPING.get( c ) ) );
        }
        return negative ? res.negate() : res;
    }
}
Igor
  • 608
  • 6
  • 11