
I've come across some nice base64 encoding implementations in JS; their typical job is to take UTF-8-encoded text input and produce base64 output (and vice versa). But I'm surprised I've never seen a suitable solution for base32! This is all I've found:
1. agnoster/base32-js. This is for nodejs, and its main base32.encode function takes input as a string.
2. base32-encoding-in-javascript. This takes input as a string, too. Moreover, it lacks a decoder.
But I need the script to take its input as hex (or even base64)! If my input is hex, the base32 output will be shorter than the input; if my input is base64, then, according to Wikipedia, the output will be 20% larger. That's what I expect.
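The size expectation above can be checked with a little arithmetic. This is only a sketch: `encodedLengths` is a hypothetical helper name, and it assumes padded output (base32 padded to a multiple of 8 characters, base64 to a multiple of 4).

```javascript
// Encoded length in characters for n input bytes, assuming "=" padding
// is applied to base32 and base64 output.
function encodedLengths(n) {
    return {
        hex: 2 * n,                    // 4 bits per character
        base32: 8 * Math.ceil(n / 5),  // each 40-bit group -> 8 characters
        base64: 4 * Math.ceil(n / 3)   // each 24-bit group -> 4 characters
    };
}

console.log(encodedLengths(15));
// { hex: 30, base32: 24, base64: 20 }
```

For 15 bytes, base32 (24 characters) is 20% shorter than hex (30) and 20% longer than base64 (20). For byte counts that are not a multiple of 5, padding hides part of that saving.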
Given the alphabet "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567":

hexdata: 12AB3412AB3412AB3412AB34;

//RFC 3548 chapter 5: The encoding process represents 40-bit groups of input bits 
//as output strings of 8 encoded characters. 

 bin     b32
00010 --> C
01010 --> K
10101 --> V
10011 --> T
01000 --> I
00100 --> E
10101 --> V
01011 --> L
//+40 bits
00110 --> G
10000 --> Q
01001 --> J
01010 --> K
10110 --> W
01101 --> N
00000 --> A
10010 --> S
//+16 bits
10101 --> V  //RFC 3548 chapter 5 case 3:
01100 --> M  //the final quantum of encoding input is exactly 16 bits; 
11010 --> 2  //here, the final unit of encoded output will be four characters 
0            //followed by four "=" padding characters
//zero bits are added (on the right) to form an integral number of 5-bit groups:
00000 --> A
--> base32data: CKVTIEVLGQJKWNASVM2A====  

I'd like to see JavaScript functions where hextobase32("12AB3412AB3412AB3412AB34") yields CKVTIEVLGQJKWNASVM2A==== and base32tohex("CKVTIEVLGQJKWNASVM2A====") returns 12AB3412AB3412AB3412AB34.
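For illustration, here is a minimal sketch of what such a pair of functions could look like. It follows the bit-grouping walk-through above and uses the same alphabet. It is not taken from any of the libraries mentioned, and it assumes well-formed input (an even-length hex string, and base32 text with optional trailing "=").

```javascript
var ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

// Hex string -> padded base32 string.
function hextobase32(hex) {
    if (hex.length % 2 !== 0 || /[^0-9a-fA-F]/.test(hex)) {
        throw new Error("expected an even-length hex string");
    }
    var bits = 0, value = 0, out = "";
    for (var i = 0; i < hex.length; i += 2) {
        value = (value << 8) | parseInt(hex.substr(i, 2), 16);
        bits += 8;
        while (bits >= 5) {               // emit every complete 5-bit group
            out += ALPHABET.charAt((value >>> (bits - 5)) & 31);
            bits -= 5;
        }
        value &= (1 << bits) - 1;         // keep only the unconsumed bits
    }
    if (bits > 0) {                       // zero-fill the last group on the right
        out += ALPHABET.charAt((value << (5 - bits)) & 31);
    }
    while (out.length % 8 !== 0) out += "=";
    return out;
}

// Base32 string (padding optional) -> uppercase hex string.
function base32tohex(b32) {
    var clean = b32.replace(/=+$/, "").toUpperCase();
    var bits = 0, value = 0, hex = "";
    for (var i = 0; i < clean.length; i++) {
        var idx = ALPHABET.indexOf(clean.charAt(i));
        if (idx === -1) throw new Error("invalid base32 character: " + clean.charAt(i));
        value = (value << 5) | idx;
        bits += 5;
        if (bits >= 8) {                  // a full byte is available
            var byte = (value >>> (bits - 8)) & 255;
            hex += (byte < 16 ? "0" : "") + byte.toString(16).toUpperCase();
            bits -= 8;
            value &= (1 << bits) - 1;
        }
    }
    return hex;                           // leftover fill bits (< 8) are dropped
}

console.log(hextobase32("12AB3412AB3412AB3412AB34")); // CKVTIEVLGQJKWNASVM2A====
console.log(base32tohex("CKVTIEVLGQJKWNASVM2A===="));  // 12AB3412AB3412AB3412AB34
```

The accumulator never holds more than 12 bits, so ordinary 32-bit bitwise operators are enough regardless of input length.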

UPDATE
In addition to agnoster/base32-js, which doesn't seem to handle padding, I found the following libs:
1. Nibbler. According to Wikipedia, there are two ways to encode: 8-bit and 7-bit. This lib even has a dataBits option (maybe it's meant only for base64, not for base32, I don't know) to choose between the 8-bit and 7-bit ways! But this project is not evolving at all. And one more thing: reading the comments, I see that this lib also has padding issues!
2. Chris Umbel's thirty-two.js. This lib carries the whole byte table with it for decoding. And you can see this interesting comment in the source code:
/* byte by byte isn't as pretty as quintet by quintet but tests a bit faster. will have to revisit. */
But it is not evolving either.
3. jsliquid.Data. Operates on so-called binary large objects. Seems to get the job done, but since its code is heavily obfuscated, I can't even see how to define my custom alphabet.
And now I think that a fully functional JavaScript UTF-8/hex/base32/base64 library of reliable quality would be great, but currently the situation is dubious.

lyrically wicked

1 Answer


Well, the first one (the node.js library) takes its input as a binary string; what you want is for it to take input in base16 or base64. Since you already have nice base64 implementations, and a base16 decoder is dead simple to write, I think you're set.

https://github.com/agnoster/base32-js/blob/master/lib/base32.js Also works for browsers out of the box.

So you'd use it like this in browser:

var result = base32.encode(base64decode(base64input));
var result2 = base32.encode(base16decode(base16input));
var result3 = base32.encode(binaryInput);

Where base16decode:

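// Converts a hex string into a binary string (one character per byte).
function base16decode( str ) {
    // Each pair of hex digits becomes the character with that byte value.
    return str.replace( /([A-Fa-f0-9]{2})/g, function( m, g1 ) {
        return String.fromCharCode( parseInt( g1, 16 ));
    });
}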
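For the reverse direction (turning a decoded binary string back into hex), a matching helper could look like the sketch below. `base16encode` is a hypothetical name chosen here to mirror `base16decode`; it is not part of the base32 library.

```javascript
// Converts a binary string (one character per byte) into an uppercase hex string.
function base16encode( bin ) {
    var hex = "";
    for ( var i = 0; i < bin.length; i++ ) {
        var code = bin.charCodeAt( i ) & 0xff;   // keep only the low byte
        hex += ( code < 16 ? "0" : "" ) + code.toString( 16 ).toUpperCase();
    }
    return hex;
}

console.log( base16encode( "\x12\xAB\x34" ) ); // 12AB34
```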

http://jsfiddle.net/YPuF3/1/

Esailija
  • Sorry, I don't understand. Base32-encoded data must be **SHORTER** than hex data! But using agnoster-base32, test this code: `alert(base32.encode("12AB3412AB3412AB3412AB3412AB3412AB3412AB3412AB3412AB34"))`, take a look at "64t42ghk6grk4ga26cu32cj188tk8c9j85136d1h690m4ctm64t42ghk6grk4ga26cu32cj188tk8c9j85136d0", which is **LARGER**, because the encoding function doesn't get its input as binary data (see `this.readByte = function(byte) { if (typeof byte == 'string') byte = byte.charCodeAt(0)...}` part in the source code of agnoster-base32 script) – lyrically wicked Jan 14 '13 at 03:58
  • @lyricallywicked See http://jsfiddle.net/YPuF3/ .... wait, is that supposed to be a hex string? The whole point of my answer is to use base16decode before passing base 16 input... http://jsfiddle.net/YPuF3/1/ – Esailija Jan 14 '13 at 04:06
  • In this example, `base32.encode("12AB3412AB34")` leads to `Base32 :64t42ghk6grk4ga26cu0`. This means that the data wasn't taken as binary, but I want the input to mean **HEX**! – lyrically wicked Jan 14 '13 at 04:16
  • @lyricallywicked for the third time, base16decode it before passing to base32 encode http://jsfiddle.net/YPuF3/2/ – Esailija Jan 14 '13 at 04:18
  • I'll dig into this later, but I see you use `parseInt`... I don't like it. Please tell me, will this code work with the string of **ANY** size, like this: `base32.encode( base16decode("12AB3412AB3412AB3412AB3412AB3412AB3412AB3412AB3412AB3412AB3412AB3412AB3412AB3412AB3412AB3412AB3412AB3412AB34"))`, what will the result look like? – lyrically wicked Jan 14 '13 at 04:38
  • @lyricallywicked [2ank84nb6g9apd0jncu15atm2ank84nb6g9apd0jncu15atm2ank84nb6g9apd0jncu15atm2ank84nb6g9apd0](http://jsfiddle.net/YPuF3/3/) I think I am wasting time here, you are obviously not listening at all. Everything you needed is already in the answer. Just take it. – Esailija Jan 14 '13 at 04:45
  • Yeah, yeah, sorry, I just couldn't test this right now... Yes, that's the correct result, thanks! – lyrically wicked Jan 14 '13 at 04:52
  • @lyricallywicked Please read the library, the first snippet of code is the alphabet that you can change (because there is so many correct ones for Base32)... change `var alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567'` – Esailija Jan 15 '13 at 03:56
  • I don't believe I forgot to look at the used alphabet, it's just my haste, sorry, sorry, sorry :( But one thing remains unclear: why doesn't this lib take care about padding chars? – lyrically wicked Jan 17 '13 at 02:42
  • For anyone interested, base64 padding is legacy cruft, and unnecessary in the algorithm. http://stackoverflow.com/questions/4322507/why-padding-is-used-in-base64-encoding – Michael Cole Feb 18 '15 at 00:28