0

Given an integer input var i = 1234567890 based on the output of a hash function, I need to create a shortened, case sensitive alphanumeric string. This is for purposes of having a short URL with a case-sensitive hashed parameter such as http://example.com/?hash=1M0nlPk.

JavaScript's i.toString(36) would use characters 0-9 and a-z. That's part of the way to a solution. However, I need to operate on a character set of unknown or configureable length, e.g. 012abcABC. How could I convert an int to a string consisting only of this character set?

UPDATE: I have improved the wording of this question. It is not a duplicate of Fastest way to convert a number to radix 64 in JavaScript? because the character set is arbitrary. Some of the answers in that question may be adaptable to this question, but I believe it to be a fundamentally different question.

Community
  • 1
  • 1
ryanm
  • 2,239
  • 21
  • 31
  • That all sounds well and dandy. What have _you_ tried so far? – Oka Feb 25 '16 at 20:13
  • 3
    Possible duplicate of [Fastest way to convert a number to radix 64 in JavaScript?](http://stackoverflow.com/questions/6213227/fastest-way-to-convert-a-number-to-radix-64-in-javascript) – JJJ Feb 25 '16 at 20:15
  • @oka The question already includes one example of what I've tried. If you have an answer, I would be interested to hear it. – ryanm Feb 25 '16 at 20:20
  • OP indicates what they've tried before - `i.toString(36)`.... – random_user_name Feb 25 '16 at 20:25
  • Potentially the answer here http://stackoverflow.com/a/27696695/3232832 could be modified for use with an arbitrary-length string. – ryanm Feb 25 '16 at 20:55
  • @oka I have tried a variety of hash functions, the best our use results in an int output. I have tried the `i.toString(36)`, but it doesn't fit the bill because I'd like to create a shorter string with broader character set. I have also tried variations of `window.btoa` BASE64 - this gives me closer to the character set I want, but it is not as short as I'd like. Does that help? I could resolve the question myself, but I was hoping the community would help faster. At this point, the SO community seems hostile to this question, and that's no fun, so I am considering withdrawing it. – ryanm Feb 26 '16 at 17:18
  • @ryanm The issue here is that you really need to include all that information in the question. All your applicable attempts. You need to show some effort towards solving the _actual_ problem (being a base62 or base*N* encoding of integer values) on your own. Saying ''Oh, but I tried a base36 encoding, and that didn't give me what I need!" is not an acceptable way to ask for help. No one is trying to be hostile, it's just that your question is just asking _too much_ of folk. – Oka Feb 26 '16 at 18:38
  • @ryanm Myself and many others would happily help you when it comes to understanding where you've gone wrong in _implementing_ something, but we won't straight up write code from scratch for you. That's just not how this website should operate, in principle anyway. The other question linked here, and the answer you found in it, should give you plenty of jumping off points for going about implementing this. – Oka Feb 26 '16 at 18:40
  • @Oka Thank you for explaining, I think I understand: you were responding to my tone "I'd like a function" (worded poorly, no doubt) and I was responding to your tone "that's well and dandy". Looking at http://stackoverflow.com/questions/6213227/fastest-way-to-convert-a-number-to-radix-64-in-javascript, he has not given code, yet it is considered a great question and has great answers. http://stackoverflow.com/help/how-to-ask does not indicate you have to include code, but certainly I need to get better at asking. I thought it was a generally interesting question, which is why I posted. – ryanm Feb 26 '16 at 19:52
  • @ryanm I think the _topic_ is interesting, but it doesn't make for a very good 'broad' question, considering it is just a matter of implementation - the algorithm has already been shown/explained. I wouldn't describe the linked question as being asked particularly well either, but it's old, and we tend to overlook these things when it comes to older questions. If you still need help implementing this, check out some modules like [`base-n`](https://github.com/DonutEspresso/base-n), and study their source code. It really is just a matter of substituting in some N value in place of 64. – Oka Feb 28 '16 at 02:46
  • @Oka thanks for the clues to base-n libraries. I have answered my own question, hopefully in "good SO form" crediting original authors but extending the solution. – ryanm Feb 29 '16 at 13:37

1 Answers1

3

This is a variant of a "Base64" conversion question that can be answered by "base n" libraries. However, these libraries may be "overkill" for this question, so below is modified code based on a simple & elegant solution by @Reb.Cabin. Credit also to editors @callum, @Philip Kaplan, @Oka on this code.

In this response, vowels and various "problem letters" commonly used to create curse words are removed, so a random integer hash will not create an offensive short URL.

// Based on Base64 code by @Reb.Cabin, edits by @callum, @philip Kaplan, @Oka available at https://stackoverflow.com/a/6573119/3232832
BaseN = {
    _Rixits :
//   0       8       16      24      32      40      48      56     63
//   v       v       v       v       v       v       v       v      v
    "0123456789BDGHJKLMNPQRTVWXYZbdghjklmnpqrtvwxyz-_",
//  original base64
//  "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-_",
    // You have the freedom, here, to choose the glyphs you want for 
    // representing your base-64 numbers.
    // This cannot handle negative numbers and only works on the 
    //     integer part, discarding the fractional part.
    fromNumber : function(number) {
        if (isNaN(Number(number)) || number === null ||
            number === Number.POSITIVE_INFINITY)
            throw "The input is not valid";
        if (number < 0)
            throw "Can't represent negative numbers now";

        var rixit; // like 'digit', only in some non-decimal radix 
        var residual = Math.floor(number);
        var result = '';
        var rixitLen = this._Rixits.length;
        while (true) {
            rixit = residual % rixitLen;
            result = this._Rixits.charAt(rixit) + result;
            residual = Math.floor(residual / rixitLen);

            if (residual === 0)
                break;
            }
        return result;
    },

    toNumber : function(rixits) {
        var result = 0;
        for (var e = 0; e < rixits.length; e++) {
            result = (result * this._Rixits.length) + this._Rixits.indexOf(rixits[e]);
        }
        return result;
    }
};

var i = 1234567890;
var encoded = BaseN.fromNumber(1234567890);
var decoded = BaseN.toNumber(encoded);
document.writeln('Given character set "' + BaseN._Rixits + '", the number ' + i + ' is encoded to ' + encoded + ' then back again to ' + decoded + '.');
Community
  • 1
  • 1
ryanm
  • 2,239
  • 21
  • 31
  • Good stuff! There are some performance gains to be had by [precomputing the inverse](https://jsfiddle.net/5gfkc701/) of your set of characters for a mapped lookup, rather than using `indexOf` to traverse the set each time. You can also drop the string split since array and string iteration work the same, and use the `rixitLen` in the conditional part of the for loop. (My edit on the original answer was more about fixing some awful global leakage, and an ugly iteration, than performance.) Nice work putting the pieces in place, though. – Oka Feb 29 '16 at 19:32
  • @Oka thanks for the tips! I may loop back and integrate all your recommendations into this answer (I have a few ideas to tighten it up also), or feel free to edit directly if you'd like, no pressure. Cheers! – ryanm Mar 04 '16 at 16:46