16

This is nearly a cross post from Math SE - while the explanation of my problem is identical, on Math.SE I was asking for a mathematical solution to my problem.

My issue is, the solution I got on Math.SE was "convert to base 35" which is probably a very good answer, but I am truly horrible with math and don't understand how to apply the solution in my code. I tried looking up a lesson on converting to different bases, and it's pretty confusing to me. Even looking at a question about converting numbers to bases in JavaScript didn't make it clear how exactly I'd use it for what I need to do.

Is there a simple way to handle this in JavaScript? Here's the question in full:

I have an unusual programming problem and the math side of it has me stumped.

I've generated a unique string of seven characters which are each randomly selected from these possibilities: ABCDEFGHIJKLMNOPQRSTUVWXYZ123456789 for example A6HJ92B and I need to convert it to a unique number value. When converted, no two versions of this random string can be the name number.

I could just generate a number rather than including letters in the original id, but of course that means I have to increase the length of my string, and it's possible that the user of my application may want to type this string, as it identifies his "session" in an application, so I want to keep it short.

So my idea was to build a table like this:

A : 1,
B : 2,
C : 3,
D : 4,
E : 5,
F : 6,
G : 7,
H : 8,

... you get the idea ...

5 : 31,
6 : 32,
7 : 33,
8 : 34,
9 : 35

And then I'd add all of the numbers up...

A6HJ92B:

A : 1
6 : 32
H : 8
J : 10
9 : 35
2 : 28
B : 2

1+32+8+10+35+28+2 = 116

...but I realized this is a flawed idea because many possible strings will "collide" or equal the same number. I need each unique string to equal a unique number.

So even if I multiplied each character's value (1*32*8*10*35*28*2 = 5,017,600), I'm thinking there might be possible collisions there too.

Is there a way to calculate this in a way that eliminates collisions? If the collisions cant be eliminated, what methods can I use to minimize them?

Community
  • 1
  • 1
  • How about the bignumber.js? https://mikemcl.github.io/bignumber.js/ Made a quick fiddle: https://jsfiddle.net/7u0c2ddp/ – Gjermund Dahl Mar 05 '16 at 23:42
  • There is an infinite number ways of such conversion. The simplest way is: You already have to represent your string in a memory as zeros and ones. Just look at it as a long binary number. – Ivan Kuckir Mar 05 '16 at 23:57
  • @IvanKuckir Given that there is a finite set of characters, the strings have a finite number of characters, and in JS numbers are stored using a finite amount of memory, I don't think there are infinite ways to do the conversion. There are lots of them indeed, though. – Oriol Mar 06 '16 at 00:03
  • All methods below lead to long numbers too. E.g. there are 256 * 256 two-character strings, you must use 256*256 different numbers just to represent 2-character strings, which leads to mumbers 0, 1, 2, ... 65535. – Ivan Kuckir Mar 06 '16 at 00:04
  • @Oriol There is an infinite number of strings ("a", "aa", "aaa" ...). You can not represent all of them inside a classic Javascript Number (64-bit float). And lets say you have some conversion. You can always attach 0 to the output number and get another way of conversion, you can atach two zeros, three zeros ... and here you get infinitely many ways. – Ivan Kuckir Mar 06 '16 at 00:06
  • 1
    @IvanKuckir In ES6, [strings](http://www.ecma-international.org/ecma-262/6.0/#sec-6.1.4) have a maximum of 2⁵³-1 characters, so there is a finite number of them. It would still be impossible to map them injectively to numbers, but the question restricts strings to 7 characters. – Oriol Mar 06 '16 at 00:14
  • I +1'd everyone, thanks for all of your answers. –  Mar 06 '16 at 01:28

4 Answers4

23

Basically, you want an injective transformation f : S → N, where S is the set of JS strings of length 7 with characters A-Z1-9, and N is the set of all JS numbers.

A possible approach is considering that the strings in S are a positional encoding of numbers, as you attempted.

However, in order to be injective (avoid collisions), you should multiply the value of each character by the basis to the power of the position.

For example, given the following table of character values

0 ⟶  0
1 ⟶  1
⋮      ⋮
9 ⟶  9
A ⟶ 10
B ⟶ 11
⋮      ⋮
Z ⟶ 35

A6HJ92B would become 10×36⁶ + 6×36⁵ + 17×36⁴ + 19×36³ + 9×36² + 2×36 + 11, that is, 22160072099.

You can do the conversion easily with parseInt and toString:

parseInt('A6HJ92B', 36); // 22160072099
(22160072099).toString(36).toUpperCase(); // "A6HJ92B"

If you want to use an arbitrary table of values, you will have to code the transformations manually.

Note that in JS, numbers are double-precision floats of 64 bits. That means there is a finite precision, and you can't store arbitrarily big integers. It won't work properly beyond this maximum

Number.MAX_SAFE_INTEGER; // 9007199254740991
Number.MAX_SAFE_INTEGER.toString(36).toUpperCase(); // "2GOSA7PA2GV"

But since your strings only have 7 characters, it should be enough.

Oriol
  • 274,082
  • 63
  • 437
  • 513
  • By the way guys, sorry to complicate things by leaving out the zero. The reason was a practical one: Since users (think your mother or father) who might not know the copy/paste function may decide to type this session id to share it in some way, I left out the zero so it wouldnt be confused with an O. It's not a requirement, but good for user experience. I also have a "copy" button so it really isnt that important. –  Mar 06 '16 at 01:13
  • Wow this makes it very simple. This worked for me. I just didnt realize `parseInt` would handle the conversion to numbers that simply... –  Mar 06 '16 at 01:27
  • @Viziionary For what it's worth, I think the matter of excluding easily-confused characters would make a decent separate followup question. (Perhaps one that already exists here.) Basically, you could normalize the string by replacing zeros with O's (or vice-versa), ones with I's, etc. before converting it into a number and, optionally, also after converting _from_ a number. – David Z Mar 06 '16 at 09:14
  • @DavidZ I dont understand what the issue with the missing characters is, since the accepted answer covers the list being in any form, fundamentally speaking –  Mar 06 '16 at 11:43
  • @Viziionary well, you mentioned earlier in the comments that you wanted to avoid using zero because it's easily confused with O. I was just thinking that, since this answer only explains how to convert strings to integers and back using the full range of digits and uppercase letters, there might be an interesting followup question about how to modify that procedure to avoid some of the 36 characters. It could be a self-answered question, even. Of course, there's no need to post it. – David Z Mar 06 '16 at 12:00
  • @cFreed as you changed your mind, youd could delete your comments that are no more pertinent and avoid embarrassing yourself – edc65 Mar 06 '16 at 14:15
6

Essentially what you want is the equivalent of Java's hashcode. This is the formula used in Java: String hashcode formula where n is the total length of the string, and s[i] is the ith character of the string. Basically what you're doing is:

For each character in the string, multiply it by 31 raised to the power of the difference between the total length of the string and the index of the current character in the string minus 1.

ubadub
  • 3,571
  • 21
  • 32
5

You're just not thinking about the problem the right way. Let's start with something you are more familiar with; decimal, or base 10 numbers.

The number 7158 is composed of:

  7 x 10 ^ 3
 +1 x 10 ^ 2
 +5 x 10 ^ 1
 +8 x 10 ^ 0

To represent the number in base X you replace the 10 with X.

So now the question becomes how to convert between bases. The simplest solution is to divide by X repeatedly - each time convert the remainder to base X and add it to the front of your output string until there's nothing left, then use the integer part of the division for the next iteration:

7158 /35 = 204 remainder 18
204 / 35 = 5 remainder 29
5 / 35 = 0 remainder 5

So in base 35, the decimal number 7158 is represented by the symbols you choose for [5][29][18]

If you try the example above yourself by multiplying the fractional part of each division by 35 you may get some results which are not integers - computers work in binary (and are automatically doing the base conversion) and only work with a fixed number of digits - I.e. depending on how you do the calculation you may need to round of by around 0.00001 to get the remainder as an integer.

symcbean
  • 47,736
  • 6
  • 59
  • 94
  • @cFreed I think, since the title was "Is there a simple way to convert a unique string of characters into a unique number in javascript?" and the point of SO questions is to be useful to all people everywhere who might have a similar problem, answers that generally address this problem are ok. Hopefully fundamentally I can use this and other answers to figure out the solution with my specific character set. But I wouldn't want to force it to be so specific that the question and answers became useful only to me. –  Mar 06 '16 at 01:22
  • @Viziionary I agree (look at comments under Oriol's answer). It's whay I deleted my comments under the other answers too. – cFreed Mar 06 '16 at 01:39
4

You could multiply by the position power:

function StringAdder(obj){
  this.vals = obj;
  this.add = function(str){
    var s = str.split(''), l = s.length, p = Math.pow(10, (l-1)), r = 0;
    for(var i=0; i<l; i++){
      r += this.vals[s[i]]*p; p = p/10;
    }
    return r;
  }
}
var vals = {
  A: 1,
  6: 32,
  H: 8,
  J: 10,
  9: 35,
  2: 28,
  B: 2
}
var sa = new StringAdder(vals);
console.log(sa.add('A6HJ92B')); console.log(sa.add('HJA6B29'));
StackSlave
  • 10,613
  • 2
  • 18
  • 35