I'm making a function that converts a number into a string with predefined characters. Original, I know. I started it because it seemed like a fun thing to do on my own. Well, it's frustrating and not fun.

I want it to be like binary in that any character is worth more than its right neighbour. Binary is inefficient because every bit has only 1 positive value. Xnary is efficient, because a 'bit' is never 0.

The character set (in this case): A - Z.

A = 1 ..
Z = 26
AA = 27 ..
AZ = 52
BA = 53 ..
BZ = 2 * 26 (B) + 26 * 1 (Z) = 78... Right?
ZZ = 26 * 26 (Z) + 26 * 1 (Z) = 702?? Right??
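
As a quick check of the values listed above, here is a minimal sketch in JavaScript. The function name `xnaryValue` is made up for illustration, and the sketch assumes upper-case A-Z input only. It treats A..Z as digits 1..26 and adds them up positionally, exactly as in the BZ and ZZ lines:

```javascript
// Hypothetical helper: value of a string in the scheme A = 1 ... Z = 26.
function xnaryValue(s) {
  let value = 0;
  for (const ch of s) {
    const digit = ch.charCodeAt(0) - 64; // 'A' has char code 65, so A -> 1
    value = value * 26 + digit;          // every step left is worth 26x more
  }
  return value;
}

console.log(xnaryValue("BZ")); // 78
console.log(xnaryValue("ZZ")); // 702
```

So yes, under this reading BZ = 78 and ZZ = 702.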

I found this here, but there AA is the same as A and AAA. The result of the function is never AA or AAA.

The string A is different from AA and AAA however, so the number should be too. (Unlike binary 1, 01, 001 etc.) And since a longer string is always more valuable than a shorter... A < AA < AAA.

Does this make sense? I've tried to explain it before and have failed. I've also tried to make it before. =)

The most important thing: since A < AA < AAA, the value of 'my' ABC is higher than the value of the other script. Another difference: my script doesn't exist, because I keep failing.

I've tried with this algorithm:

N = 1000, Size = 3 (because log_26(1000) = 2.x), so use 676, 26 and 1 for positions:
N = 1000
P0 = 1000 / 676 = 1.x = 1 = A
N = 1000 - 1 * 676 = 324
P1 = 324 / 26 = 12.x = 12 = L
N = 324 - 12 * 26 = 12
P2 = 12 / 1 = 12 = L
1000 => ALL

Sounds fair? Apparently it's crap. Because:

N = 158760, Size = 4, so use 17576, 676, 26 and 1
P0 = 158760 / 17576 = 9.x = 9 = I
N = 158760 - 9 * 17576 = 576
P1 = 576 / 676 = 0.x = 0 <<< OOPS

If 1 is A (the very first of the xnary), what's 0? Impossible is what it is.

So this one is a bust. The other one (on jsFiddle) is also a bust, because A != AA != AAA and that's a fact.

So what have I been missing for a few long nights?

Oh BTW: if you don't like numbers, don't read this.

PS. I've tried searching for similar questions but none are similar enough. The one referenced is the most similar, but 'faulty' IMO.

Rudie
  • Whenever you need to put 0 (zero), you can decrease the left digit and place a Z instead of the 0. Ex: to represent 52, you can start with B0. Then decrement B (to A) and replace 0 with Z. Result: AZ. Similarly, in your example you can put P0 = I-1 = H and P1 = Z. Things can get complicated if there are many zeros that need replacing, but follow the same strategy and it shouldn't be hard. – ElKamina Jan 10 '12 at 22:02
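
The borrowing strategy from the comment above can be sketched roughly as follows. The function name `toXnaryByBorrowing` is hypothetical, and the sketch assumes n >= 1 and the A = 1 ... Z = 26 scheme from the question. It first takes ordinary base-26 digits, then walks from right to left, turning every 0 into 26 (Z) and borrowing 1 from the digit to its left:

```javascript
// Hypothetical sketch of "replace 0 with Z and decrement the left digit".
function toXnaryByBorrowing(n) {
  // Ordinary base-26 digits (0..25), most significant first.
  const digits = [];
  for (let r = n; r > 0; r = Math.floor(r / 26)) {
    digits.unshift(r % 26);
  }
  // Right to left: a digit that is 0 (or -1 after an earlier borrow)
  // gets 26 added, and the digit to its left pays for it.
  for (let i = digits.length - 1; i > 0; i--) {
    if (digits[i] <= 0) {
      digits[i] += 26;
      digits[i - 1] -= 1;
    }
  }
  // The leading digit may have been borrowed down to 0; drop it.
  if (digits[0] === 0) digits.shift();
  // 1 -> 'A', ..., 26 -> 'Z'
  return digits.map(d => String.fromCharCode(64 + d)).join('');
}

console.log(toXnaryByBorrowing(52));     // AZ
console.log(toXnaryByBorrowing(158760)); // HZVD
```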

3 Answers

Also known as Excel column numbering. It's easier if we shift by one, A = 0, ..., Z = 25, AA = 26, ..., at least for the calculations. For your scheme, all that's needed then is a subtraction of 1 before converting to Xnary and an addition of 1 after converting back.

So, with that modification, let's start finding the conversion. First, how many symbols do we need to encode n? Well, there are 26 one-digit numbers, 26^2 two-digit numbers, 26^3 three-digit numbers etc. So the total of numbers using at most d digits is 26^1 + 26^2 + ... + 26^d. That is the start of a geometric series, for which we know a closed form: 26*(26^d - 1)/(26-1). So to encode n, we need d digits if

26*(26^(d-1)-1)/25 <= n < 26*(26^d-1)/25   // remember, A = 0 takes one 'digit'

or

26^(d-1) <= (25*n)/26 + 1 < 26^d

That is, we need d(n) = floor(log_26(25*n/26+1)) + 1 digits to encode n >= 0. Now we must subtract the total of numbers needing at most d(n) - 1 digits to find the position of n in the d(n)-digit numbers, let's call it p(n) = n - 26*(26^(d(n)-1)-1)/25. And the encoding of n is then simply a d(n)-digit base-26 encoding of p(n).
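
A minimal sketch of the encoding just described, in JavaScript. The name `toXnary` is an assumption, not something from the question. Instead of evaluating the logarithm, it finds the d that satisfies the inequality above with an integer loop over the block sizes 26, 26^2, ..., which sidesteps floating-point rounding. It takes N in the question's scheme (A = 1), so it subtracts 1 first:

```javascript
// Hypothetical sketch: N >= 1 in the question's scheme (A = 1, Z = 26, AA = 27).
function toXnary(N) {
  const n = N - 1;      // shift to the A = 0 scheme used in this answer
  let d = 1;            // number of digits, d(n)
  let count = 26;       // how many d-digit strings exist: 26^d
  let offset = 0;       // how many strings are shorter than d digits
  while (n >= offset + count) {
    offset += count;
    count *= 26;
    d++;
  }
  let p = n - offset;   // position among the d-digit strings, p(n)
  // Plain d-digit base-26 encoding of p, with A = 0 ... Z = 25.
  let s = '';
  for (let i = 0; i < d; i++) {
    s = String.fromCharCode(65 + (p % 26)) + s;
    p = Math.floor(p / 26);
  }
  return s;
}

console.log(toXnary(27));   // AA
console.log(toXnary(1000)); // ALL
```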

The conversion in the other direction is then a base-26 expansion followed by an addition of 26*(26^(d-1) - 1)/25.
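
The reverse direction can be sketched the same way. The name `fromXnary` is made up for illustration, and the sketch assumes a non-empty upper-case A-Z string. It does the base-26 expansion with A = 0, adds 26*(26^(d-1) - 1)/25, and finally adds 1 to get back to the question's scheme where A = 1:

```javascript
// Hypothetical sketch: string -> number in the question's scheme (A = 1).
function fromXnary(s) {
  // Base-26 expansion with A = 0 ... Z = 25.
  let p = 0;
  for (const ch of s) {
    p = p * 26 + (ch.charCodeAt(0) - 65);
  }
  // Number of strings shorter than s: 26 + 26^2 + ... + 26^(d-1).
  const d = s.length;
  const offset = 26 * (26 ** (d - 1) - 1) / 25;
  return p + offset + 1; // +1 undoes the shift to A = 0
}

console.log(fromXnary("ALL"));  // 1000
console.log(fromXnary("HZVD")); // 158760
```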

So for N = 1000, we encode n = 999, log_26(25*999/26+1) = log_26(961.5769...) = 2.x, we need 3 digits.

p(999) = 999 - 702 = 297
297 = 0*26^2 + 11*26 + 11
999 = ALL

For N = 158760, n = 158759 and log_26(25*158759/26+1) = 3.66..., we need four digits

p(158759) = 158759 - 18278 = 140481
140481 = 7*26^3 + 25*26^2 + 21*26 + 3
158759 = H        Z         V       D
Thomas Ahle
Daniel Fischer

This appears to be a very standard "implement conversion from base 10 to base N" problem, where N happens to be 26 and you're using letters to represent all digits.

If you have A-Z as a 26ary value, you can represent 0 through (26 - 1), like binary can represent 0 through (2 - 1).

BZ = 1 * 26 + 25 *1 = 51

The analogue would be:

19 = 1 * 10 + 9 * 1 (1/B being the first non-zero character, and 9/Z being the largest digit possible).

You basically have the right idea, but you need to shift it so A = 0, not A = 1. Then everything should work relatively sanely.
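
A minimal sketch of plain base 26 with A = 0, for comparison. The function names `base26Encode` and `base26Decode` are made up here, and upper-case input is assumed. Note that in this reading a leading A acts like a leading zero, so A, AA and AAA all decode to 0, which is the behaviour the question describes for the other script:

```javascript
// Hypothetical sketch: ordinary base 26 with A = 0 ... Z = 25.
function base26Decode(s) {
  let value = 0;
  for (const ch of s) {
    value = value * 26 + (ch.charCodeAt(0) - 65);
  }
  return value;
}

function base26Encode(n) {
  let s = '';
  do {
    s = String.fromCharCode(65 + (n % 26)) + s;
    n = Math.floor(n / 26);
  } while (n > 0);
  return s;
}

console.log(base26Decode("BZ")); // 51
console.log(base26Encode(51));   // BZ
console.log(base26Decode("A"), base26Decode("AA")); // 0 0
```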

James

In the lengthy answer by @Daniel I see a call to log() which is a red flag for performance. Here is a simple way without much complex math:

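function excelize(colNum) {
    // Count the digits (order) and the amount to subtract:
    // sub = 26^0 + 26^1 + ... + 26^(order-1).
    var order = 0, sub = 0, divTmp = colNum;
    do {
        divTmp -= 1;                              // make this digit zero-based
        sub += 26**order;
        divTmp = (divTmp - (divTmp % 26)) / 26;   // integer division by 26
        order++;
    } while(divTmp > 0);

    // toString(26) uses the digits 0-9a-p; shifting by 10 places gives a-z.
    var symbols = "0123456789abcdefghijklmnopqrstuvwxyz";
    var tr = c => symbols[symbols.indexOf(c)+10];
    // Pad with leading zeros so that e.g. 27 becomes "AA" and not "A".
    return Number(colNum-sub).toString(26).padStart(order, '0')
        .split('').map(c=>tr(c)).join('').toUpperCase();
}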

Explanation:

Since this is not plain base 26, we need to subtract 26 to the power of the order for each additional symbol ("digit"). So first we count the order of the resulting number and, at the same time, add up the amount to subtract. Then we subtract that amount, convert the rest to base 26, and shift the symbols to A-Z instead of 0-P.

Ondra Žižka