1

Here is the problem, where I need to transform an ID (defined as a long integer) to a smaller alfanumeric identifier. The details are the following:

  • Each individual on the problem as an unique ID, a long integer of size 13 (something like 123123412341234).
  • I need to generate a smaller representation of this unique ID, a alfanumeric string, something like A1CB3X. The problem is that 5 or 6 character length will not be enough to represent such a large integer.
  • The new ID (eg A1CB3X) should be valid in a context where we know that only a small number of individuals are present (less than 500). The new ID should be unique within that small set of individuals.
  • The new ID (eg A1CB3X) should be the result of a calculation made over the original ID. This means that taking the original ID elsewhere and applying the same calculation, we should get the same new ID (eg A1CB3X).
  • This calculation should occur when the individual is added to the set, meaning that not all individuals belonging to that set will be know at that time.

Any directions on how to solve such a problem?

BrunoMarques
  • 557
  • 2
  • 11
  • 1
    Possible duplicate of [Convert a number to the shortest possible character string while retaining uniqueness](http://stackoverflow.com/questions/2557501/convert-a-number-to-the-shortest-possible-character-string-while-retaining-uniqu) – Martin Gottweis Jun 06 '16 at 11:55
  • Which characters are allowed to be used in the generated alfanumeric string? – MrSmith42 Jun 06 '16 at 12:29
  • It should easy to use for the individual, so something like A-Z and 0-9 would be great. – BrunoMarques Jun 06 '16 at 15:34

4 Answers4

1

Assuming that you don't need a formula that goes in both directions (which is impossible if you are reducing a 13-digit number to a 5 or 6-character alphanum string):

If you can have up to 6 alphanumeric characters that gives you 366 = 2,176,782,336 possibilities, assuming only numbers and uppercase letters.

To map your larger 13-digit number onto this space, you can take a modulo of some prime number slightly smaller than that, for example 2,176,782,317, the encode it with base-36 encoding.

alphanum_id = base36encode(longnumber_id % 2176782317)

For a set of 500, this gives you a

2176782317P500 / 2176782317500 chance of a collision

(P is permutation)

samgak
  • 23,944
  • 4
  • 60
  • 82
1

Best option is to change the base to 62 using case sensitive characters

If you want it to be shorter, you can add unicode characters. See below.

Here is javascript code for you: https://jsfiddle.net/vewmdt85/1/

function compress(n) {
    var symbols = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïð'.split('');
    var d = n;
    var compressed = '';

    while (d >= 1) {
        compressed = symbols[(d - (symbols.length * Math.floor(d / symbols.length)))] + compressed;
        d = Math.floor(d / symbols.length);
    }

    return compressed;
}

$('input').keyup(function() {
        $('span').html(compress($(this).val()))
})

$('span').html(compress($('input').val()))
Martin Gottweis
  • 2,721
  • 13
  • 27
0

How about using some base-X conversion, for example 123123412341234 becomes 17N644R7CI in base-36 and 9999999999999 becomes 3JLXPT2PR?

0

If you need a mapping that works both directions, you can simply go for a larger base.

Meaning: using base 16, you can reduce 1 to 16 to a single character. So, base36 is the "maximum" that allows for shorter strings (when 1-1 mapping is required)!

GhostCat
  • 137,827
  • 25
  • 176
  • 248