I have a bunch of items in my database. Each is assigned a unique ID. I want to shorten this ID and display it on the page, so that if a user needs to contact us (over the phone) regarding a particular item, he can give us the shortened ID rather than a really big number, similar to the SKU on sites like NCIX. So I was thinking about encoding it in base 36. The problem with that, however, is that characters like 1, l and I all look kind of the same. So I was thinking about eliminating the look-alikes. Is this a good idea, or should I just use a really legible font?
-
I'm not sure I understand, you want 10231 to become IO23I? If you use a good reading font, the characters will be distinguishable. – Skurmedel Jun 21 '10 at 07:53
-
For what purpose do you want legible characters? If it's the password for a website then I would suggest that you don't spend too much time fretting, many people either log in once (copy+paste) and change it to their commonly used password, or have the browser remember it for them. Personally I randomly generate credentials for every new site and store them in an encrypted filesystem. There's over 130 in there... – MattH Jun 21 '10 at 08:50
-
@Skurmedel: Well, no... I'd argue that's less readable. Here... let me reword the question and explain it a bit better. – mpen Jun 22 '10 at 01:04
-
@MattH: I left that bit out in the first time 'round because I didn't want to bore everyone with a big story... but I guess it's important :) Updated the question. – mpen Jun 22 '10 at 01:12
-
I think you totally misunderstood me; yes, the obfuscated version is less readable. What I mean is that if reading the original identifier is a problem, the font is probably the culprit. The only reason I see to eliminate the look-alikes is if you want to be extra sure that it is not misunderstood if it is displayed somewhere out of your reach, like an email program or on a written piece of paper. – Skurmedel Jun 22 '10 at 10:19
-
@Skurmedel: Well, it could be. I'm thinking that people are going to write up their own invoices for things related to the site, and then they might want to scribble down the ID # to cross-reference it back to the site. Or maybe I should just leave it as an integer and forget about it :p I figure they'll only creep up to 5 digits after the first year and it'll be a few years before they hit 6. – mpen Jun 23 '10 at 01:36
4 Answers
Yes, you should eliminate sources of confusion, because if a mistake can be made, someone will make it. It is very easy to confuse 0 with O, and I with l or 1, so you should not use both members of any such pair. That's easy to arrange: since you won't use 3 characters (I, L and O), encode the number in base 36 - 3 = 33, whose digits are 0-9 and A-W, and then convert the result with
SKU.replace('I','X').replace('L','Y').replace('O','Z')
Inversely, when given such a code and before doing int(SKU, 33), you have to map X, Y and Z back to the confusing characters. Before that, though, if (as expected) you are given L or I by mistake, replace it with 1, and if given O, replace it with 0. Both steps have to happen in a single pass, so that a restored I is not turned into 1 afterwards. E.g. use SKU.translate() with
string.maketrans('LIOXYZ','110ILO')
(in Python 3 this is str.maketrans).
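Putting the two directions together, here is a minimal sketch in Python 3 of the scheme described above. The names `encode_id` and `decode_id` are made up for illustration, and the sketch assumes IDs are non-negative integers and that codes are displayed in uppercase.

```python
import string

# Digit characters in the order int(text, 33) expects: 0-9, then A-W.
DIGITS = string.digits + string.ascii_uppercase

# Display form: swap the look-alike letters for the three unused ones.
TO_DISPLAY = str.maketrans("ILO", "XYZ")
# Input form, applied in one pass: a mistyped L or I becomes 1,
# a mistyped O becomes 0, and X/Y/Z map back to I/L/O.
FROM_DISPLAY = str.maketrans("LIOXYZ", "110ILO")


def encode_id(n: int) -> str:
    """Encode a non-negative integer as a base-33 code without I, L or O."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, 33)
        out.append(DIGITS[r])
    return "".join(reversed(out)).translate(TO_DISPLAY)


def decode_id(code: str) -> int:
    """Decode a code typed by a user, tolerating I/L/O and lowercase."""
    return int(code.strip().upper().translate(FROM_DISPLAY), 33)
```

With this sketch, `encode_id(10231)` gives `9D1`, and `decode_id("1O")` and `decode_id("10")` both give 33, since the stray O is read as a zero.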

-
That's a clever way of doing it. For a second I thought you didn't have a 1:1 mapping, but I guess that's why you subtracted the last 3 letters ;) – mpen Jul 08 '10 at 22:52
I'm assuming the original ID is numeric. We've had good results from z-base-32 with a similar scenario. We've been using it since April 2009.
I particularly liked the encoding's goals of minimizing transcription errors, through removing confusing letters from the alphabet, and brevity, as shorter identifiers are easier to use.
The encoding orders the alphabet so that the more commonly occurring characters are those that are easier to read, write, speak and remember. Lower case is used as it's easier to read.
I asked this similar question before we decided to use z-base-32.
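For a numeric ID, the idea can be sketched roughly as below. Note the assumptions: z-base-32 proper is defined over bit/octet strings, whereas this sketch only borrows its alphabet for a plain positional (base 32) encoding of an integer, so the output will not necessarily match what a real z-base-32 library produces. The alphabet string is quoted from the z-base-32 description and is worth checking against it, and the function names are made up.

```python
# Alphabet as given in the z-base-32 description (omits 0, l, v and 2).
ZBASE32 = "ybndrfg8ejkmcpqxot1uwisza345h769"
_INDEX = {ch: i for i, ch in enumerate(ZBASE32)}


def zb32_encode(n: int) -> str:
    """Write a non-negative integer in base 32 using the z-base-32 alphabet."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ZBASE32[0]
    out = []
    while n:
        n, r = divmod(n, 32)
        out.append(ZBASE32[r])
    return "".join(reversed(out))


def zb32_decode(code: str) -> int:
    """Inverse of zb32_encode; raises KeyError on characters outside the alphabet."""
    n = 0
    for ch in code.strip().lower():
        n = n * 32 + _INDEX[ch]
    return n
```

A five-digit decimal ID fits in four characters this way, since 32^4 is a little over a million.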

-
Yes, it's a numeric ID. Didn't read through the whole paper, but it sounds promising and I agree with their decisions and goals. Might have to try and implement that later. – mpen Sep 19 '10 at 23:43
Use a legible font.

-
Also consider using lowercase letters: `io` looks less like `10` than `IO` does. – dan04 Jun 24 '10 at 04:42
-
@dan04: sure, but `l` looks like `1`. There's no winning. Lowercase `i` is more distinguishable than uppercase `I`, but uppercase `L` is more distinguishable than lowercase `l`. – mpen Jul 08 '10 at 22:50
-
...but you said that you support your customers by phone. So it's not only a matter of fonts but also of character sound? ...or time: *Alfa, Bravo, Charlie,...* ;-) – Wolf Jan 23 '14 at 15:39
-
@Wolf: Yeah.. having something they could click on to give them the phonetics they could repeat over the phone might help. This project fell through, so.... it never really mattered in the end. – mpen Jan 23 '14 at 16:36
We had a similar situation in a regular app many years ago, at a company I worked for. There was an ID, base 36 (0-9a-z) that often had to be communicated over the phone. That was an application running on a Unix server and viewed on serial terminals (not relevant, just part of the story :).
Our solution was that whenever the user was on that field and pressed F2, a small window popped up showing the radio code for the field: “a9vg5” would display “alpha niner victor golf five”, which the user would just read aloud.
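The lookup behind such a pop-up is tiny. A rough sketch in Python follows; this is not the original application's code, the name `spell_out` is made up, and the word list is the usual radio alphabet with the digit words chosen to match the example above.

```python
import string

# One radio-code word per letter a-z, in alphabetical order.
LETTER_WORDS = (
    "alpha bravo charlie delta echo foxtrot golf hotel india juliett "
    "kilo lima mike november oscar papa quebec romeo sierra tango "
    "uniform victor whiskey x-ray yankee zulu"
).split()
# One word per digit 0-9 ("niner" as in the example above).
DIGIT_WORDS = "zero one two three four five six seven eight niner".split()

RADIO = dict(zip(string.ascii_lowercase, LETTER_WORDS))
RADIO.update(zip(string.digits, DIGIT_WORDS))


def spell_out(code: str) -> str:
    """Return the words a user would read aloud for a base-36 ID."""
    return " ".join(RADIO[ch] for ch in code.lower())


print(spell_out("a9vg5"))  # alpha niner victor golf five
```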
When the application was developed, I was inclined to display the ID base 64 encoded, with capitals plus dot and slash, and to use different radio-code words for the capitals, but the designated analyst disagreed. You could look up different words on Wikipedia or be creative.
PS, a clarification: although it's not clear from the way I wrote it, the analyst disagreed for a good reason, since one has to think of both sides of the communication; the user just reads, but the other side on the phone has to remember or look up that e.g. delta==d and Dalton==D.

-
A neat solution... but I'm thinking our users might want to communicate these codes to each other too. Our site has "shipments" which they might want to pair with internal invoices... so they might have to communicate that to their accountant, or... maybe to another party under some circumstances. Giving them a table like this would confuse the heck out of them, I'm sure :) – mpen Jul 21 '10 at 14:13