5

I want to use base36 in a web application I am developing, but since the ID is visible to users in the URL, I want to filter out profanity. Has anyone solved this? Or is this even a real problem?

Does it make sense just to skip numbers in my database sequence?
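For illustration, here is a minimal sketch of what "skipping numbers" could look like: encode each candidate sequence value in base36 and skip it if the result contains a blocked substring. The names `BLOCKLIST`, `to_base36`, `is_clean` and `next_clean_id` are hypothetical, and the two blocklist entries are placeholders, not a real word list.

```python
import string

# Base36 digit set: 0-9 followed by a-z.
DIGITS = string.digits + string.ascii_lowercase

# Hypothetical placeholder entries; a real list would be much longer.
BLOCKLIST = ("darn", "heck")


def to_base36(n: int) -> str:
    """Encode a non-negative integer as a lowercase base36 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n:
        n, rem = divmod(n, 36)
        out.append(DIGITS[rem])
    return "".join(reversed(out))


def is_clean(n: int) -> bool:
    """True if the base36 form of n contains no blocked substring."""
    encoded = to_base36(n)
    return not any(word in encoded for word in BLOCKLIST)


def next_clean_id(n: int) -> int:
    """Return the smallest value greater than n whose base36 form is clean."""
    n += 1
    while not is_clean(n):
        n += 1
    return n
```

Under this sketch the database sequence still hands out plain integers; the application would call something like `next_clean_id` before assigning a public ID, so the gaps only appear where an encoded value would have matched the list.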

Owen Blacker
Chris
  • Obligatory TDWTF reference: [The Automated Curse Generator](http://thedailywtf.com/Articles/The-Automated-Curse-Generator.aspx). – R. Martinho Fernandes Feb 17 '11 at 17:26
  • You mean that, by chance, a base36 ID might come up as `page.php?id=screwyou3234`? – Marc B Feb 17 '11 at 17:26
  • If I'm following, you are passing your ID in base36 in the URL, right? If you don't want profanity in your ID, why not just pass it as base 10 and convert when necessary? Besides that, do you have a reason to work in base36? Most systems (i.e. databases, middle layers, etc.) expect base 10 (well, a base-10 notation anyway; it's all 0s and 1s internally). – Sem Vanmeenen Feb 17 '11 at 17:29
  • @SemVanmeenen I am using the shortened codes because users will sometimes need to enter the ID into a text message sent to the website. – Chris Feb 18 '11 at 19:22

2 Answers

9

Well, rather than try to amass all the swear words possible, just filter out the vowels. That'll leave you plenty of permutations in the space. Admittedly, you've just cut down from base 36 to base 31, but base 31 numbers are valid base 36 numbers assuming the same symbol set (a-z0-9). If that bothers you, replace the five vowels with some other non-magic 7-bit ASCII like !,@,$,% and (.

Granted, you may end up with sh1t and fck, but the profanity is in the mind of the reader.
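A minimal sketch of the vowel-free idea, assuming a 31-symbol alphabet made of the ten digits plus the 21 lowercase consonants. The names `ALPHABET`, `encode` and `decode` are made up for this example.

```python
# Assumed alphabet: digits plus lowercase consonants (a, e, i, o, u removed).
ALPHABET = "0123456789bcdfghjklmnpqrstvwxyz"
BASE = len(ALPHABET)  # 31


def encode(n: int) -> str:
    """Encode a non-negative integer using the vowel-free alphabet."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, rem = divmod(n, BASE)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))


def decode(s: str) -> int:
    """Decode a vowel-free ID back to its integer value."""
    n = 0
    for ch in s.lower():
        # str.index raises ValueError for any character outside the alphabet.
        n = n * BASE + ALPHABET.index(ch)
    return n
```

Note that this packs the alphabet densely, so the strings only use base36 symbols but have to be decoded with `decode`, not with a plain base36 parser. IDs also get slightly longer on average, since each character carries a little less information.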

Script47
x0n
2

Why not just use a full-on randomly generated GUID in hexadecimal? No matter what programming language you're working in, this should be easy to generate. And being represented in hexadecimal, I would imagine the chances of generating something that upsets the easily offendable approach zero.
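As one example of how little code this takes, here is a sketch in Python using the standard library's `uuid` module. The function name `new_hex_id` is hypothetical.

```python
import uuid


def new_hex_id() -> str:
    """Return a random (version 4) GUID as 32 lowercase hex characters."""
    return uuid.uuid4().hex
```

The result only ever contains the characters 0-9 and a-f, at the cost of being 32 characters long.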

nickwesselman
  • Because it's ugly and not-short :P Although even a *much smaller* number still has 'offend-ability that approaches zero'. –  Feb 17 '11 at 18:26
  • If you're actually concerned about ugly URLs, why not use SEO-friendly URLs? e.g. wordpress-style slugs? – nickwesselman Feb 17 '11 at 20:19
  • I need to have short IDs, as some users will send text messages to the site using the IDs. – Chris Feb 18 '11 at 19:21