2

I have read around 5-10 different posts on this subject and none of them gives a clear example; they only explain the backstory.

  1. I have a MySQL database with records numbered from "1" to "500000"
  2. I want the URLs to be based on these record ID numbers
  3. I want the URL code to stay at a constant length of 3-5 characters

Example:

http://wwwurl.com/1 would be http://wwwurl.com/ASd234s, and likewise http://wwwurl.com/5000000 would be http://wwwurl.com/Y2v0R4r

Can I get a clear example of a function to make this work? Thanks.

Yoshi
basickarl
  • I assume you don't want those "urls" to be predictable/sequential .. right? – Aziz Apr 24 '12 at 09:40
  • Another question: Do you want to store those mappings (number -> url) in the database? or do you want the urls to be calculated from the number using some function? – Aziz Apr 24 '12 at 09:42
  • A third question: what part are you facing difficulties with? is it the generation of those URLs and numbers? or is it the handling of the URL requests using PHP? – Aziz Apr 24 '12 at 09:43
  • You may want to look at some tutorials on how to create a URL shortener in PHP. Here is a good one: http://devlup.com/programming/php/create-url-shortener-php/853/ – Aziz Apr 24 '12 at 11:00
  • But your examples have 7 characters? – symcbean Apr 24 '12 at 12:09
  • 1) Doesn't matter! 2) Yes, stored! 3) Getting a randomized URL that will not exceed 5 chars! 4) Will do! 5) Was only an example! I want a YouTube-like URL shortener, so to speak! – basickarl May 18 '12 at 08:33
  • Thank you Aziz you answered my question with that link! – basickarl May 18 '12 at 08:39

2 Answers

0

A very simple example: use e.g. substr(md5($id), 10, 5), where $id is your 1-500000 record ID number. The third argument of substr is a length, so this takes 5 characters starting at position 10 (you could equally start at position 24, etc.). The probability of two IDs producing the same 5 characters out of the 32-character hash ought to be small, though uniqueness is not guaranteed...

It would also be better to save the ID <-> hash mappings in a DB table, so that the relevant record can be found easily from the URL.

The whole source code (hash creation, URL rewriting, saving the mappings, and retrieving a record based on the URL) is a fairly complex problem that can be implemented in thousands of variations. It depends mainly on the programmer's skills and experience, and on the system it is being built into...
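As a rough sketch of the hash-plus-mapping idea above: the code below derives a 5-character code from md5 and retries with a different salt when the code is already taken. The function name `makeCode`, the `:attempt` salt, and the in-memory `$taken` array are assumptions made for illustration; in a real setup `$taken` would be a DB table with a UNIQUE index on the code column.

```php
<?php
// Hypothetical helper: derive a 5-character code from md5($id . ':' . $attempt),
// retrying with the next attempt number whenever the code is already in use.
// $taken is an in-memory stand-in for the ID <-> hash mapping table.
function makeCode(int $id, array &$taken): string
{
    for ($attempt = 0; $attempt < 1000; $attempt++) {
        $code = substr(md5($id . ':' . $attempt), 10, 5);
        if (!isset($taken[$code])) {
            $taken[$code] = $id;   // store the code -> ID mapping
            return $code;
        }
    }
    throw new RuntimeException("No free code found for ID $id");
}

$taken = [];
$codes = [];
for ($id = 1; $id <= 2000; $id++) {
    $codes[$id] = makeCode($id, $taken);
}
echo count($taken), " unique codes generated", PHP_EOL;
```

Looking up a record from a URL is then a single lookup of the code in the mapping, which is why storing the mapping is simpler than trying to invert the hash.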

shadyyx
  • Using a small substring of MD5 (5 characters) will generate TOO MANY collisions. A substring of 5 hex digits has about 1M possible values, and the question suggests 500K records (about half the number of possibilities). The collisions will just be too many. Check http://stackoverflow.com/questions/4681913/substr-md5-collision – Aziz Apr 24 '12 at 10:17
  • Of course this could be enhanced; I gave only a very primitive example. I didn't know the collision probability was so high, though, gosh... Also, when generating hashes, some check should be done that each newly generated hash is actually unique. – shadyyx Apr 24 '12 at 10:32
  • Or using a TEA algorithm could help, like here: http://en.wikipedia.org/wiki/Tiny_Encryption_Algorithm (and a PHP implementation: http://www.php-einfach.de/sonstiges_generator_xtea.php). – shadyyx Apr 24 '12 at 10:36
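
To put a rough number on the collision concern raised in the comments above, here is a small sketch. It assumes md5 output behaves like a uniform random choice among the 16^5 possible 5-character hex strings; the function name `expectedDistinct` is made up for this example.

```php
<?php
// Expected number of distinct codes when $n IDs are mapped uniformly at random
// onto $m possible codes: m * (1 - (1 - 1/m)^n).
function expectedDistinct(int $n, int $m): float
{
    return $m * (1.0 - pow(1.0 - 1.0 / $m, $n));
}

$m = 16 ** 5;     // number of possible 5-character hex strings (about 1M)
$n = 500000;      // number of records in the question
$distinct  = expectedDistinct($n, $m);
$colliding = $n - $distinct;   // IDs whose code is already used by another ID

printf("possible codes: %d\n", $m);
printf("expected distinct codes: %.0f\n", $distinct);
printf("expected IDs sharing a code: %.0f\n", $colliding);
```

Under that assumption, roughly one in five IDs would land on a code that is already taken, so a uniqueness check or a stored mapping is not optional with this approach.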
0

To reduce the id number to a shorter string, convert it to base 35:

 $short_id=base_convert($id, 10, 35);
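
For illustration, a minimal round trip with plain base conversion and no obfuscation; the variable names are just placeholders:

```php
<?php
// Convert the largest id from the question to base 35 and back again.
$id    = 500000;
$short = base_convert((string)$id, 10, 35);   // 'bn5p'
$back  = (int)base_convert($short, 35, 10);   // 500000
echo $short, ' ', $back, PHP_EOL;
```

Since 500000 needs only 4 base-35 characters, every id in the 1-500000 range fits in 4 characters or fewer.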

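If you want to make it more difficult to predict what the sequence is, xor the id with a secret key before converting, and pad the result out to a fixed length:

 function shortcode($id)
 {
   // Secret key below 2^19 (524288); ids up to 500000 also fit in 19 bits,
   // so the xor-ed value stays below 35^4 and never needs more than 4 characters.
   $key = 0x5A3C9;
   // Xor the number itself rather than its characters, so the result is
   // always a valid base-35 string and the step can be undone.
   $short_id = base_convert((string)((int)$id ^ $key), 10, 35);
   return str_pad($short_id, 4, '0', STR_PAD_LEFT); // 500000 on its own is 'bn5p'
 }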

And to get the original id back, just reverse the process: apply the same steps in the opposite order with the same key (xor is its own inverse).
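
As a sketch of what reversing could look like, assuming the xor is applied to the integer id with a numeric key before the base-35 conversion. The names `ID_KEY`, `encodeId` and `decodeId` and the key value are hypothetical, chosen only for this example:

```php
<?php
// Hypothetical secret key; any integer below 2^19 (524288) keeps ids up to 500000
// within 4 base-35 characters after the xor.
const ID_KEY = 0x5A3C9;

// Encode: xor the numeric id with the key, convert to base 35, pad to 4 characters.
function encodeId(int $id): string
{
    return str_pad(base_convert((string)($id ^ ID_KEY), 10, 35), 4, '0', STR_PAD_LEFT);
}

// Decode: convert from base 35 back to an integer, then xor with the same key.
function decodeId(string $code): int
{
    return ((int)base_convert($code, 35, 10)) ^ ID_KEY;
}

foreach ([1, 42, 500000] as $id) {
    $code = encodeId($id);
    echo $id, ' -> ', $code, ' -> ', decodeId($code), PHP_EOL;
}
```

Note that this only obscures the sequence; anyone who learns the key, or sees enough consecutive codes, can work out the ids. If the codes must be unguessable, storing random codes in the database is the safer route.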

symcbean