2

I am building a PHP site where the URL structure is like this:

http://www.domain.com/list/X/

X can be any integer from 1 up to ~50m (the number of records we have).

Due to the nature of the site, there is an internal need to make it difficult for people to automate extraction of the data by incrementing the URLs like this:

http://www.domain.com/list/1/
http://www.domain.com/list/2/
http://www.domain.com/list/3/
...
http://www.domain.com/list/50000000/

So I was thinking of replacing X with some sort of random string, and then doing an internal lookup in the backend to retrieve the record's integer ID.

What I thought of doing at first was to build a table with 50m rows that maps integers to a random 12-character string.

But to keep things more efficient, I thought of appending the integer to a private key, encrypting that and using the encrypted string in place of X. Then I only need to decrypt the string to retrieve the integer.

Can anyone recommend a method to do this in PHP that is:

1) Fast
2) Produces URL-friendly characters
3) Results in a short-ish string

I am not concerned with watertight security (it's just to deter hobbyists).

Thanks!

R C
  • Have you even made an attempt at implementing it yourself in a sandbox? – Daryl Gill Oct 21 '15 at 09:43
  • No, it's all in my head at the moment. The actual code will be extremely minimal (just a few lines). I was just looking for advice on a recommended method, whether something like `mcrypt` or similar. – R C Oct 21 '15 at 09:45

3 Answers

5

Don't overcomplicate it. Your current ids are guessable; fix that by replacing them with unguessable ids. You get unguessable ids by generating truly random numbers, not by obfuscating existing non-random numbers. UUIDs are perfect for this purpose. Simply add a new column to your records table that stores such a UUID; perhaps even consider replacing your integer IDs with UUIDs outright.

PECL offers a uuid package, but there are other pure PHP implementations as well.
You may also simply generate a random value from openssl_random_pseudo_bytes and bin2hex or base64_encode it.
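As a minimal sketch of that second suggestion, assuming PHP 7 or later (where the built-in `random_bytes` can stand in for `openssl_random_pseudo_bytes`), the following generates a 128-bit random identifier and encodes it either as hex or as URL-safe base64. The function names `randomHexId` and `randomUrlSafeId` are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helpers; the names are illustrative, not from any library.

// 16 random bytes (128 bits) encoded as 32 lowercase hex characters.
function randomHexId(int $bytes = 16): string
{
    return bin2hex(random_bytes($bytes));
}

// The same amount of randomness as URL-safe base64:
// '+' and '/' become '-' and '_', and the '=' padding is dropped.
function randomUrlSafeId(int $bytes = 16): string
{
    return rtrim(strtr(base64_encode(random_bytes($bytes)), '+/', '-_'), '=');
}

echo randomHexId(), "\n";     // 32 hex characters
echo randomUrlSafeId(), "\n"; // 22 URL-safe characters
```

The base64 variant is shorter for the same number of random bytes, which may matter if the string length in the URL is a concern.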

deceze
  • The problem with this in my case is that the supplier of the source data uses an integer ID, and every time there is an update to the source data of any kind, the original integer ID remains constant and is our sole anchor for maintaining data consistency. – R C Oct 21 '15 at 09:50
  • Well then, keep both ids around, that shouldn't be any issue. – deceze Oct 21 '15 at 09:50
  • Which means I still have to map an integer to a randomly-generated string, which was my first MO as per my question above. :) I was thinking that it would be easier to have one random string (a private key), and just use that to encrypt/decrypt, without having to have two whole columns of 50m IDs. – R C Oct 21 '15 at 09:51
  • Generating those 50m ids is just a one-time operation, and from there it's all very simple and foolproof. Depending on your database, it may support UUIDs natively and can generate those very easily and quickly. Even if you have to go through PHP to update each record, that shouldn't be such a big deal at all. – deceze Oct 21 '15 at 09:55
1

Due to the nature of the site, there is an internal need to make it difficult for people to automate extractions of the data by incrementing the URLs

Most importantly, use access controls and rate-limiting.

What I thought of doing at first was to build a table with 50m rows that maps integers to a random 12-character string.

Good idea. I highly recommend that.
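A minimal sketch of such a mapping, assuming PHP 7.1 or later and using in-memory arrays in place of the real database table. It relies on the fact that 9 random bytes base64-encode to exactly 12 characters with no padding. The class `TokenMap` and its methods are hypothetical; in practice the arrays would be a table with a unique index on the token column.

```php
<?php
// Hypothetical in-memory stand-in for an id <-> token table.
final class TokenMap
{
    /** @var array<string,int> token => integer id */
    private $idByToken = [];
    /** @var array<int,string> integer id => token */
    private $tokenById = [];

    // Return the existing token for $id, or create a new unique one.
    public function tokenFor(int $id): string
    {
        if (isset($this->tokenById[$id])) {
            return $this->tokenById[$id];
        }
        do {
            // 9 bytes = 72 bits, which base64-encodes to exactly 12 characters.
            $token = strtr(base64_encode(random_bytes(9)), '+/', '-_');
        } while (isset($this->idByToken[$token])); // retry on an (unlikely) collision

        $this->idByToken[$token] = $id;
        $this->tokenById[$id] = $token;
        return $token;
    }

    // Look up the integer id for a token taken from the URL; null if unknown.
    public function idFor(string $token): ?int
    {
        return $this->idByToken[$token] ?? null;
    }
}

$map = new TokenMap();
$token = $map->tokenFor(12345);
var_dump($map->idFor($token));        // int(12345)
var_dump($map->idFor('not-a-token')); // NULL
```

Because the token carries no information about the integer, there is nothing for an attacker to decrypt or tamper with; an unknown token simply fails the lookup.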

But to keep things more efficient, I thought of appending the integer to a private key, encrypting that and using the encrypted string in place of X.

Not only is that not more efficient, you're increasing the attack surface of your application for very little benefit. Read the comprehensive guide to encrypting URL parameters for a detailed explanation.

Scott Arciszewski
-1

EDIT: DO NOT USE THIS; IT IS INSECURE.

I have used this in the past; I think I originally found it here on Stack Overflow somewhere.

class EncryptClass
{
    private $crypt_password;# = 'SOME PASSWORD';
    private $crypt_salt = 'SOME SALT';


    public function password($pWord)
    {
        $this->crypt_password = $pWord;
        return true;
    }

    public function encrypt($decrypted) 
    { 
        $pass = $this->crypt_password;
        $salt = $this->crypt_salt;

        // Build a 256-bit $key which is a SHA256 hash of $salt and $password.
        $key = hash('SHA256', $salt . $pass, true);
        // Build $iv and $iv_base64.  We use a block size of 128 bits (AES compliant) and CBC mode.  (Note: ECB mode is inadequate as IV is not used.)
        srand(); $iv = mcrypt_create_iv(mcrypt_get_iv_size(MCRYPT_RIJNDAEL_128, MCRYPT_MODE_CBC), MCRYPT_RAND);
        if (strlen($iv_base64 = rtrim(base64_encode($iv), '=')) != 22) return false;
        // Encrypt $decrypted and an MD5 of $decrypted using $key.  MD5 is fine to use here because it's just to verify successful decryption.
        $encrypted = base64_encode(mcrypt_encrypt(MCRYPT_RIJNDAEL_128, $key, $decrypted . md5($decrypted), MCRYPT_MODE_CBC, $iv));
        // We're done!
        return urlencode($iv_base64 . $encrypted);
    } 

    public function decrypt($encrypted) 
    {
        $pass = $this->crypt_password;
        $salt = $this->crypt_salt;

        $encrypted = rawurldecode($encrypted);

        // Build a 256-bit $key which is a SHA256 hash of $salt and $password.
        $key = hash('SHA256', $salt . $pass, true);
        // Retrieve $iv which is the first 22 characters plus ==, base64_decoded.
        $iv = base64_decode(substr($encrypted, 0, 22) . '==');
        // Remove $iv from $encrypted.
        $encrypted = substr($encrypted, 22);
        // Decrypt the data.  rtrim won't corrupt the data because the last 32 characters are the md5 hash; thus any \0 character has to be padding.
        $decrypted = rtrim(mcrypt_decrypt(MCRYPT_RIJNDAEL_128, $key, base64_decode($encrypted), MCRYPT_MODE_CBC, $iv), "\0\4");
        // Retrieve $hash which is the last 32 characters of $decrypted.
        $hash = substr($decrypted, -32);
        // Remove the last 32 characters from $decrypted.
        $decrypted = substr($decrypted, 0, -32);
        // Integrity check.  If this fails, either the data is corrupted, or the password/salt was incorrect.
        if (md5($decrypted) != $hash) return false;
        // Yay!
        return $decrypted;
    }

} // class EncryptClass

Then use it like this:

$cr = new EncryptClass;
$cr->password('A PASSWORD');

// to encrypt
$crypt = $cr->encrypt('The thing you want to crypt');

// to decrypt
$decrypt = $cr->decrypt($crypt);

Hope it helps.

Joe
  • Please don't use this code. `MCRYPT_RAND` is a blight (use urandom!), hash-then-encrypt is actually worse than MAC-then-encrypt, but you want encrypt-then-MAC. Also, you aren't authenticating the (predictable) IV. Look at what [defuse/php-encryption](https://github.com/defuse/php-encryption) does instead. – Scott Arciszewski Oct 21 '15 at 15:25
  • (Also, as I said in my answer, encryption is not the right tool for this job anyway. Your code just happens to be insecure.) – Scott Arciszewski Oct 21 '15 at 15:40
  • Damn, painful downvote. But it taught me a lesson: be really sure about the code before you post it as an answer! I thought it was OK. Thanks for the "defuse/php-encryption" tip. What makes the current class insecure? – Joe Oct 21 '15 at 16:04
  • http://www.thoughtcrime.org/blog/the-cryptographic-doom-principle/ + using a predictable IV for CBC mode -> active attackers can trivially tamper with the encryption to produce invalid results – Scott Arciszewski Oct 21 '15 at 16:07