
I've been using this script for a long time and it works perfectly in 99% of cases. It's easy and clear for users, and I would like to continue using it.

However, once in a while a user tells me that the system rejects their captcha (wrong code) even though the numbers they entered are correct. Each time I have gone over their cookie settings, cleared the cache and so on, but in these cases nothing seems to work.

My question, then: is there anything in the code of this script that would explain a malfunction in exceptional cases?

session_start();

$randomnr = rand(1000, 9999);
$_SESSION['randomnr2'] = md5($randomnr);

$im = imagecreatetruecolor(100, 28);
$white = imagecolorallocate($im, 255, 255, 255);
$grey = imagecolorallocate($im, 128, 128, 128);
$black = imagecolorallocate($im, 0,0,0);

imagefilledrectangle($im, 0, 0, 200, 35, $black);

$font = '/img/captcha/font.ttf';

imagettftext($im, 30, 0, 10, 40, $grey, $font, $randomnr);
imagettftext($im, 20, 3, 18, 25, $white, $font, $randomnr);

// Prevent caching
header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");
header("Cache-Control: no-cache, must-revalidate"); // HTTP/1.1
header("Expires: Sat, 26 Jul 1997 05:00:00 GMT"); // Date in the past
header("Cache-Control: post-check=0, pre-check=0", false);
header("Pragma: no-cache");

header ("Content-type: image/gif");

imagegif($im);
imagedestroy($im);

In my form, I then call this script as the source of the captcha image. After sending the form, the captcha is checked this way:

if(md5($_POST['norobot']) != $_SESSION['randomnr2']) {
    echo 'Wrong captcha!';
}

Please note that session_start(); is called on the form page and the form result page.

If anyone could pinpoint potential error causes in this script, I would appreciate it!

P.S.: I am aware of the drawbacks of captcha scripts. I am aware that certain bots could still read them. I do not wish to use reCAPTCHA, because it is too difficult for my users (different language, and many of them are older). I am also aware that an MD5 hash of a four-digit number is trivial to reverse by brute force.


EDIT


Following the remarks of Ugo Méda, I've been doing some experiments. This is what I've created (simplified for your convenience):

The form

// Insert a random number of four digits into database, along with current time
$query   = 'INSERT INTO captcha (number, created_date, posted) VALUES ("'.rand(1000, 9999).'", NOW(),0)';
$result  = mysql_query($query);

// Retrieve the id of the inserted number
$captcha_uid = mysql_insert_id();

$output .= '<label for="norobot"> Enter spam protection code';
// Send id to captcha script
$output .= '<img src="/img/captcha/captcha.php?number='.$captcha_uid.'" />'; 
// Hidden field with id 
$output .= '<input type="hidden" name="captcha_uid" value="'.$captcha_uid.'" />'; 
$output .= '<input type="text" name="norobot" class="norobot" id="norobot" maxlength="4" required  />';
$output .= '</label>';

echo $output;

The captcha script

$font = '/img/captcha/font.ttf';

connect();
// Find the number associated to the captcha id
$query = 'SELECT number FROM captcha WHERE uid = "'.mysql_real_escape_string($_GET['number']).'" LIMIT 1';
$result = mysql_query($query) or trigger_error(__FUNCTION__.'<hr />'.mysql_error().'<hr />'.$query);
if (mysql_num_rows($result) != 0){          
    while($row = mysql_fetch_assoc($result)){
        $number = $row['number'];
    }
} 
disconnect();

$im     = imagecreatetruecolor(100, 28);
$white  = imagecolorallocate($im, 255, 255, 255);
$grey   = imagecolorallocate($im, 128, 128, 128);
$black  = imagecolorallocate($im, 0,0,0);

imagefilledrectangle($im, 0, 0, 200, 35, $black);
imagettftext($im, 30, 0, 10, 40, $grey, $font, $number);
imagettftext($im, 20, 3, 18, 25, $white, $font, $number);

// Generate the image from the number retrieved out of database
header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");
header("Cache-Control: no-cache, must-revalidate"); // HTTP/1.1
header("Expires: Sat, 26 Jul 1997 05:00:00 GMT"); // Date in the past
header("Cache-Control: post-check=0, pre-check=0", false);
header("Pragma: no-cache");
header ("Content-type: image/gif");

imagegif($im);
imagedestroy($im);

The result of the form

function get_captcha_number($captcha_uid) {
    $query = 'SELECT number FROM captcha WHERE uid = "'.mysql_real_escape_string($captcha_uid).'" LIMIT 1';
    $result = mysql_query($query);
    if (mysql_num_rows($result) != 0){          
        while($row = mysql_fetch_assoc($result)){
            return $row['number'];
        }
    } 
    // Here I would later also enter the DELETE QUERY mentioned above...
}
if($_POST['norobot'] != get_captcha_number($_POST['captcha_uid'])) {
    echo 'Captcha error';
    exit;
}

This works very well, so thanks very much for this solution.

However, I'm seeing some potential drawbacks here. I count at least 4 queries, which feels somewhat resource intensive for what we're doing. Also, if a user reloaded the same page several times (just to be an asshole), the database would quickly fill up. Of course this would all be deleted upon the next form submit, but nonetheless, could you go over this possible alternative with me?

I'm aware that one should generally not encrypt / decrypt. However, since captchas are flawed by nature (because of image readouts of bots), couldn't we simplify the process by encrypting and decrypting a parameter that is being sent to the captcha.php script?

What if we did this (following the encrypt/decrypt instructions of Alix Axel):

1) Encrypt a random four-digit number like so:

$key = 'encryption-password-only-present-within-the-application';
$string = rand(1000,9999);
$encrypted = base64_encode(mcrypt_encrypt(MCRYPT_RIJNDAEL_256, md5($key), $string, MCRYPT_MODE_CBC, md5(md5($key))));

2) Send the encrypted number with a parameter to the image script and store it in a hidden field

<img src="/img/captcha.php?number='.urlencode($encrypted).'" />
<input type="hidden" name="encrypted_number" value="'.$encrypted.'" />

3) Decrypt the number (that was sent via $_GET) inside the captcha script and generate an image from it

$decrypted = rtrim(mcrypt_decrypt(MCRYPT_RIJNDAEL_256, md5($key), base64_decode($encrypted), MCRYPT_MODE_CBC, md5(md5($key))), "\0"); 

4) Decrypt the number on form submit again to compare to user input

$decrypted = rtrim(mcrypt_decrypt(MCRYPT_RIJNDAEL_256, md5($key), base64_decode($encrypted), MCRYPT_MODE_CBC, md5(md5($key))), "\0");
if($_POST['norobot'] != $decrypted) {
    echo 'Captcha error!';
    exit;
}

Agreed, this is a little bit "security-through-obscurity", but it seems to provide some basic security and remains fairly simple. Or would this encrypt/decrypt action be too resource intensive on its own?
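
For reference, the mcrypt extension was deprecated in PHP 7.1 and removed from core in PHP 7.2, so on current PHP versions the same idea needs another library. Below is a minimal sketch of the encrypt/decrypt round trip using the OpenSSL extension instead. The function names `captcha_encrypt` and `captcha_decrypt`, the key derivation and the HMAC check are assumptions made for illustration; they are not part of the code above.

```php
<?php
// Hypothetical helpers (names and key handling are assumptions, not from the original post).

function captcha_encrypt(string $plain, string $secret): string
{
    $key = hash('sha256', $secret, true);               // 32 raw bytes, used as the AES-256 key
    $iv  = random_bytes(16);                             // fresh IV per token (AES block size)
    $ct  = openssl_encrypt($plain, 'aes-256-cbc', $key, OPENSSL_RAW_DATA, $iv);
    $mac = hash_hmac('sha256', $iv . $ct, $key, true);   // lets us detect a modified token
    return base64_encode($iv . $mac . $ct);
}

function captcha_decrypt(string $token, string $secret): ?string
{
    $raw = base64_decode($token, true);
    // 16 bytes IV + 32 bytes MAC + at least one 16-byte cipher block
    if ($raw === false || strlen($raw) < 64) {
        return null;
    }
    $key = hash('sha256', $secret, true);
    $iv  = substr($raw, 0, 16);
    $mac = substr($raw, 16, 32);
    $ct  = substr($raw, 48);
    if (!hash_equals(hash_hmac('sha256', $iv . $ct, $key, true), $mac)) {
        return null;                                     // tampered token or wrong key
    }
    $plain = openssl_decrypt($ct, 'aes-256-cbc', $key, OPENSSL_RAW_DATA, $iv);
    return $plain === false ? null : $plain;
}
```

The token contains `+`, `/` and `=`, so it would need `urlencode()` when placed in the image URL. Encryption alone does not stop a solved token from being submitted again; that would still need some server-side record of used tokens.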

Does anyone have any remarks on this?

chocolata
  • Are you sure that another process does not destroy the session? – Leri Jul 05 '12 at 09:45
  • Good remark. I'm pretty sure, since in 99% of cases it works flawlessly (and it's always the same process). – chocolata Jul 05 '12 at 09:46
  • Is the problem reproducible at your visitors computer consistently i.e. you clean cookies and everything and it still happens ? – bisko Jul 05 '12 at 09:46
  • Yesterday a Safari user reported this problem. I cleared cache with them, checked if cookies were enabled, but the problem still persisted. I however cannot reproduce the error... – chocolata Jul 05 '12 at 09:48
  • It's clear that nothing is wrong with **this** part of code. I'd suggest to debug every line which is processed when things go wrong. – Leri Jul 05 '12 at 09:51
  • How would I go about debugging if I cannot reproduce the error and it works in 99% of cases? I thought the problem might lie in the definition of the headers. Are those all ok? – chocolata Jul 05 '12 at 09:53
  • They seem to be ok. Can you tell us if there are any similarities between users which say that captcha does not work? – Leri Jul 05 '12 at 09:57
  • Good question. There do not seem to be similarities. Last report was from Safari, before that it was Internet Explorer 8. No consistency. Might an antivirus cause problems, or doesn't this have anything to do with it? – chocolata Jul 05 '12 at 09:59
  • Are you logging the failed requests? i.e. What number you believe to have sent them and what they entered. Have you inspected this data? Perhaps you can add a timestamp to this form and the number generator to see if there is potentially a second request being performed after they are presented with the form but before they submit. – Rawkode Jul 05 '12 at 10:01
  • No, I am not logging failed requests. Very good suggestion. I will build in some logging and inspect what happens. By the way, I just thought of something: What happens if the user opens two tabs of the same page? Would this overwrite the $_SESSION['randomnr2'] and cause errors? – chocolata Jul 05 '12 at 10:03
  • Yes, that is what I was hinting at with the second request ;-) – Rawkode Jul 05 '12 at 10:06
  • I just tested and indeed... Opening a second tab of the same page overwrites the captcha and yields an incorrect response when the user fills in the original form. Not sure how to remedy this though. – chocolata Jul 05 '12 at 10:08
  • For your new solution, seems to be perfect. Just some things you could improve: if there is an error while creating the image, output the error as an image, or the user will never be able to read it; it's time to start using PDO ! – Ugo Méda Jul 09 '12 at 10:09
  • The second solution, using encryption, could be a perfect idea. But, it makes it possible to re-use a token and validate an infinite number of forms using always the same pair crypted chain-solution. So don't use this. – Ugo Méda Jul 09 '12 at 10:11
  • Also, if you want it to be a little more bulletproof, use a random value as ID. Let's say i want to be a pain in the ass, i would make a bot to open the form, get the current ID, and send forms continuously using the next IDs, to delete tokens in the database. Users trying to send the form will never be able to send it as their tokens are being deleted :D – Ugo Méda Jul 09 '12 at 10:15
  • Thanks for your expert views. Indeed it's time for PDO! Good idea to make the ID random. I see the pitfall. Could you explain how the same pair crypted chain loop would be possible? Is there any way to avoid this, or in your view, would implementing the database-solution be necessary? – chocolata Jul 09 '12 at 11:38
  • Once you've found the number on the image, you just have to modify the hidden input to reuse the crypted chain, the corresponding number will always be the same and the server will never be able to check whether it's been used before. – Ugo Méda Jul 10 '12 at 08:48
  • I understand. I'll try to catch this upon form submit. Is this train of thought too paranoid: in the second solution, the user checks the parameter of the captcha image, reads the numbers on the captcha image, and then uses this number and encrypted number again and again by modifying the form client-side (changing the parameter of the image and the hidden field to accommodate the captcha he/she copied). Is this a little bit too paranoid? Because in that case your solution would be much more bulletproof. – chocolata Jul 10 '12 at 09:04

2 Answers


Don't rely only on the SESSION value, for two reasons :

  • Your session can expire, so it won't work in some cases
  • If the user opens another tab with the same page, you'll have a weird behavior

Use some sort of token :

  • Generate a random ID when you output your form, put it in your database with the number expected (and the current date/time)
  • Generate your image using this ID
  • Add a hidden input in your form with the ID
  • When you receive your POST, fetch the expected value from the database and compare it
  • Delete this token and all the old tokens (WHERE token = %token OR datetime < DATE_SUB(NOW(), INTERVAL 1 HOUR) for instance)
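
The steps above could be sketched roughly as follows with PDO. This is only an illustration under stated assumptions: the table `captcha_token`, its columns, and the function names `captcha_issue`, `captcha_lookup` and `captcha_check` are hypothetical, and timestamps are stored as Unix integers so the SQL stays portable.

```php
<?php
// Sketch only. Assumed schema (hypothetical):
//   captcha_token(token TEXT PRIMARY KEY, number TEXT, created_at INTEGER)

// Called when the form is rendered; the token goes into the hidden field and the image URL.
function captcha_issue(PDO $db): string
{
    $token  = bin2hex(random_bytes(16));          // unguessable ID, not an auto-increment
    $number = (string) random_int(1000, 9999);
    $stmt = $db->prepare(
        'INSERT INTO captcha_token (token, number, created_at) VALUES (?, ?, ?)'
    );
    $stmt->execute([$token, $number, time()]);
    return $token;
}

// Called by the image script to find the number it has to draw.
function captcha_lookup(PDO $db, string $token): ?string
{
    $stmt = $db->prepare('SELECT number FROM captcha_token WHERE token = ?');
    $stmt->execute([$token]);
    $number = $stmt->fetchColumn();
    return $number === false ? null : (string) $number;
}

// Called when the form is posted; every token can be checked only once.
function captcha_check(PDO $db, string $token, string $answer, int $maxAge = 3600): bool
{
    $stmt = $db->prepare('SELECT number, created_at FROM captcha_token WHERE token = ?');
    $stmt->execute([$token]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);

    // Delete this token and all tokens older than $maxAge seconds.
    $del = $db->prepare('DELETE FROM captcha_token WHERE token = ? OR created_at < ?');
    $del->execute([$token, time() - $maxAge]);

    return $row !== false
        && (int) $row['created_at'] >= time() - $maxAge
        && hash_equals((string) $row['number'], $answer);
}
```

Because each rendered form gets its own token, a second tab no longer overwrites the value the first tab expects.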
Ugo Méda
  • Clever bastard :-). It's clear to me now that relying on sessions for captcha purposes is not a good idea, because of the reasons you mentioned. I will try to create a fitting solution using your suggestions and get back to you post-haste! – chocolata Jul 05 '12 at 10:12
  • Hi, I've updated my answer with a worked out example of your suggestions. Also, I've presented a possible alternative on which I would like your views. – chocolata Jul 08 '12 at 17:28
  • Thank you for answering my question. I now have a working version of your solution. Still, could you have a look at my last comment about the other version? – chocolata Jul 10 '12 at 08:42

It sometimes happens that some visitors can be behind proxies or there is a plugin/software on their computer that can do double-request of some of the files. I have discovered this while developing a project of mine and had some Chrome plugin I have completely forgotten about.

As it is happening to so few of your visitors, it is possible that this is the case. Here are the steps I followed to debug the problem (keep in mind that this was a development environment and I was able to modify the code directly on the site):

When a visitor reports the problem, enable 'debugging' for them which means that I would add their IP to a debug array in the config of the captcha generator. This would do the following:

  1. Acquire the generation time of the image in microtime format.
  2. Write in a log file somewhere on the filesystem every request to the captcha page in a format similar to: ip|microtime|random_numbers
  3. Check the logs for the requests made by the user's IP address and see if there are any close requests in the ranges of about 10 seconds of each other. If there are, then there is something that is making a second request to your captcha page and it is generating a new code, which the visitor cannot see.
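
A rough sketch of steps 2 and 3, assuming a plain text log with one `ip|microtime|random_numbers` line per request. The function names `captcha_log` and `captcha_close_requests` and the optional `$now` parameter are hypothetical and only there for illustration.

```php
<?php
// Hypothetical helpers; the log format follows the ip|microtime|random_numbers idea.

// Append one line per request to the captcha script.
function captcha_log(string $file, string $ip, string $number, ?float $now = null): void
{
    $line = sprintf("%s|%.4F|%s\n", $ip, $now ?? microtime(true), $number);
    file_put_contents($file, $line, FILE_APPEND | LOCK_EX);
}

// Return pairs of consecutive request times from one IP that lie within $window seconds.
function captcha_close_requests(string $file, string $ip, float $window = 10.0): array
{
    $times = [];
    foreach (file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
        $parts = explode('|', $line, 3);
        if (count($parts) === 3 && $parts[0] === $ip) {
            $times[] = (float) $parts[1];
        }
    }
    sort($times);

    $pairs = [];
    for ($i = 1; $i < count($times); $i++) {
        if ($times[$i] - $times[$i - 1] <= $window) {
            $pairs[] = [$times[$i - 1], $times[$i]];
        }
    }
    return $pairs;
}
```

In the captcha script this would be called as something like `captcha_log($logFile, $_SERVER['REMOTE_ADDR'], (string) $randomnr)`, where `$logFile` is a path of your choosing. A non-empty result from `captcha_close_requests` for the reporting user's IP suggests that something requested the image a second time.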

Also you need to make sure that after clearing the user's cache, the user is seeing different numbers at every refresh of the page. There can be a quirky behavior on the browser's end and it can be showing an old cached copy nevertheless (seen it on Firefox, you have to clear the cache, restart the browser, clear the cache again and then it works fine).

If this is the case you can do a simple time based addition to your script that does the following:

When generating a new captcha image, check whether a captcha number is already set in the session. If it is, check when it was generated: if that was less than, say, 10 seconds ago, show the same number again; if it was more than 10 seconds ago, generate a new number. The only caveat of this method is that you must unset the captcha variable in the session every time you use it.

An example code would be:

<?php

// begin generating captcha:

session_start();

if (
   empty($_SESSION['randomnr2']) // there is no captcha set
   || empty($_SESSION['randomnr2_time'])  // there is no time set
   || ( microtime(true) - $_SESSION['randomnr2_time']  > 10 ) // time is more than 10 secs
) {
   $randomnr = rand(1000, 9999);
   $_SESSION['randomnr2'] = md5($randomnr);
   $_SESSION['randomnr2_time'] = microtime(true); // this is the time it was 
                                                  // generated. You can use it 
                                                  // to write in the log file
}


// ...
?>
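
The caveat about unsetting the session variable could look roughly like this on the page that checks the form. This is a sketch: the function name `captcha_verify` is hypothetical, and the session array is passed in by reference (normally `$_SESSION`) so the logic can be tried in isolation.

```php
<?php
// Hypothetical helper; $session is normally $_SESSION.
function captcha_verify(array &$session, string $input): bool
{
    $expected = $session['randomnr2'] ?? null;

    // Clear the stored code whether or not the answer is right,
    // so the next image request always generates a fresh number.
    unset($session['randomnr2'], $session['randomnr2_time']);

    return is_string($expected) && hash_equals($expected, md5($input));
}
```

Usage would be along the lines of `if (!captcha_verify($_SESSION, $_POST['norobot'] ?? '')) { echo 'Wrong captcha!'; }` after `session_start()`.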
bisko