36

By "honeypot", I mean more or less this practice:

#Register form
<style>
    .hideme{
        display:none;
        visibility: hidden;
    }
</style>
<form action="register.php" method="post">
    Your email: <input type="text" name="u-email" />
    Choose a password: <input type="password" name="passwd" />
    <div class="hideme">
        <!-- the label is for text-browser users -->
        Please leave this field blank: <input type="text" name="email" />
    </div>
    <input type="submit" value="Register" autocomplete=off />
</form>

//register.php
<?php
// Treat a missing field as empty so an absent key does not raise a notice.
if (($_POST['email'] ?? '') !== '') {
    die("You spammer!");
}
//otherwise, do the form validation and go on.
?>

more info here.

Obviously, the real fields are named with random hashes, and the honeypot fields can have various names (email, user, website, homepage, etc.) that a spambot usually fills in.

I love this technique because it spares users the annoyance of a CAPTCHA.

Do any of you have some experience with this technique? Is it effective?

Philosophist
Strae
  • 7
    Be careful with your field names when doing something like this. There are multiple automated form-fillers out there, and something meant to bait a spam bot might also bait a form filler. Try the form as given on me and you're going to call me a spammer; I will have no idea my system filled in the hidden "email" field. – Loren Pechtel Sep 01 '10 at 22:15
  • You're right, I forgot the `AUTOCOMPLETE=OFF` attribute on the honeypot field; however, it is not supported by all browsers. – Strae Sep 02 '10 at 07:34
  • 1
    Related: http://stackoverflow.com/questions/1577918/blocking-comment-spam-without-using-captcha lists a lot of bot/validation techniques like CAPTCHA, honeypot, Akismet, etc. If you're having trouble with spambots, it's definitely worth a read. – rlb.usa Sep 29 '10 at 19:51
  • 1
    Related: [Better Honeypot Implementation (Form Anti-Spam)](http://stackoverflow.com/questions/36227376/better-honeypot-implementation-form-anti-spam/36227377) – Nicholas Summers Mar 31 '16 at 20:09

4 Answers

23

Old question, but I thought I'd chime in, as I've been maintaining a module for Drupal (Honeypot), which uses the honeypot spam prevention method alongside a time-based protection (users can't submit the form in less than X seconds, and X increases exponentially with each consecutive failed submission). Using these two methods, I have heard of many, many sites (examples) that have eliminated almost all automated spam.

I have had better success with Honeypot + timestamp than I have with any CAPTCHA-based solution, because not only am I blocking most spammers, I'm also not punishing my users.
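The time-based part can be sketched roughly as follows. This is only an illustration of "X increases exponentially with each consecutive failed submission", not the Drupal module's actual code: the function names, the doubling factor, and the one-hour cap are all assumptions.

```php
<?php
// Hypothetical helpers; names, the doubling factor and the cap are assumptions.

/**
 * Minimum number of seconds a visitor must wait before submitting.
 * Doubles with each consecutive failed submission, up to $maxSeconds.
 */
function requiredDelay(int $baseSeconds, int $failedAttempts, int $maxSeconds = 3600): int
{
    // Clamp the exponent so the multiplication cannot overflow.
    $exponent = min(max($failedAttempts, 0), 20);
    return min($baseSeconds * (2 ** $exponent), $maxSeconds);
}

/**
 * True when the form came back sooner than the required delay.
 * Both timestamps are Unix times in seconds.
 */
function isTooFast(int $renderedAt, int $submittedAt, int $baseSeconds, int $failedAttempts): bool
{
    return ($submittedAt - $renderedAt) < requiredDelay($baseSeconds, $failedAttempts);
}
```

With a base of 5 seconds, a visitor with three consecutive failures would have to wait 40 seconds, so a bot that retries immediately keeps locking itself out for longer.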

geerlingguy
11

With the techniques below, I block 100% of spam.

  1. Honeypot field hidden with display:none. If it is filled in, run an extra script that collects the IP address and writes it to the .htaccess file as a deny from line.
  2. Count the number of URLs in the comment field. If there are too many, only warn, because this could be a human.
  3. Measure the time taken to post. If it is less than 5 seconds, show an error message and let them try again, because a human with an auto-fill plugin can submit pretty fast.
  4. Trim the .htaccess file daily with crontab so the deny lines don't exceed 30 (adjust accordingly).
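The first three checks in the list above could be combined along these lines. This is a minimal sketch under assumptions: the function names, the `comment` field name, and the limit of two URLs are hypothetical (the answer gives no URL threshold); only the 5-second limit and the honeypot field named `email` come from the thread.

```php
<?php
// Hypothetical names; the URL limit of 2 is an assumption.

/** Counts http:// and https:// occurrences in a comment, case-insensitively. */
function countUrls(string $comment): int
{
    return (int) preg_match_all('~https?://~i', $comment);
}

/**
 * Returns 'block', 'retry', 'warn' or 'accept' for one submission.
 * $post is the submitted form data, $elapsedSeconds the time spent on the page.
 */
function classifySubmission(array $post, int $elapsedSeconds, int $maxUrls = 2): string
{
    // Check 1: a filled honeypot is treated as a bot; the caller can then deny the IP.
    if (($post['email'] ?? '') !== '') {
        return 'block';
    }
    // Check 3: too fast, but possibly a human with auto-fill, so allow a retry.
    if ($elapsedSeconds < 5) {
        return 'retry';
    }
    // Check 2: too many links; warn only, because this can be a human.
    $comment = is_string($post['comment'] ?? null) ? $post['comment'] : '';
    if (countUrls($comment) > $maxUrls) {
        return 'warn';
    }
    return 'accept';
}
```

Only the honeypot result leads to an IP ban here; the other two outcomes leave room for a human to correct the submission.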

Denying access by IP address is very effective because bots keep trying to sneak in from the same IPs (if they change IP, the new IP goes into .htaccess too, so no problem). A daily cron job trims the .htaccess file automatically so it doesn't grow too big. I adjust the number of IPs kept so that a given bot IP stays blocked for about a week. I noticed that a bot uses the same IP for about 3 days, attacking several times.
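The deny-list bookkeeping might look something like this. It is a sketch only: the helper names are hypothetical, the list is handled as an in-memory array of lines (oldest first) rather than the real .htaccess file, and the IP validation step is an addition that the answer does not mention.

```php
<?php
// Hypothetical helpers operating on an array of "deny from ..." lines, oldest first.
// Reading and writing the actual .htaccess file is left to the caller.

/** Appends a deny line for $ip unless it is invalid or already listed. */
function addDeny(array $denyLines, string $ip): array
{
    // Assumption: validate first, so nothing but a plain IP can end up in the file.
    if (filter_var($ip, FILTER_VALIDATE_IP) === false) {
        return $denyLines;
    }
    $line = 'deny from ' . $ip;
    if (!in_array($line, $denyLines, true)) {
        $denyLines[] = $line;
    }
    return $denyLines;
}

/** Keeps only the newest $maxLines entries (what the daily cron job would do). */
function trimDenyLines(array $denyLines, int $maxLines = 30): array
{
    return $maxLines > 0 ? array_slice($denyLines, -$maxLines) : [];
}
```

Because the oldest entries are dropped first, the cap of 30 lines together with the rate of new bans determines how long any one IP stays blocked.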

Trick #1 blocks about 99% and #2 blocks about 1%; bots don't get past those two, so #3 might not be necessary.

mowmow-guest
  • The above routine seems to have actually increased bot attempts. I guess the bots are going crazy because they are denied access to the server entirely. I think they are trying to work out how to fill in the form, but their access is denied from the second attempt onward, so they can't tell what went wrong. I'm hoping they will be discouraged and stop trying as time goes by. – mowmow-guest Aug 14 '14 at 05:26
  • 1
    "honeypot with display:none. if failed..." What does that mean, if display: none fails? You mean if the bot still submits? Then how do you know it's a bot and programmatically add it to .htaccess? I'm either confused or this is manual work. – Goose Jan 26 '17 at 16:43
  • 1
    @Goose If the bot fills in the hidden honeypot field, add it to the IP blocklist. (I strongly advise pairing this with `autocomplete="off"` so legitimate users of form auto-fill plugins don't risk getting blocked.) – ssokolow Jul 13 '19 at 06:46
8

I've used the honeypot captcha on three forms since about 2010, and it's been stunningly effective with no modifications until very recently. We've just made some changes that we think will stop most of the spambots, at least until they get more sophisticated. In broad strokes, here's the way we've set it up:

One input field on each form is hidden (via a CSS class that sets display:none) with a default value of "". For screen readers and such, the hidden input's label makes it clear that the field must be left empty. Because the field is empty by default, we use server-side code (ColdFusion in our case, but it could be any language) to stop the form submission if anything at all is in that field. When we interrupt the submission that way, we give the same user feedback as if it were successful ("Thank you for your comment" or something similar), so there is no outward indication of failure.
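In PHP rather than ColdFusion, the silent rejection described above might be sketched like this. The handler name, the `website` honeypot field name, and the `$save` callback are hypothetical stand-ins, not the author's code.

```php
<?php
// Hypothetical handler; $save is whatever callable actually stores the comment,
// and 'website' is an assumed name for the hidden honeypot field.

function handleComment(array $post, callable $save): string
{
    $honeypot = $post['website'] ?? '';

    // Store the comment only when the honeypot was left empty.
    if ($honeypot === '') {
        $save($post);
    }

    // Same feedback either way, so a bot gets no outward sign of failure.
    return 'Thank you for your comment';
}
```

Returning one message for both paths is the point of the design: a bot author watching responses cannot tell a dropped submission from an accepted one.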

But over time, the bots wised up and the simplest of our forms was getting hammered with spam. The forms with front-end validation held up well, and I suppose that's because they also don't accept just any old text input, but require an email address to be structured like an email address, and so on. The one form that proved vulnerable had only a text input for comments and two optional inputs for contact information (phone number and email); importantly, I think, none of those inputs included front-end validation.

It will be easy enough to add that validation, and we'll do that soon. For now, though, we've added what others have suggested in the way of a "time trap." We set a time variable when the page loads and compare that timestamp to the time the form is submitted. At the moment we're allowing submission after 10 seconds on the page, though some people have suggested three seconds. We'll make adjustments as needed. I want to see what effect this alone has on the spam traffic before adding the front-end validation.
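One way to build such a time trap is sketched below. The answer does not say where the page-load timestamp is kept, so this assumes it travels in a hidden form field and is signed with an HMAC so a bot cannot simply backdate it; the function names and the secret are hypothetical, and only the 10-second threshold comes from the text.

```php
<?php
// Assumption: the page-load time is sent in a hidden field as "timestamp:signature".
// Function names and the signing scheme are illustrative, not the author's setup.

/** Builds the token to embed in the form when the page is rendered. */
function issueToken(int $now, string $secret): string
{
    return $now . ':' . hash_hmac('sha256', (string) $now, $secret);
}

/** True when the token is authentic and at least $minSeconds old. */
function passesTimeTrap(string $token, int $now, string $secret, int $minSeconds = 10): bool
{
    $parts = explode(':', $token, 2);
    if (count($parts) !== 2 || !ctype_digit($parts[0])) {
        return false; // malformed token
    }
    [$issuedAt, $signature] = $parts;

    // Constant-time comparison of the expected and submitted signatures.
    if (!hash_equals(hash_hmac('sha256', $issuedAt, $secret), $signature)) {
        return false; // timestamp was altered or signed with another key
    }
    return ($now - (int) $issuedAt) >= $minSeconds;
}
```

In use, `issueToken(time(), $secret)` would go into the form on render and `passesTimeTrap($_POST['token'] ?? '', time(), $secret)` would run on submit; storing the timestamp in the server-side session instead would avoid the signature entirely.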

So the quick summary of my experience is this: The honeypot works pretty well as it was originally conceived. (I don't recall where I found it first, but this post is very similar to the first I saw about it more than a decade ago.) It seems even more effective with the addition of client-side validation enabled by HTML5. And we think it will be even better with the server-side limits we've now imposed on those too-hasty submissions.

Lastly, I'll mention that solutions like reCaptcha are off the table for us. We spent significant time developing a web app using Google's map API, and it worked great until Google changed their API without warning and without transition advice. We won't marry the same abusive spouse twice.

Jeff Seager
5

It works relatively well. However, if the bot creator targets your page specifically, they will see the hidden field (or even have a routine set up to check for it) and will most likely modify their bot accordingly.

My preference is to use reCaptcha. But the above will stop some bots.

Bert H
Jim
  • 4
    A lot of bots still get past reCaptcha on my site :\ – Andy E Sep 01 '10 at 22:05
  • You could also look into implementing http://www.akismet.com on your site, but this is generally for comment spam. And remember that reCaptcha and the honeypot will not thwart human spammers. – Jim Sep 01 '10 at 22:12
  • 1
    Akismet is good, but if possible, I'd love a way that doesn't rely on third-party services. – Strae Sep 02 '10 at 10:29
  • @Jim I have been using reCaptcha for a long time, but it makes my site's score drop a lot. Any idea how to deal with reCaptcha's resources? I tried preload, prefetch and preconnect; none of them helped. – Jornes Aug 18 '23 at 16:04