101
How do we get rid of these spambots on our site?

Every site falls victim to spambots at some point. How you handle it can affect your customers, and most solutions can discourage some people from filling out your forms.

That's where the honeypot technique comes in. It allows you to ignore spambots without forcing your users to fill out a captcha or jump through other hoops to fill out your form.

This post is purely to help others implement a honeypot trap on their website forms.


Update:

Since implementing the honeypot below on all of my clients' websites, we have successfully blocked 99.5% of all our spam (thousands of submissions). That is without using the techniques mentioned in the "advanced" section, which will be implemented soon.

Nicholas Summers

4 Answers

170

Concept

By adding an invisible field to your forms that only spambots can see, you can trick them into revealing that they are spambots and not actual end-users.

HTML

<input type="checkbox" name="contact_me_by_fax_only" value="1" style="display:none !important" tabindex="-1" autocomplete="off">

Here we have a simple checkbox that:

  • Is hidden with CSS.
  • Has an obscure but obviously fake name.
  • Has a default value equivalent to 0 (it is unchecked, so nothing is submitted unless a bot checks it).
  • Can't be filled by auto-complete.
  • Can't be navigated to via the Tab key. (See tabindex)

Server-Side

On the server side we want to check to see if the value exists and has a value other than 0, and if so handle it appropriately. This includes logging the attempt and all the submitted fields.

In PHP it might look something like this:

# A real user never sees the checkbox, so any non-empty value means a bot checked it.
$honeypot = !empty($_REQUEST['contact_me_by_fax_only']);
if ($honeypot) {
    # log_spambot() stands in for your own logging function
    log_spambot($_REQUEST);
    # treat as spambot
} else {
    # process as normal
}

Fallback

This is where the log comes in. In the event that somehow one of your users ends up being marked as spam, your log will help you recover any lost information. It will also allow you to study any bots running on your site, should they be modified in the future to circumvent your honeypot.
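As a rough illustration of what such a log could capture, here is a minimal sketch in Ruby. The helper name mirrors the `log_spambot` call above, but its body, the file path, and the JSON-lines format are all assumptions made for this example, not part of the original setup.

```ruby
require "json"
require "time"

# Hypothetical helper: append one JSON line per flagged submission so a
# false positive can be recovered later. Path and key names are assumptions.
def log_spambot(params, ip:, path: "spambot.log", now: Time.now)
  entry = {
    "time"   => now.utc.iso8601, # when the submission was flagged
    "ip"     => ip,              # useful for later reporting
    "fields" => params           # every submitted field, untouched
  }
  File.open(path, "a") { |f| f.puts(JSON.generate(entry)) }
  entry
end
```

Keeping every submitted field in the entry is what makes recovery possible if a real user is ever flagged by mistake.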

Reporting

Many services, such as CloudFlare, allow you to report known spambot IPs via an API or by uploading a list. Please help make the internet a safer place by reporting all the spambots and spam IPs you find.

Advanced

If you really need to crack down on a more advanced spambot, there are some additional things you can do:

  • Hide honeypot field purely with JS instead of plain CSS
  • Use realistic form input names that you don't actually use. (such as "phone" or "website")
  • Include form validation in the honeypot algorithm. (Most end users will only get 1 or 2 fields wrong; spambots will typically get most of the fields wrong.)
  • Use a service like CloudFlare that automatically blocks known spam IPs
  • Have form timeouts, and prevent instant posting. (forms submitted in under 3 seconds of the page loading are typically spam)
  • Prevent any IP from posting more than once a second.
  • For more ideas look here: How to create a "Nuclear" honeypot to catch form spammers
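
Two of the ideas above, the minimum fill time and the per-IP posting limit, could be sketched roughly as follows in Ruby. The class name, the return symbols, and the in-memory hash are assumptions for illustration; the 3-second and 1-second thresholds come from the list above.

```ruby
# Hypothetical helper combining two of the checks listed above.
# All names here are illustrative assumptions.
class SubmissionGuard
  MIN_FILL_SECONDS  = 3 # forms sent faster than this are treated as spam
  MIN_POST_INTERVAL = 1 # minimum seconds between posts from one IP

  def initialize
    @last_post_at = {} # ip => time of that IP's most recent post
  end

  # rendered_at: when the server rendered the form (e.g. kept in the session)
  # now:         when the submission arrived
  # Returns :ok, :too_fast, or :rate_limited.
  def check(ip:, rendered_at:, now: Time.now)
    previous = @last_post_at[ip]
    @last_post_at[ip] = now
    return :rate_limited if previous && (now - previous) < MIN_POST_INTERVAL
    return :too_fast if (now - rendered_at) < MIN_FILL_SECONDS
    :ok
  end
end
```

The hash only lives inside one process, so a site running several workers would need a shared store instead. Anything other than `:ok` can be handled the same way as a tripped honeypot, including logging.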
Nicholas Summers
  • 22
    "Has a default value equivalent 0", but the example has value="1" ? Is that intended? – David d C e Freitas Jun 27 '17 at 07:19
  • can't the bots just check if the field is visible? – edank Feb 05 '18 at 22:31
  • 2
    @edank With that comes some limitations, for example if they only look at fields that are not `display:none`, `visibility: hidden`, or `opacity: 0` they won't find any of the forms that are not on screen when the page is initially rendered (which is **very** common), not to mention most bots don't even fetch css/js files (why would they when they only really care about HTML `<form>` elements). So while they **could** try to detect a honeypot's css, it's simply not worth it. There is actually a ton of complications with detecting what's "visible" but what I just said is the most common reason.
    – Nicholas Summers Feb 06 '18 at 02:08
  • 2
    Since you've already hidden it with display:none (or CSS), doesn't tabindex=-1 become redundant? I.e. hidden fields can't be tabbed to anyway? I worry that tabindex=-1 gives bots a nice easy way to find what your honeypot fields are! – Aaron Aug 10 '18 at 00:38
  • @Aaron It is but: 1) There are other ways to hide an input (such as in the overflow or off page) and it's important that a normal browser never navigates to it. 2) Any invalid value should work, not just -1. So you could use "hippopotamus" or "false". – Nicholas Summers Aug 10 '18 at 00:44
  • 9
    @DaviddCeFreitas -- the checkbox value is "1". But it will only submit that value if the user checks the box. The bots will check the box and the php will read "1" as boolean TRUE and detect the bot. – tandy Oct 09 '18 at 20:36
  • WARNING: this will actually not work if you submit your form via AJAX or some async request library, and that's because if you use `document.querySelector('[type="checkbox"]').value`, it will spit out the "1" value instead of 0, so instead you should use the `checked` property, which should say `false` on page load, to check if a bot marked the checkbox – OzzyTheGiant Mar 26 '20 at 16:33
  • 2
    Changing the css to display:none could be considered as not future proof - potential future bots would be able to detect the css inline. Perhaps a better alternative is to implement a class of a non-obvious class name and use a style sheet. A spam bot would have to download the correct sheet, and 'guess' which class name applied to a specific form. Alternatively using javascript to 'hide' the input would make this more future proof - see yodarunamok's implementation. – Danny F Jun 27 '20 at 18:28
  • 1
    @DannyF Please read the entire answer and comments before making suggestions to improve it. This has all been covered already. Furthermore, this answer starts out as basic as possible and then suggests improvements. This is done to keep the barrier to entry as low as possible while still giving as much information as possible. – Nicholas Summers Jun 27 '20 at 18:44
  • I read the entire answer and the comments. Yes, I did say that using js to future proof the implementation was already spelled out. However, nowhere on the entire page was there any mention of using a style sheet to further obfuscate the honeypot implementation. I would hardly think that a css style sheet is particularly hard to implement, and it makes the code more robust. – Danny F Jun 27 '20 at 19:03
  • The autocomplete attribute is not valid on checkboxes W3C validator throws error: 'Attribute “autocomplete” is only allowed when the input type is “color”, “date”, “datetime-local”, “email”, “hidden”, “month”, “number”, “password”, “range”, “search”, “tel”, “text”, “time”, “url”, or “week”.' – J Grover Nov 01 '20 at 19:57
  • @jgrover W3C is a useful resource, but it isn't the best measuring stick. For that we need to look at the browsers themselves. Mozilla provides good documentation for them and so does caniuse. Actually, most of this answer is based on Google's own docs of how autocomplete should work. – Nicholas Summers Nov 01 '20 at 22:58
  • 1
    It's 2022, is this still the go-to solution if you don't want to use a reCaptcha (who wants at all ^^)? – moritzgvt Aug 09 '22 at 13:23
  • @moritzgvt - I would say it is still a very valid solution and I don't know of an alternative. However, I do know that most spam scrapers that I have seen haven't needed to change much, since most sites don't implement any protections at all. – Nicholas Summers Aug 10 '22 at 20:37
21

We found that a slight (though simple) variation on the suggestions here made a huge difference in the effectiveness of our contact form honeypot. In short, change the hidden field to a text input, and make the bot think it's a password. Something like this:

<input type="text" name="a_password" style="display:none !important" tabindex="-1" autocomplete="off">

You'll note that this mock-password input keeps to the same basic guidelines as the checkbox example. And yes, a text input (as opposed to an actual password input) seems to work just fine.
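
On the server, a text honeypot is checked a little differently from the checkbox: any non-empty value means something filled it in. A minimal sketch in Ruby, where the method name is an assumption and the field name `a_password` is taken from the markup above:

```ruby
# Hypothetical check for a text-type honeypot. `params` is assumed to be a
# hash of submitted form fields keyed by field name.
def honeypot_tripped?(params, field: "a_password")
  # A real user never sees the field, so it arrives empty or not at all.
  !params[field].to_s.empty?
end
```

A submission where this returns true can be logged and discarded exactly as in the checkbox version.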

This apparently minor change resulted in a drastic drop in spam for us.

yodarunamok
  • Could you elaborate on why this is more effective? The expectation is that they will still fill it out? Or is it that they will think it is a password field and leave with no submission? Why not use a password field vs text? – deflime Aug 26 '20 at 17:04
  • 1
    @deflime The point of the field is for bots to fill it out, thus informing us that the request is spam. The idea of a text field honeypot is to make it appealing to a bot (where a password field would be less so.) That said, because I don't know how the bots are coded, I can't be specific about why it works. Should you do some testing of your own I'd be interested in your findings. – yodarunamok Aug 26 '20 at 21:47
  • So far, running three honeypots, one being a type="password" version, the bots will fill either all 3 or none at all... so they seem equal at this point. No false positives though. – deflime Aug 28 '20 at 06:51
13

One suggestion to really force no autocompletion:
replace autocomplete="off" with autocomplete="nope" or autocomplete="false".

Since the given value is not one the browser recognizes (valid values are on, off, or one of the defined autofill field names), the browser should stop trying to fill the field.

For more details, see How to Turn Off Form Autocompletion.

Hope this helps.

SYA :)

LebCit
  • 1
    It looks like you want `autocomplete="new-password"` nowadays. Password managers have a habit of ignoring `autocomplete=off` and I've had a lot of trouble with Safari on Mac – Mark Jerzykowski Dec 15 '22 at 10:03
3

If you are using Ruby on Rails, you can try the invisible_captcha gem, a solution based on this honeypot technique.

It works pretty well, at least for small/medium sites. I've been using it in production for years in several Rails apps with very good results (we have hardly received any spam since adding it to "contact" forms, sign-up, etc).

It also provides some extras (already listed in https://stackoverflow.com/a/36227377/3033649):

  • time-sensitive submissions
  • IP based spinner validation

Basic usage

In your form:

<%= form_for(@user) do |f| %>
  <%= invisible_captcha %>
  ...
<% end %>

In your controller:

class UsersController < ApplicationController
  invisible_captcha only: [:create]
  ...
end

And you're done! Hope it helps!

markets