1

There is a site/resource that offers some general statistical information as well as a search interface. These search operations are costly, so I want to restrict frequent, continuous (i.e. automated) search requests (from people, not from search engines).

I believe there are many existing techniques and frameworks that provide some intelligent scraping protection, so I shouldn't have to reinvent the wheel. I'm using Python and Apache through mod_wsgi.

I am aware of mod_evasive (will try to use it), but I'm also interested in any other techniques.

Evhz
Roman Bodnarchuk

2 Answers

1

If someone is targeting your website specifically and the data there is really valuable, nothing will stop a smart enough attacker.

Still, there are some things worth trying:

  • Keep counters of search usage per IP address and User-Agent. Block a client once it reaches a per-minute, hourly, or daily threshold.
  • Use blacklists of potentially harmful IPs or threat levels (for example, you can use the Cloudflare API for that).
  • Cache frequent search results to make them less costly.
  • It's probably a bit crazy, but you can render the statistics as images or via Flash/Java applets. That makes them much more challenging to scrape.
  • Somewhat similar to the previous one: use some tricky API to deliver search results, for example Protocol Buffers over WebSockets. A scraper will then probably need a full-blown browser, or will at least have to build some trickery around node.js. The downside is that you'll lose legitimate clients using old browsers.
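The first bullet (per-client counters with a threshold) could be sketched as a small WSGI middleware, which fits a Python application served through mod_wsgi. This is only a minimal sketch under stated assumptions: the class name `SearchRateLimiter`, the `/search` path prefix, and the default limit of 30 requests per 60 seconds are all hypothetical, and the counters live in the memory of a single process. With several mod_wsgi processes you would likely need a shared store instead.

```python
import threading
import time
from collections import defaultdict, deque


class SearchRateLimiter:
    """WSGI middleware: sliding-window limit per (IP, User-Agent) pair.

    Hypothetical sketch; names and defaults are assumptions, not a real API.
    """

    def __init__(self, app, limit=30, window=60.0,
                 path_prefix="/search", clock=time.monotonic):
        self.app = app
        self.limit = limit              # max requests allowed per window
        self.window = window            # window length in seconds
        self.path_prefix = path_prefix  # only these URLs are rate limited
        self.clock = clock              # injectable for testing
        self.hits = defaultdict(deque)  # key -> timestamps of recent requests
        self.lock = threading.Lock()    # mod_wsgi may run several threads

    def allow(self, key):
        """Record a request for `key`; return False if it is over the limit."""
        now = self.clock()
        with self.lock:
            recent = self.hits[key]
            # Drop timestamps that have fallen out of the window.
            while recent and now - recent[0] >= self.window:
                recent.popleft()
            if len(recent) >= self.limit:
                return False
            recent.append(now)
            return True

    def __call__(self, environ, start_response):
        if environ.get("PATH_INFO", "").startswith(self.path_prefix):
            key = (environ.get("REMOTE_ADDR", ""),
                   environ.get("HTTP_USER_AGENT", ""))
            if not self.allow(key):
                body = b"Too many search requests; try again later.\n"
                start_response("429 Too Many Requests", [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                    ("Retry-After", str(int(self.window))),
                ])
                return [body]
        # Not a search URL, or still under the limit: pass through.
        return self.app(environ, start_response)
```

To use it, wrap the WSGI callable that mod_wsgi loads, e.g. `application = SearchRateLimiter(application)`. Note that a client can change its User-Agent to get a fresh counter, so keying on the IP address alone may be the stricter choice.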
Ivan Blinkov
0

You could try a robots.txt file. I believe you just put it at the root of your application, but that website should have more details. The Disallow syntax is what you're looking for.

Of course, not all robots respect it, but they all should. All the big companies (Google, Yahoo, etc.) will.
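As a rough illustration of the Disallow syntax, here is a sketch that checks a rule set with Python's standard `urllib.robotparser`, the same way a well-behaved crawler would interpret it. The `/search` and `/stats` paths and the `ExampleBot` name are assumptions made up for the example; substitute the real search URL of the application.

```python
from urllib.robotparser import RobotFileParser

# Contents of a hypothetical robots.txt served from the site root.
# "/search" is an assumed URL prefix for the costly search pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A compliant crawler asks before fetching each URL.
search_allowed = parser.can_fetch("ExampleBot", "/search?q=python")
stats_allowed = parser.can_fetch("ExampleBot", "/stats")

print(search_allowed)  # False: the search pages are disallowed
print(stats_allowed)   # True: everything else stays crawlable
```

This only shows what the rules mean to a crawler that chooses to honor them; nothing in robots.txt is enforced on the server side.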

You may also be interested in this question about disallowing dynamic URLs.

Community
Peter Downs