1

Without trying to sound like I am creating a movie plot - here is a scenario:

A website creates quotes for clients using an intricate classified quoting system.

Someone with malicious intent creates a bot that requests quotes across a wide range of parameters, essentially copying your quoting model.

Question: How do you approach detection and prevention for this? The web requests are essentially legitimate.

Some of my thoughts were to monitor request frequency by IP address.

Do you approach this at the server level (e.g. IIS logs and settings) or at the application/code level? Are there third-party applications or plugins to monitor this kind of thing?

I fear this may cause a discussion, rather than a direct answer, but I am looking for some starting point and direction.

Davy Jones
  • "monitor request frequency by IP address." would be the first step, but a smart leecher would use multiple addresses. The problem is essentially unsolvable. Forcing users to register with a hard-to-script process might help a little. – H H May 08 '15 at 15:06
  • Do the calls require any kind of login? If so, you could monitor request frequency by user. That would be better than using the IP. You could then build in a filter that prevents more than (e.g.) 50 quotes being requested by a single user on a single day. But, given enough time, even that wouldn't prevent someone reverse engineering your logic. – Ulric May 08 '15 at 15:07
  • Authentication would be a first step; vetting who gets a login to your application would be another. Do you actually need to allow users to sign up from your web site? Also, you should be able to do something like send an encrypted token to the client that would be required for further interaction. – John Saunders May 08 '15 at 17:15

2 Answers

1

Is it a human or a bot?

Start with a CAPTCHA to separate bots from humans. Google invests in the technology and has some sophisticated methods.

IP is a signal to a fraud system, not a solution

You can try limiting by IP address, but that breaks down because there is no one-to-one mapping between an IP address and a person. Many mobile and corporate users are NATed behind a single public IP.
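To use request frequency as a signal rather than a hard block, one option is a sliding-window counter keyed by whatever identifier you have (IP, user ID, session). This is only a minimal in-memory sketch; the class name, the 60-second window and the sample address are assumptions for illustration, not part of any particular framework.

```python
import time
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Counts recent requests per key (IP, user ID, session, ...)."""

    def __init__(self, window_seconds=60.0):
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of recent requests

    def record(self, key, now=None):
        """Record one request and return how many fall inside the window."""
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        hits.append(now)
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        return len(hits)


counter = SlidingWindowCounter(window_seconds=60.0)
# Hypothetical traffic: 5 requests from one address within 4 seconds.
counts = [counter.record("203.0.113.7", now=float(t)) for t in range(5)]
later = counter.record("203.0.113.7", now=120.0)  # earlier hits have expired
```

The returned count is meant to feed a score, not to trigger a block on its own, since a busy office behind one NAT address would produce the same numbers as a bot.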

Know your baseline (and other signals)

  • Measure your typical number of quotes / minute.
  • Rank the quality/uniqueness/etc. of input from the user. (Is the user's name "asdf 123"?)
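As a rough illustration of these two signals, the sketch below flags a quote rate that sits well above the measured baseline and applies a crude quality check to a name field. The function names, the `k = 3` cut-off, the filler word list and the sample numbers are all assumptions made up for this example.

```python
from statistics import mean, stdev


def is_above_baseline(history, current, k=3.0):
    """Return True when `current` exceeds mean + k * stdev of `history`."""
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    threshold = mean(history) + k * stdev(history)
    return current > threshold


def looks_like_filler(name):
    """Crude input-quality check: flags keyboard-mash style names."""
    filler = {"asdf", "qwerty", "test", "aaa", "xxx"}  # hypothetical list
    tokens = name.strip().lower().split()
    return not tokens or any(t in filler or t.isdigit() for t in tokens)


quotes_per_minute = [4, 6, 5, 7, 5, 6, 4, 5]  # hypothetical baseline samples
spike = is_above_baseline(quotes_per_minute, 40)
normal = is_above_baseline(quotes_per_minute, 7)
```

A real filter would need far more care with names (plenty of legitimate input looks odd), so this kind of check is better treated as one weak signal among several.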

Use the signals as an overall score

Once you have the signals, you can combine them into an overall risk score.
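One simple way to combine signals is a weighted sum clamped to the range 0 to 1. The sketch below assumes four hypothetical signals and made-up weights; real weights would have to be tuned against your own traffic.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    requests_last_minute: int
    shares_busy_ip: bool
    filler_input: bool
    failed_captcha: bool


# Hypothetical weights; in practice they would be tuned against real traffic.
WEIGHTS = {"rate": 0.4, "ip": 0.1, "input": 0.2, "captcha": 0.3}


def risk_score(s, rate_ceiling=30):
    """Combine the signals into a score between 0.0 and 1.0."""
    rate = min(s.requests_last_minute / rate_ceiling, 1.0)
    return (
        WEIGHTS["rate"] * rate
        + WEIGHTS["ip"] * s.shares_busy_ip
        + WEIGHTS["input"] * s.filler_input
        + WEIGHTS["captcha"] * s.failed_captcha
    )


ordinary = risk_score(Signals(2, True, False, False))
suspicious = risk_score(Signals(45, True, True, False))
```

Note that the IP signal carries the smallest weight here, in line with the point that IP is a signal, not a solution.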

The system should not be only binary - block or allow

If you only have two options -- block or allow -- then you're either going to piss people off, or not be strict enough to prevent abuse. Build a middle ground.

For example, stop returning completed quotes on the page; instead, ask the user for their email address and email them the complete quote.

People are better at fraud detection than computers

Once the riskiness threshold is hit, have a person review the request and, if it seems legit, release the email.
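The non-binary idea can be sketched as a mapping from a risk score to one of three actions. The two thresholds and the action names are assumptions for illustration; the split into exactly three tiers is one possible reading of the middle ground described above.

```python
from enum import Enum


class Action(Enum):
    SHOW_QUOTE = "show the quote on the page"
    EMAIL_QUOTE = "ask for an email address and send the quote there"
    HOLD_FOR_REVIEW = "queue the request for a person to review"


def decide(score, email_threshold=0.4, review_threshold=0.7):
    """Map a risk score in [0, 1] to an action; thresholds are hypothetical."""
    if score >= review_threshold:
        return Action.HOLD_FOR_REVIEW
    if score >= email_threshold:
        return Action.EMAIL_QUOTE
    return Action.SHOW_QUOTE
```

For example, `decide(0.5)` returns `Action.EMAIL_QUOTE`, so a borderline visitor still gets a quote, just more slowly and tied to an address.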

This is how PayPal was able to create a fraud detection system at scale.

Jonathan
-1

Some thoughts, in no particular order:

  • You could do some machine learning and visitor flow analysis: regardless of the IP, the bot will follow a set of programmed instructions, most likely stepping through all your available quotes. Of course, the offending actor could throw in some "random" actions so the bot doesn't always show exactly the same behavior.
  • At the same time, you can expect similar chunks of traffic to be associated with the same IPs. If you monitor that, you could start seeing some patterns.
  • You can also do some bot detection with JavaScript, e.g. were there any mouse clicks, and if so, where? Were the x,y coordinates different from (0,0)?
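Two of these ideas can be sketched on the server side, assuming a client-side script reports click positions along with each quote request. The payload shape, the function names and the parameter names (`age`, `cover`) are hypothetical, and anything reported by the client can be forged, so these are weak signals at best.

```python
def clicks_look_human(clicks):
    """`clicks` is a list of (x, y) positions reported by a client-side script."""
    if not clicks:
        return False  # no pointer activity at all
    # Scripted clicks are often dispatched without coordinates, i.e. at (0, 0).
    return any((x, y) != (0, 0) for x, y in clicks)


def sweep_ratio(requests):
    """Distinct parameter combinations divided by total requests."""
    if not requests:
        return 0.0
    distinct = {tuple(sorted(r.items())) for r in requests}
    return len(distinct) / len(requests)


# Hypothetical bot: steps through every combination exactly once.
bot = [{"age": a, "cover": c} for a in range(20, 30) for c in ("basic", "full")]
# Hypothetical person: repeats the same quote, then tweaks one option.
person = [{"age": 34, "cover": "basic"}] * 3 + [{"age": 34, "cover": "full"}]
bot_ratio = sweep_ratio(bot)
person_ratio = sweep_ratio(person)
```

A ratio close to 1.0 over a large number of requests suggests a systematic sweep of the quoting model, whereas people tend to repeat or lightly vary a handful of inputs.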
Fisher