I'm trying to build a script that shows me a list of IPs that belong to bots/spiders.
I wrote a script that imports the Apache access log into a MySQL database, so I can work with it using PHP and MySQL.
I've noticed that a lot of bots request pages at regular intervals, for example one request every 2 or 3 seconds. Is there an easy way to show these patterns with a query or a PHP script? Or, even harder I think, is there an algorithm that can recognise these bots/spiders?
Table definition:
CREATE TABLE IF NOT EXISTS `access_log` (
  `IP` varchar(16) NOT NULL,
  `datetime` datetime NOT NULL,
  `method` varchar(255) NOT NULL,
  `status` varchar(255) NOT NULL,
  `referrer` varchar(255) NOT NULL,
  `agent` varchar(255) NOT NULL,
  `site` smallint(6) NOT NULL
);
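
To make the idea concrete, here is a rough sketch of the kind of check I have in mind. It is in Python rather than PHP, only to illustrate the logic. It assumes the rows have already been fetched from `access_log` as (IP, datetime) pairs. The function name `find_regular_ips` and the thresholds `min_requests` and `max_cv` are made up for illustration and are not tuned values. It groups requests per IP, measures the gaps between consecutive requests, and flags IPs whose gaps barely vary.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev


def find_regular_ips(rows, min_requests=5, max_cv=0.2):
    """Return {ip: (average_gap_seconds, coefficient_of_variation)}.

    rows: iterable of (ip, datetime) pairs, e.g. the IP and datetime
    columns of access_log. Both thresholds are assumptions.
    """
    # Collect all request timestamps per IP.
    times_by_ip = defaultdict(list)
    for ip, ts in rows:
        times_by_ip[ip].append(ts)

    suspects = {}
    for ip, times in times_by_ip.items():
        # Too few requests to say anything about a pattern.
        if len(times) < min_requests:
            continue

        times.sort()
        # Seconds between each request and the next one from the same IP.
        gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]

        avg = mean(gaps)
        if avg == 0:
            # Every request landed in the same second; skipped here only
            # to avoid dividing by zero.
            continue

        # Coefficient of variation: spread of the gaps relative to their
        # average. A value near 0 means very regular intervals.
        cv = pstdev(gaps) / avg
        if cv <= max_cv:
            suspects[ip] = (avg, cv)

    return suspects
```

An IP that sends a request every 2 seconds gets a coefficient of variation close to 0 and is flagged. A visitor with gaps of 1, 40, 3 and 120 seconds is not. This would only catch bots with steady timing, so it is probably not enough on its own.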