I've got a fairly new website (~3 weeks old) running on Tomcat with, so far, a pretty low number of visitors.
For the last week I've noticed 1,000+ active sessions, and checking Tomcat's localhost_access* logs shows that the overwhelming majority are coming from IPs in this range: 119.63.196.*, which all appear to belong to Baidu Japan.
Here's a small example from the logs of them hitting the front page:

    119.63.196.107 - - [24/Aug/2011:07:02:46 +0000] "GET /;jsessionid=94085F76780ACFD96C8109A29446288D HTTP/1.1" 200 10311
    119.63.196.44 - - [24/Aug/2011:07:03:21 +0000] "GET /;jsessionid=943133C77BB1756CF11592115BA81725 HTTP/1.1" 200 10333
    119.63.196.39 - - [24/Aug/2011:07:03:56 +0000] "GET /;jsessionid=9B4384BDECF540C8628467F7AB4AB463 HTTP/1.1" 200 10311
    119.63.196.19 - - [24/Aug/2011:07:04:31 +0000] "GET /;jsessionid=A0B555C3A18377D993B97D4491DD1012 HTTP/1.1" 200 10311
    119.63.196.45 - - [24/Aug/2011:07:05:10 +0000] "GET /;jsessionid=A3782FA61558BF11C4D5AC4F3DD1EC86 HTTP/1.1" 200 10311
    119.63.196.23 - - [24/Aug/2011:07:05:53 +0000] "GET /;jsessionid=A3AF84EF13F21492EB47FAB001A1C2E5 HTTP/1.1" 200 10311
    119.63.196.120 - - [24/Aug/2011:07:06:31 +0000] "GET /;jsessionid=A7C490CEC2C7F2969772AC4050C6D761 HTTP/1.1" 200 10311
    119.63.196.108 - - [24/Aug/2011:07:07:07 +0000] "GET /;jsessionid=A7F769D354CB37E99843292D650D6367 HTTP/1.1" 200 10311
No single IP is clobbering the site, but the collective requests from this IP range are racking up active sessions. They also seem to do it in a somewhat coordinated fashion: one page at a time gets targeted and receives ~30 hits from ~30 different IPs in the 119.63.196.* range over a 20-minute period. Then it moves on to another page... and this goes on pretty much all day, racking up Tomcat sessions.
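To make the per-page pattern easier to check, here is a rough sketch of how the hits could be tallied from the access log. It assumes the log lines look exactly like the sample above (common log format); the class and method names (CrawlerHitTally, tally, stripSessionId) are made up for illustration, and the ;jsessionid=... part is stripped so that every hit on the same page counts under one key.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CrawlerHitTally {
    // Assumed layout: IP, two fields, [timestamp], "METHOD path PROTOCOL", status, bytes.
    private static final Pattern LINE = Pattern.compile(
        "^(\\S+) \\S+ \\S+ \\[[^\\]]+\\] \"\\S+ (\\S+) [^\"]*\" \\d{3} \\S+$");

    // Drops the ";jsessionid=..." path parameter so all hits on a page share one key.
    static String stripSessionId(String path) {
        return path.replaceFirst(";jsessionid=[^?#]*", "");
    }

    // Counts hits per page for clients whose IP starts with the given prefix.
    // Lines that do not match the assumed layout are skipped.
    static Map<String, Integer> tally(List<String> lines, String ipPrefix) {
        Map<String, Integer> hits = new TreeMap<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line.trim());
            if (m.matches() && m.group(1).startsWith(ipPrefix)) {
                hits.merge(stripSessionId(m.group(2)), 1, Integer::sum);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Two of the log lines quoted above, used as sample input.
        List<String> sample = Arrays.asList(
            "119.63.196.107 - - [24/Aug/2011:07:02:46 +0000] \"GET /;jsessionid=94085F76780ACFD96C8109A29446288D HTTP/1.1\" 200 10311",
            "119.63.196.44 - - [24/Aug/2011:07:03:21 +0000] \"GET /;jsessionid=943133C77BB1756CF11592115BA81725 HTTP/1.1\" 200 10333");
        tally(sample, "119.63.196.")
            .forEach((page, count) -> System.out.println(count + "\t" + page));
    }
}
```

Feeding it the full log instead of the two sample lines should show whether the ~30 hits per page really cluster the way they appear to.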
I do have the inactive session timeout set pretty high (720 minutes), and maybe I need to bring that number down a lot. Or maybe Baidu Japan is doing frequent checks because it thinks the page has changed due to a change in the link (i.e., the jsessionid is always different)?
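One option I'm considering, instead of lowering the timeout for everyone, is to pick a much shorter timeout only for requests that look like crawlers. The following is just a sketch of that decision logic, not something I have running: the class name, the 60-second value, and the User-Agent fragments are all assumptions (my access log doesn't record the User-Agent, so "baiduspider" in particular is a guess).

```java
import java.util.Locale;

public class CrawlerSessionTimeout {
    // The 720-minute timeout I currently use, expressed in seconds.
    static final int NORMAL_TIMEOUT_SECONDS = 720 * 60;
    // Hypothetical short timeout for crawlers; the right value is a guess.
    static final int CRAWLER_TIMEOUT_SECONDS = 60;

    // Assumed User-Agent fragments (lowercase); not confirmed from my logs.
    static final String[] CRAWLER_TOKENS = { "baiduspider", "bot", "spider", "crawl" };

    // Returns true if the User-Agent contains any of the assumed crawler fragments.
    // A missing User-Agent is treated as a normal visitor.
    static boolean looksLikeCrawler(String userAgent) {
        if (userAgent == null) {
            return false;
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        for (String token : CRAWLER_TOKENS) {
            if (ua.contains(token)) {
                return true;
            }
        }
        return false;
    }

    // Chooses the session timeout (in seconds) for a request with this User-Agent.
    static int timeoutSecondsFor(String userAgent) {
        return looksLikeCrawler(userAgent) ? CRAWLER_TIMEOUT_SECONDS : NORMAL_TIMEOUT_SECONDS;
    }

    public static void main(String[] args) {
        System.out.println(timeoutSecondsFor("Mozilla/5.0 (compatible; Baiduspider/2.0)"));
        System.out.println(timeoutSecondsFor("Mozilla/5.0 (Windows NT 6.1) Firefox/6.0"));
    }
}
```

In a servlet filter, the result would presumably be passed to HttpSession.setMaxInactiveInterval(int), which takes seconds, using the value of request.getHeader("User-Agent").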
Thanks for reading. I welcome any/all suggestions!
Eric