I'm using the code below in a WebHarvest configuration file to set a timeout for the http element (WebHarvest uses Jakarta HttpClient).
But when I set it to 20000, it takes about 40-50 seconds until the timeout is reached.
And when I set it to 30000, the timeout is never reached (at least not in the 2 minutes that I waited).
I only need to limit the time spent waiting for a response.

<var-def name="WTimeOut">20000</var-def>
<script language="javascript"> 
       var tmot=WTimeOut.toString(); 
       http.client.params.soTimeout = tmot; 
       http.client.params.connectionManagerTimeout = tmot;
       http.client.httpConnectionManager.params.connectionTimeout = tmot; 
</script> 

I also tried to do it directly in Java code on the HttpClient, like this:

HttpClient whClient = scraper.getHttpClientManager().getHttpClient();
whClient.getParams().setParameter("http.connection-manager.timeout", (long)20000);
whClient.getParams().setParameter("http.socket.timeout",(int)20000);

But I got the same results.
According to this:

SO_TIMEOUT will kick in only when there is an inactivity on the HTTP connection

So what can I do to set a hard time limit on waiting for a response?

Thanks

Ariyan

1 Answer

http.socket.timeout sets the time to wait between two consecutive packets. So if there is data coming in very slowly, but still fast enough not to trigger the timeout, the connection will not be severed.
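To make that concrete, here is a rough sketch using only plain java.net sockets (not HttpClient or WebHarvest). The class name SlowTrickleDemo and the method readSlowly are made up for illustration. A local server sends one byte every 100 ms; the client uses a 1000 ms socket timeout, and the whole read takes longer than the timeout without any exception being thrown, because no single gap between bytes exceeds it.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class SlowTrickleDemo {

    /**
     * Connects to a local server that sends one byte every gapMs milliseconds
     * and reads until end of stream. Returns the total elapsed time in ms.
     * Throws SocketTimeoutException only if a single gap exceeds soTimeoutMs.
     */
    static long readSlowly(int soTimeoutMs, int gapMs, int byteCount) throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread sender = new Thread(() -> {
                try (Socket s = server.accept();
                     OutputStream out = s.getOutputStream()) {
                    for (int i = 0; i < byteCount; i++) {
                        Thread.sleep(gapMs);   // slow, but never silent for long
                        out.write('x');
                        out.flush();
                    }
                } catch (Exception ignored) {
                    // demo only: the client side will simply see end of stream
                }
            });
            sender.setDaemon(true);
            sender.start();

            long start = System.nanoTime();
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
                client.setSoTimeout(soTimeoutMs);  // limit applies per read, not to the total
                InputStream in = client.getInputStream();
                while (in.read() != -1) {
                    // each successful read resets the inactivity clock
                }
            }
            return (System.nanoTime() - start) / 1_000_000;
        }
    }

    public static void main(String[] args) throws Exception {
        long elapsed = readSlowly(1000, 100, 15);
        System.out.println("socket timeout: 1000 ms, total read time: " + elapsed + " ms");
    }
}
```

This would explain a request running well past the configured value: the timeout only measures silence on the connection, not the total duration.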

You can also set http.connection.timeout to limit the amount of time to wait until a connection is established.

However, there is no built-in way to set a hard time limit on the length of the entire request.
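One common workaround is to enforce the deadline from outside the request, for example by running it on a worker thread and waiting on a Future with a timeout. The following is only a sketch under that assumption, written with java.util.concurrent alone; HardDeadline and callWithDeadline are hypothetical names, and the Callable stands in for whatever code executes the HTTP request.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class HardDeadline {

    /**
     * Runs the task on a worker thread and waits at most limitMs for its result.
     * Throws TimeoutException if the task has not finished by then.
     */
    static <T> T callWithDeadline(Callable<T> task, long limitMs) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);          // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<T> future = pool.submit(task);
        try {
            return future.get(limitMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker. Blocking socket reads generally ignore
            // interrupts, so a real request also needs to be aborted or its
            // connection closed here.
            future.cancel(true);
            throw e;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a slow HTTP call: sleeps for 5 s, deadline is 1 s.
        try {
            callWithDeadline(() -> { Thread.sleep(5000); return "done"; }, 1000);
        } catch (TimeoutException e) {
            System.out.println("gave up after the deadline");
        }
    }
}
```

The caller stops waiting at the deadline, but the underlying connection is not closed by this alone. With HttpClient, the timeout branch would be the place to abort the in-flight method (I believe the 3.x HttpMethod interface has an abort() for this, but check the version WebHarvest ships with).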

Lauri Piispanen