14

I was given 10 new PCs, all (supposedly) with Windows 7 Pro freshly installed and nothing else done to them.

I have a program, coded in Delphi XE2, using Indy 10 components for the networking. I set the "connect timeout" and "read timeout" properties of my TIdTCPClient to 500 ms, set "reuse socket" to "o/s dependent" (I also tried a build with it set to No), and leave "use Nagle" (whatever that is) set to True (I also tried with False).

Here's the problem: when I run the same .EXE on these PCs and test the case where I pull the network cable, on some of them my debug trace shows the connect attempt and the connect timeout happening in the same second or the next second (the trace has a granularity of 1 second), but on others it is 20 or 21 seconds before I see the connection timeout.

It would seem that some of the PCs are not totally "fresh install" as claimed, although I see no apps installed. Maybe someone installed something then removed it, or maybe they tried to tweak performance.

Before I reinstall Windows on 10 PCs, can anyone suggest where to look? Does 20 (or 21) seconds ring a bell with regard to TCP Client connect timeout?

[update] I am attempting to connect directly to a specific IP address, so I am not sure if @Nikolai's suggestion to check DNS is relevant. Sorry for not mentioning this originally.

[upperdate] the program does not attempt to keep the socket open. It connects, sends some data & disconnects - repeatedly, for each new piece of data.

Mawg says reinstate Monica
  • 2
    Random guess - check how DNS works on those different sets of PCs. – Nikolai Fetissov Aug 22 '12 at 02:47
  • 3
    Run wireshark on the box and see what is happening – Adrian Cornish Aug 22 '12 at 02:54
  • +1 to both. Now to Google & find out how to do those things. Thanks for the tips – Mawg says reinstate Monica Aug 22 '12 at 03:07
  • 2
    Save ya a little time - http://www.wireshark.org/ – Adrian Cornish Aug 22 '12 at 03:09
  • 2
    Why do you pull the network cable? You could as well try to connect to an invalid IP address, the result should be the same! Pulling a cable does not automatically trigger a "reconnect", only the app code can do this. – mjn Aug 22 '12 at 04:41
  • +1 @mjn Just trying to simulate a real world scenario. There are 2 servers & the client sends the same data to each. I want to test if one is offline – Mawg says reinstate Monica Aug 22 '12 at 05:18
  • 2
    @Mawg IIRC "ReadTimeout" means how many ms between each byte read is the maximum time allowed, 500 is quite a lot, I'd go for something lower. –  Aug 22 '12 at 08:58
  • @Dorin Thanks (+1). I have no idea what values to set. There will be 10 clients and 2 servers attached to a hub, no other PCs, using static IP addresses, with no DNS. Max traffic is a string saying "HEART_BEAT" from each client to each server every 10 seconds. Any recommendations on connect & read timeout values? – Mawg says reinstate Monica Aug 23 '12 at 01:05
  • 1
    @Mawg I'd go for 50ms, that's ~500ms to get "HEART_BEAT", if it takes longer than that, you clearly have a problem. –  Aug 23 '12 at 02:03
  • 2
    A `ConnectTimeout` of 500 ms is pretty small. I usually use 5-10 seconds instead. The `ConnectTimeout` is only applied once the socket is actually connecting to the server, so any preceding DNS lookups are not subject to the `ConnectTimeout` at all, only the OS's own timeouts, which may take a long time if DNS is not working correctly. – Remy Lebeau Aug 23 '12 at 03:02
  • +1 to both. @Remy, really - 5 seconds on such a simple network? 10 clients, 2 servers, one hub. I am connecting before every data packet and disconnecting afterwards. 5 seconds is half of my heartbeat timer (which I could extend, of course). I know nothing of such values. If you say so, I will use 5 seconds; it just sounds a lot. What about the read timeout? Thanks, both. – Mawg says reinstate Monica Aug 23 '12 at 23:30
  • 2
    If it only takes 500ms to connect, setting the `ConnectTimeout` to 5s would just be a safety cushion to account for network lag. Obviously `Connect()` won't wait the full 5s if it can exit sooner. For `ReadTimeout`, you'll have to tailor that to your network's speed and your app's needs. The `ReadTimeout` is set to infinite by default, which is how Indy is designed to be used and suits most needs. Setting a `ReadTimeout` is useful when you need blocking operations to abort after a while, or to account for system lag during error handling. I usually set `ReadTimeout` to 15s-30s if at all. – Remy Lebeau Aug 24 '12 at 01:09
  • +1 Then that is what I will do. Thanks a 1,000,000. Any idea what is causing a 20 second connect timeout when 500ms is set? I plan to clone a working PC (500ms) to the non-working ones, which ought to work, but I might just offer a bounty on this, out of interest – Mawg says reinstate Monica Aug 24 '12 at 01:24

2 Answers

8

Sadly, this is working as intended. The connect did already time out: Indy determined within the 500 milliseconds you asked for that the connect would fail. However, that does not guarantee that the function returns at that point.

After the connect times out, Indy spins down the connection to release all of its resources. It does this synchronously. This means that you wind up waiting for the underlying TCP operation to fail. This typically takes 20 seconds.

The solution is to call connect in a thread. Believe it or not, this is what Indy already does to implement the timeout. However, when it times out waiting for the thread, it tries to shut down the connection in the main thread. You need to defer that to a worker thread.

As for why it happens immediately on some systems and in 20 seconds on others, it depends on the precise networking configuration. For example, if IPv6 is enabled, the stack may attempt to use an IPv6-to-IPv4 connection, and that may not report down even if the physical interface is down. Immediate detection of connection impossibility is never guaranteed and you shouldn't rely on it.
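The worker-thread idea can be sketched in a language-neutral way. The following is a minimal illustration in Python rather than Delphi/Indy, so it does not reproduce Indy's behaviour; the names `send_once` and `send_in_background` are hypothetical, and the 5-second default is only an assumption borrowed from the comments above. The point it shows is that the caller returns immediately, while any slow connect failure or socket teardown is paid for by the worker.

```python
import queue
import socket
import threading


def send_once(host, port, payload, connect_timeout, results):
    """Connect, send one message, disconnect; report the outcome on a queue."""
    try:
        # create_connection applies the timeout to the connect attempt.
        with socket.create_connection((host, port), timeout=connect_timeout) as sock:
            sock.sendall(payload)
        results.put((host, port, True, None))
    except OSError as exc:
        # Covers timeouts, refused connections and unreachable networks.
        results.put((host, port, False, exc))


def send_in_background(host, port, payload, connect_timeout=5.0):
    """Start the connect/send/close cycle on a worker thread (hypothetical helper).

    Returns the thread and a queue on which exactly one result tuple
    (host, port, ok, error) will eventually appear.
    """
    results = queue.Queue()
    worker = threading.Thread(
        target=send_once,
        args=(host, port, payload, connect_timeout, results),
        daemon=True,  # do not keep the process alive for a stuck connect
    )
    worker.start()
    return worker, results
```

The caller would poll or wait on the returned queue (in a GUI application, the worker would typically notify the main thread instead), so a slow failure on one server does not delay the send to the other.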

David Schwartz
  • Aaargh! Sigh. Looks like it's time for me to learn how to thread. Presumably each thread creates a TCP client, sends the message (which it received as a parameter), gets a reply or catches an exception, and then sends a Windows message to the main form with the result and destroys the TCP client? Hmm, can I get a unique thread ID for tracing porpoises? – Mawg says reinstate Monica Sep 05 '12 at 02:39
  • Thanks, @David I'm diving into the world of threading now ;-) – Mawg says reinstate Monica Sep 05 '12 at 12:01
  • The problem with putting `close` in a different thread is the lifetime management of the binding. If worker thread is closing the socket, you can't free the binding until it's done. – Roddy Oct 14 '14 at 09:53
  • Indy does use a worker thread to make a connection, and if the thread does not terminate before the timeout then Indy closes the socket and waits for the thread to terminate. Are you suggesting that Indy use *another* thread to close the socket? I suppose Indy could create a thread but not wait on it, just let it run in the background and terminate itself when finished. That would require `TIdSocketHandle` relinquishing ownership of the socket handle and giving it to the thread to free when ready... – Remy Lebeau Feb 20 '16 at 02:46
  • ... But there is still the issue of Indy having to wait for the connect thread to terminate, and if it's still waiting for the close thread to actually close the socket, this doesn't solve anything. – Remy Lebeau Feb 20 '16 at 02:47
1

I've had the same problems with Indy in the past (while using D6, year 1998-2000). I changed the component to IP*Works. At that time it was an external component, but as far as I know it is included in XE2. IP*Works is a bit hard to understand at the beginning, but its approach to the communication structure is quite different.

I think it would be worth giving it a try.

Ali Avcı