
I'm using getaddrinfo to do DNS queries from C++ on Windows. I used to use the Windows API DnsQuery and that worked fine, but when adding IPv6 support to my software I switched to getaddrinfo. Since then, I've seen the following:

My problem is that getaddrinfo sometimes takes a very long time to complete. A typical call returns within a few milliseconds, but roughly 1 time out of 10000 it takes much longer: in some cases around 15 seconds, and there have been several cases where it took several minutes.

I've run Wireshark on the server and analyzed my application's debug logs, and I see the following:

  • I call the function getaddrinfo.
  • 15 seconds later, my machine queries the DNS server.
  • Some milliseconds later, I get the response from the DNS server.

The weird thing here is that the actual DNS query only takes a tenth of a second, but the getaddrinfo call as a whole takes much longer to return.
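To compare the wall-clock time of the call with what Wireshark shows on the wire, the call can be wrapped in a timer. This is only a minimal sketch: `TimedLookup` and `timed_getaddrinfo` are hypothetical names introduced here, not part of my actual code, and on Windows it assumes `WSAStartup` has already been called and that the program links against `ws2_32`.

```cpp
#include <chrono>

#ifdef _WIN32
#include <winsock2.h>
#include <ws2tcpip.h>
#else
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#endif

// Hypothetical helper type: the getaddrinfo return code plus the elapsed time.
struct TimedLookup {
    int error;                          // 0 on success, otherwise an EAI_* / WSA* code
    std::chrono::milliseconds elapsed;  // wall-clock time spent inside getaddrinfo
};

// Hypothetical helper: resolves `host` and measures how long the call blocks.
// The service argument is deliberately left null so only the host is resolved.
TimedLookup timed_getaddrinfo(const char* host, const addrinfo* hints) {
    addrinfo* result = nullptr;

    // steady_clock is monotonic, so clock adjustments cannot skew the measurement.
    const auto start = std::chrono::steady_clock::now();
    const int rc = getaddrinfo(host, nullptr, hints, &result);
    const auto stop = std::chrono::steady_clock::now();

    if (rc == 0) {
        freeaddrinfo(result);  // only the timing is of interest here
    }
    return {rc, std::chrono::duration_cast<std::chrono::milliseconds>(stop - start)};
}
```

Logging `elapsed` next to the host name for every lookup would make it possible to line up the slow calls with the packet capture.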

The problem has been reported by many users, so it's not something specific to my machine.

So what does getaddrinfo do besides contacting the DNS server?

Edit:

  • The problem has occurred with several addresses. If I try to reproduce the problem using these addresses, the problem does not occur.
  • I have done something stupid. Upon every DNS query, the etc/services file is parsed. However, that doesn't explain a delay of several minutes. (thanks D.Shawley)
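One way to keep the services file out of the picture might be to pass no service name at all and fill in the port afterwards. This is a sketch under the assumption that the services parse is triggered by handing a service name to getaddrinfo; I have not confirmed that. `resolve_with_port` is a hypothetical name, and on Windows `WSAStartup` must have been called first.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

#ifdef _WIN32
#include <winsock2.h>
#include <ws2tcpip.h>
#else
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#endif

// Hypothetical helper: resolves `host` without a service name, then writes the
// numeric port into each returned address. Returns an empty vector on failure.
std::vector<sockaddr_storage> resolve_with_port(const char* host, std::uint16_t port) {
    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;      // both IPv4 and IPv6
    hints.ai_socktype = SOCK_STREAM;  // one entry per address instead of one per socket type

    std::vector<sockaddr_storage> endpoints;
    addrinfo* result = nullptr;
    // Null service: nothing has to be looked up by name (assumption, see above).
    if (getaddrinfo(host, nullptr, &hints, &result) != 0) {
        return endpoints;
    }

    for (const addrinfo* ai = result; ai != nullptr; ai = ai->ai_next) {
        sockaddr_storage ss{};
        std::memcpy(&ss, ai->ai_addr, ai->ai_addrlen);
        if (ss.ss_family == AF_INET) {
            reinterpret_cast<sockaddr_in*>(&ss)->sin_port = htons(port);
        } else if (ss.ss_family == AF_INET6) {
            reinterpret_cast<sockaddr_in6*>(&ss)->sin6_port = htons(port);
        }
        endpoints.push_back(ss);
    }
    freeaddrinfo(result);
    return endpoints;
}
```

For example, `resolve_with_port("mail.example.com", 25)` would return the addresses ready for `connect`, with port 25 already set.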

Edit 2

  • One type of DNS query made by my software is the anti-spam DNSBL query. The log from one user showed me that the lookup for ip.address1.example.com always seemed to take exactly 2039 seconds, while the lookup for another.ip.address.example.com always took exactly 1324 seconds. The day after that, the lookups for those addresses were just fine. At first I thought that the DNSBL authors had put some kind of timeout on their side. But if that were the core problem, shouldn't getaddrinfo have timed out earlier?
Georg Fritzsche
Nitramk
  • Is it only queries for certain, specific addresses that are slow? – SimonJ Nov 22 '09 at 13:24
  • Try running something like FileMon and make sure that it is not doing something stupid like reading and parsing `c:\windows\system32\drivers\etc\services` and `c:\windows\system32\drivers\etc\hosts` every time that you call `getaddrinfo()`. – D.Shawley Nov 22 '09 at 13:42
  • It almost certainly would parse the hosts file at least on every call, but that shouldn't take more than a millisecond or two. – Michael Kohne Nov 22 '09 at 14:14
  • @D.Shawley. You're right. I (or getaddrinfo) do in fact parse etc\services every time I do a DNS query, which is of course very stupid. I don't think that alone explains the slowness though, unless Windows tries to parse a services file located on another server in the network or in Active Directory (I'm not sure if there's even such a possibility in Windows). :-\ – Nitramk Nov 22 '09 at 14:15
  • Is there anything else on the systems in question, like firewalls or AV software that one could think might be involved? I'm thinking that if all the 'problem' systems have the same firewall, then perhaps the firewall is doing something odd? – Michael Kohne Nov 22 '09 at 14:16
  • @Michael Kohne, I've tried two different firewall products since I started to get this problem. I've been in contact with users seeing the problem who tell me that they don't run any firewall at all. – Nitramk Nov 22 '09 at 14:25
  • @Nitramk, what version of Windows are you using: Vista, Windows 7, Windows Server 2008? – Craig Trader Nov 25 '09 at 15:32
  • @W. Craig Trader I'm seeing this in Windows Server 2003. – Nitramk Nov 25 '09 at 15:45
  • Are you querying for AAAA records or A records? – Craig Trader Nov 25 '09 at 16:42
  • @W. Craig Trader: Good point. I'm querying for both. When I used DnsQuery, I only queried for A records (apart from TXT, MX etc). In the new version, I query for both A and AAAA records. The reason I switched from DnsQuery to getaddrinfo in the first place was that getaddrinfo is more transparent when it comes to querying both IPv4 and IPv6. My own server where I see this only has an IPv6 loopback address, not a real address. – Nitramk Nov 26 '09 at 07:54
  • @Nitramk: Did you ever find a definitive solution to this? And I just noticed your profile picture. Where do you play go? I play on dragongoserver and kgs... – ErikE Feb 22 '10 at 18:04

1 Answer

Windows has a local daemon that does DNS caching. Your call to getaddrinfo() is getting routed to that daemon, which presumably is checking its cache before submitting the query to your DNS server.

See Windows Knowledge Base article 318803 for more information on disabling the cache.

[Edited]

It sounds to me as though your Windows Server 2003 instance is not configured correctly for IPv6. Once the IPv6 lookups time out, it will fall back to IPv4. Knowledge Base articles that might help include:

Unfortunately, I don't have access to any Windows Servers, so I can't test/replicate this myself.
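To check whether the IPv6 side is what stalls, one experiment would be to ask getaddrinfo for IPv4 results only and see whether the slow calls disappear. This is an untested sketch, assuming that restricting `ai_family` to `AF_INET` makes the resolver skip the AAAA lookup; `resolve_ipv4_only` is a hypothetical name, and on Windows `WSAStartup` must have been called first.

```cpp
#include <string>
#include <vector>

#ifdef _WIN32
#include <winsock2.h>
#include <ws2tcpip.h>
#else
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#endif

// Hypothetical helper: returns the IPv4 addresses of `host` as dotted-quad
// strings, or an empty vector if the lookup fails.
std::vector<std::string> resolve_ipv4_only(const char* host) {
    addrinfo hints{};
    hints.ai_family = AF_INET;        // IPv4 results only (assumed to avoid AAAA queries)
    hints.ai_socktype = SOCK_STREAM;  // one entry per address

    std::vector<std::string> addresses;
    addrinfo* result = nullptr;
    if (getaddrinfo(host, nullptr, &hints, &result) != 0) {
        return addresses;
    }

    for (const addrinfo* ai = result; ai != nullptr; ai = ai->ai_next) {
        char text[NI_MAXHOST];
        // NI_NUMERICHOST formats the address as text without a reverse DNS lookup.
        if (getnameinfo(ai->ai_addr, static_cast<socklen_t>(ai->ai_addrlen),
                        text, sizeof(text), nullptr, 0, NI_NUMERICHOST) == 0) {
            addresses.emplace_back(text);
        }
    }
    freeaddrinfo(result);
    return addresses;
}
```

If lookups through this path never show the 15-second delay while `AF_UNSPEC` lookups still do, that would point at the IPv6 configuration rather than the DNS cache.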

Craig Trader
  • Well, that kind of answers my question. But the same cache was used by DnsQuery and I never saw the problem when using that function. My software is deployed in ~10 000 locations and it wasn't until I switched to getaddrinfo that a lot of users started to report this issue. Also, it would seem absurd that a lookup in the local DNS cache would take 15 seconds. – Nitramk Nov 22 '09 at 12:50
  • One can check whether it is the cache or not by issuing the same command multiple times. I suspect it is not looking at the cache and that is part of the problem. The other problem is that it also looks for ipv6 addresses and those lookups are slow on certain setups for some reason. – highBandWidth Jul 07 '11 at 16:27