4

I am trying to use ftplib.FTP() with the timeout option set for a particular hostname, but I am seeing weird behaviour. To test it, I have written a very simple piece of code.

import ftplib
ftp = ftplib.FTP("google.com", timeout=2)

The API documentation says the timeout value is given in seconds, but the call takes much longer than that; for me it takes more than 8 seconds. Can anybody please explain this behaviour? I am using Python 2.7.

Tejendra
  • The timeout applies to the connection of the socket. Before it even starts this, python's socket.py (used by ftplib) does a DNS lookup for the address. Perhaps this is taking a long time? Out of curiosity, how long does it take you to run ```import socket; socket.getaddrinfo("google.com", 21)```. – rod Feb 10 '15 at 14:38
  • @rod The reply to this expression comes back immediately on my system. – Tejendra Feb 11 '15 at 06:33
  • The only thing that I can see happening after ```socket.getaddrinfo()``` is a call to the underlying C code (which is somewhat platform specific). This makes use of the libc ```select()``` function, and the timeout is simply passed through to that function. Either there's something else I've missed that might be introducing a delay, or the select function timeout isn't working properly on your OS's glibc/winsock. What OS are you using? (You can look at the ```internal_connect``` function here, by the way: https://hg.python.org/releasing/2.7.9/file/753a8f457ddc/Modules/socketmodule.c) – rod Feb 11 '15 at 10:32
  • @rod I am using debian-linux 7.8.0 – Tejendra Feb 11 '15 at 12:17
  • Well... I'm stumped. You could try debugging the underlying Python C code if you're really keen on finding the issue. Or you could try making a bug report. Or perhaps someone else has an idea? – rod Feb 11 '15 at 12:56
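
To follow up on the suggestion in the comments to time the DNS lookup separately, here is a minimal sketch. The helper name `time_lookup` is hypothetical (it is not part of `socket` or `ftplib`), and port 21 is assumed because that is the default FTP port.

```python
import socket
import time


def time_lookup(host, port=21):
    """Return (elapsed_seconds, number_of_results) for one getaddrinfo() call.

    Hypothetical helper for illustration only.
    """
    start = time.time()
    infos = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
    elapsed = time.time() - start
    return elapsed, len(infos)
```

Calling `time_lookup("google.com")` shows both how long name resolution takes and how many addresses come back, which helps separate the DNS phase from the connection phase.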

2 Answers

4

ftplib.FTP invokes socket.create_connection(). According to the docs (https://docs.python.org/2/library/socket.html#socket.create_connection):

if host is a non-numeric hostname, it will try to resolve it for both AF_INET and AF_INET6, and then try to connect to all possible addresses in turn until a connection succeeds.

A quick lookup of google.com will show about a dozen addresses (or more), depending on your region. Your 2-second timeout is applied to each of those addresses in turn, so the total time can be several times longer than 2 seconds.
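
To see how many addresses are involved, a rough sketch like the following may help. The helper names `resolved_addresses` and `worst_case_seconds` are hypothetical, and the estimate only covers the connection attempts; it assumes every attempt runs into the full timeout, which is the pessimistic case.

```python
import socket


def resolved_addresses(host, port=21):
    """Return the distinct socket addresses getaddrinfo() yields for host."""
    infos = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
    seen = []
    for family, socktype, proto, canonname, sockaddr in infos:
        if sockaddr not in seen:
            seen.append(sockaddr)
    return seen


def worst_case_seconds(addresses, timeout):
    """Upper bound on connect time if every address times out in turn."""
    return len(addresses) * timeout
```

For example, if `resolved_addresses("google.com")` returned four addresses, `worst_case_seconds(..., 2)` would give 8, which is in line with the delay reported in the question.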

If you want to limit total time to 2 seconds, do the lookup first and pass the numeric address to your ftplib.FTP call:

import socket, ftplib
host = socket.gethostbyname('google.com')
ftp = ftplib.FTP(host, timeout=2)
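
A variant of the same idea using `socket.getaddrinfo()`, which also handles IPv6 names, might look like this. The helper name `first_address` is hypothetical, and simply taking the first result is an assumption made for brevity; production code may want to choose among the results more carefully.

```python
import socket


def first_address(host, port=21):
    """Resolve host once and return one numeric address string (IPv4 or IPv6)."""
    infos = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
    # Each entry is (family, socktype, proto, canonname, sockaddr);
    # sockaddr[0] is the numeric address for both AF_INET and AF_INET6.
    return infos[0][4][0]
```

The result can then be passed in place of the hostname, e.g. `ftplib.FTP(first_address('google.com'), timeout=2)`. Note that the timeout still applies separately to the connection and to reading the server's welcome reply.
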
Ozgur Vatansever
user590028
  • Don't use `gethostbyname()` (it does not support IPv6 name resolution); call `getaddrinfo()` instead. – jfs Feb 14 '15 at 11:11
  • I originally submitted my answer with getaddrinfo(), but the extra brackets required detracted from the cleanliness of the example. For production use, however, getaddrinfo is definitely the way to go. – user590028 Feb 14 '15 at 11:18
0

From ftplib.FTP docs:

The optional timeout parameter specifies a timeout in seconds for blocking operations like the connection attempt

i.e., the timeout may limit individual socket operations but it says nothing about the duration of the FTP() call itself.

As @user590028 pointed out, FTP() (indirectly) calls socket.create_connection(), which may perform several blocking operations in sequence. The call can succeed if each operation takes less than timeout seconds, even if all the operations combined take longer.
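
One way to bound the combined time of the connection attempts is to give each attempt only the time that remains before a shared deadline. The following is a minimal sketch, not how `ftplib` or `socket.create_connection()` behave; `connect_with_deadline` is a hypothetical helper, and it does not cover the DNS lookup itself or the later FTP welcome reply.

```python
import socket
import time


def connect_with_deadline(host, port, total_timeout):
    """Try each resolved address without exceeding total_timeout overall.

    Hypothetical helper: the DNS lookup is NOT covered by the deadline.
    """
    deadline = time.time() + total_timeout
    last_error = None
    infos = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
    for family, socktype, proto, canonname, sockaddr in infos:
        remaining = deadline - time.time()
        if remaining <= 0:
            break  # overall budget used up; stop trying further addresses
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(remaining)  # this attempt gets only what is left
            sock.connect(sockaddr)
            return sock
        except socket.error as exc:
            last_error = exc
            if sock is not None:
                sock.close()
    raise last_error or socket.timeout("timed out")
```

With this approach, a host that resolves to a dozen addresses still fails after roughly `total_timeout` seconds of connection attempts instead of a dozen times that.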

If you want to enforce the total timeout, see Timeout on a Python function call.

Community
jfs