I'm trying to upload files from multiple threads, but I get errors. Is it simply impossible to use ftplib from multiple threads?

Here is my code:

    import os
    import threading
    from ftplib import FTP

    class myThread(threading.Thread):
        def __init__(self, threadID, src, counter, image_name):
            self.threadID = threadID
            self.src = src
            self.counter = counter
            self.image_name = image_name
            threading.Thread.__init__(self)

        def run(self):
            uploadFile(self.src, self.image_name)

    def uploadFile(src, image_name):
        f = open(src, "rb")
        ftp.storbinary('STOR ' + image_name, f)
        f.close()

    # A single connection, shared by every thread
    ftp = FTP('host')   # connect to host, default port
    ftp.login()         # user anonymous, passwd anonymous@
    dirname = "/home/folder/"
    i = 1
    threads = []

    for image in os.listdir(dirname):
        if os.path.isfile(dirname + image):
            thread = myThread(i, dirname + image, i, image)
            thread.start()
            threads.append(thread)
            i += 1

    for t in threads:
        t.join()

I get a bunch of ftplib errors like:

    raise error_reply, resp
    error_reply: 200 Type set to I

If I upload the files one by one, everything works fine.

Arty
  • How would this work even if ftplib did have multithreaded support? Each of your threads attempts to upload all of the same file – matt b Mar 31 '10 at 01:25
  • Why is it the same? It works correctly if I just call the function in the same 'for' loop without threading. It passes every file in the folder – Arty Mar 31 '10 at 01:28
  • Oops, I misread the file-opening code. Either way, I think it's pretty safe to assume that the library does not provide a thread-safe or concurrent FTP session. – matt b Mar 31 '10 at 01:30
  • Anyway, ftplib doesn't have multithreaded support, does it? – Arty Mar 31 '10 at 01:30
  • Too bad. It would take ages to upload a large number of files, even with a good connection – Arty Mar 31 '10 at 01:33

2 Answers

Have you tried putting the connection code inside the thread?

In other words, have each thread open its own connection with FTP('host') and login(). An FTP object wraps a single control connection, and the server handles the commands on it one at a time, so it cannot deal with a second upload or "STOR" command while another is in progress. That would likely explain the "200 Type set to I" error: one thread reads the reply to another thread's command. If the server allows multiple connections from the same IP address, each thread gets a separate session on which to issue its "STOR" command.

Here's an example:

    import os
    import threading
    from ftplib import FTP

    class myThread(threading.Thread):
        def __init__(self, threadID, src, counter, image_name):
            threading.Thread.__init__(self)
            ###############
            # Add ftp connection here!
            self.ftp = FTP('host')   # connect to host, default port
            self.ftp.login()         # user anonymous, passwd anonymous@
            ################
            self.threadID = threadID
            self.src = src
            self.counter = counter
            self.image_name = image_name

        def run(self):
            # Pass this thread's own connection to the upload helper
            uploadFile(self.ftp, self.src, self.image_name)

    def uploadFile(ftp, src, image_name):
        f = open(src, "rb")
        ftp.storbinary('STOR ' + image_name, f)
        f.close()

    dirname = "/home/folder/"
    i = 1
    threads = []

    for image in os.listdir(dirname):
        if os.path.isfile(dirname + image):
            thread = myThread(i, dirname + image, i, image)
            thread.start()
            threads.append(thread)
            i += 1

    for t in threads:
        t.join()

See if that behaves better.

P.S. Not sure if all my tabs are aligned.

Jay Atkinson
  • Thanks, this works, although it is slower than uploading without multithreading. So I should probably either find another library or put up with single-threaded uploading – Arty Mar 31 '10 at 02:32
  • Curious about why this is slower. Any insights? – Matt Nov 29 '18 at 13:37
  • @Matt it is slower because of the multiple connections to the FTP server – glennmark Sep 22 '21 at 14:40
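If the slowdown does come from opening one connection per file, as the last comment suggests, one possible middle ground is a small fixed pool of worker threads in which each worker opens a single connection and reuses it for every file it handles. The following is only a sketch under that assumption: upload_all, connect_anonymous, and the connect and max_workers parameters are hypothetical names introduced here, not part of ftplib, and 'host' is the placeholder from the question.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor
from ftplib import FTP


def connect_anonymous():
    """Hypothetical factory: open and log in to one FTP connection."""
    ftp = FTP('host')   # placeholder host, as in the question
    ftp.login()         # anonymous login
    return ftp


def upload_all(paths, connect=connect_anonymous, max_workers=4):
    """Upload every path using at most max_workers FTP connections."""
    local = threading.local()   # holds one connection per worker thread
    opened = []                 # every connection created, for cleanup
    opened_lock = threading.Lock()

    def upload_one(path):
        ftp = getattr(local, "ftp", None)
        if ftp is None:
            # First job on this worker thread: open its own connection.
            ftp = local.ftp = connect()
            with opened_lock:
                opened.append(ftp)
        with open(path, "rb") as f:
            ftp.storbinary("STOR " + os.path.basename(path), f)
        return path

    try:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # list() waits for all uploads and re-raises the first error.
            return list(pool.map(upload_one, paths))
    finally:
        # All workers have finished by now, so closing here is safe.
        for ftp in opened:
            ftp.quit()
```

No connection is ever shared between threads, so replies cannot be mixed up, and the number of logins is bounded by max_workers rather than by the number of files. Whether this is actually faster than a single connection depends on the server and the network.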

I ended up using a Semaphore to limit use of the FTP connection to one thread at a time. I found it faster to share one connection than to create a connection for each thread. In your case it would look like this:

    import os
    import threading
    from ftplib import FTP
    from threading import Semaphore

    ftp_semaphore = Semaphore(1)  # only one thread may use the connection at a time

    class myThread(threading.Thread):
        def __init__(self, threadID, src, counter, image_name):
            threading.Thread.__init__(self)
            self.threadID = threadID
            self.src = src
            self.counter = counter
            self.image_name = image_name

        def run(self):
            uploadFile(self.src, self.image_name)

    def uploadFile(src, image_name):
        # The file is closed automatically, even if the upload fails
        with open(src, "rb") as f:
            with ftp_semaphore:
                ftp.storbinary('STOR ' + image_name, f)

    ftp = FTP('host')   # connect to host, default port
    ftp.login()         # user anonymous, passwd anonymous@
    dirname = "/home/folder/"

    i = 1
    threads = []

    for image in os.listdir(dirname):
        if os.path.isfile(dirname + image):
            thread = myThread(i, dirname + image, i, image)
            thread.start()
            threads.append(thread)
            i += 1

    for t in threads:
        t.join()

Victor Di
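A Semaphore(1) admits one holder at a time, so a plain threading.Lock can express the same idea. The sketch below bundles a shared connection with its lock in one small class, so that callers cannot forget to take the lock. SharedFTP, its upload method, and upload_files are hypothetical names used for illustration; they are not part of ftplib.

```python
import os
import threading


class SharedFTP:
    """Hypothetical wrapper: one FTP connection plus the lock guarding it."""

    def __init__(self, ftp):
        self._ftp = ftp                 # an already logged-in ftplib.FTP object
        self._lock = threading.Lock()

    def upload(self, path):
        # Open the file before taking the lock, so a missing file
        # fails without holding up the other threads.
        with open(path, "rb") as f:
            with self._lock:
                # Only one thread at a time talks on the control connection.
                self._ftp.storbinary("STOR " + os.path.basename(path), f)


def upload_files(shared, paths):
    """Start one thread per path and wait for all of them to finish."""
    threads = [threading.Thread(target=shared.upload, args=(p,)) for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

With the logged-in ftp object from the answer, this could be called as upload_files(SharedFTP(ftp), paths). The transfers themselves still run one after another, so the threads only help with work done outside the lock.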