During some testing, one of our teams reported timeouts when attempting to access a directory via FTP. The cause was a bug in their code that had created millions of tiny files in that directory.
From my understanding, the reason for the timeout is that the request asks for the directory's contents to be listed, then waits for a single response containing all of the files.
If instead the server immediately started returning results as they were found (think: `yield return` vs `return`), this would stave off the timeout. Similarly, if there were some option to return paged data, that might give us a workaround.
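To illustrate the distinction I have in mind in C# terms (purely illustrative helpers, not real FTP calls):

```csharp
using System.Collections.Generic;

class ListingAnalogy
{
    // "return": gather the entire result, then hand it back in one go.
    // The caller sees nothing until every entry has been collected.
    static List<string> ListAllAtOnce(IEnumerable<string> entries)
    {
        var all = new List<string>();
        foreach (var entry in entries)
            all.Add(entry);
        return all;
    }

    // "yield return": hand back entries one at a time as they are found.
    // The caller can start consuming them (and keep its timeout alive) immediately.
    static IEnumerable<string> ListAsFound(IEnumerable<string> entries)
    {
        foreach (var entry in entries)
            yield return entry;
    }
}
```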
Since FTP is request-response, rather than request-response-response-..., I imagine the `yield return` scenario is not possible; but some form of paging may be. That said, perhaps this would not be a solution either, since paging implies some form of sorting, which itself incurs an overhead that scales with the number of files.
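Concretely, the sort of paged interface I'm picturing would be something like the sketch below (entirely hypothetical; no FTP command offers this), with the sorting cost made explicit:

```csharp
using System.Collections.Generic;
using System.Linq;

class PagedListingSketch
{
    // Hypothetical paged listing: for "page N" to mean the same thing across
    // requests, the server must impose a stable order, i.e. sort (or index)
    // every entry before the first page can be returned.
    static IEnumerable<string> GetPage(IEnumerable<string> allEntries, int pageIndex, int pageSize)
    {
        return allEntries
            .OrderBy(name => name)        // the hidden cost: O(n log n) over all entries
            .Skip(pageIndex * pageSize)
            .Take(pageSize);
    }
}
```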
NB: This is a question out of curiosity; our real issue is resolved, as I simply purged the directory (https://stackoverflow.com/a/6208144/361842). However, my thinking is that if there were an option to drip-feed results back, the number of items in the folder would cease to be a potential issue (so long as we're not sorting / filtering / etc. the results before they're returned). We're using FileZilla Server and a .NET client (`System.Net.FtpWebRequest`); but since this is theoretical, I'm more interested in generic answers than those specific to our implementation.
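For reference, a minimal sketch of a directory listing with `System.Net.FtpWebRequest` (placeholder host and credentials) is below. Reading the response stream line by line is the closest the client side gets to "drip feeding"; whether anything actually arrives before the server has finished building the full listing depends on buffering at both ends.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;

class FtpListingSketch
{
    // Streams entry names as they are read from the FTP data connection,
    // instead of materialising the whole listing before returning.
    static IEnumerable<string> ListEntries(string directoryUri)
    {
        var request = (FtpWebRequest)WebRequest.Create(directoryUri);
        request.Method = WebRequestMethods.Ftp.ListDirectory; // name-only listing (NLST)
        request.Credentials = new NetworkCredential("user", "password"); // placeholders

        using (var response = (FtpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
                yield return line; // hand back each entry as it is read
        }
    }

    static void Main()
    {
        // Placeholder URI; entries are printed as the enumerator is consumed.
        foreach (var name in ListEntries("ftp://example.com/huge-directory/"))
            Console.WriteLine(name);
    }
}
```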