2

I have been stuck on a problem for the last few days. I have a file located on a remote FTP server that can be accessed with a user ID and password. Accessing it is no problem.

The problem is that I have around 150 such files, each of variable size, between 2 MB and 3 MB.

I have to read them one by one and extract the last row/line from each. My current code does this.

The main problem is that it takes too much time, because each file is read from top to bottom.

    // Requires: using System; using System.IO; using System.Net;
    //           using System.Web.UI.HtmlControls;
    public bool TEst(string ControlId, string FileName, long offset)
    {
        // FileName identifies the remote file on the FTP server.
        // offset specifies where in the remote file to start reading data.
        Uri serverUri = new Uri("ftp://xxx.xxx.xx.xxx/" + FileName);

        if (serverUri.Scheme != Uri.UriSchemeFtp)
        {
            return false;
        }

        // Get the object used to communicate with the server.
        FtpWebRequest request = (FtpWebRequest)WebRequest.Create(serverUri);
        request.Credentials = new NetworkCredential("test", "test");
        request.Method = WebRequestMethods.Ftp.DownloadFile;

        // Ask the server to start the transfer at the given byte offset.
        request.ContentOffset = offset;

        string newFileData;
        try
        {
            // "using" closes the response and the reader even if reading fails.
            using (FtpWebResponse response = (FtpWebResponse)request.GetResponse())
            using (StreamReader reader = new StreamReader(response.GetResponseStream()))
            {
                newFileData = reader.ReadToEnd();
            }
        }
        catch (WebException e)
        {
            Console.WriteLine(e.Status);
            Console.WriteLine(e.Message);
            return false;
        }

        // The data is tab-separated. ID, name and status are taken from
        // fixed positions counted from the end of the downloaded data.
        string[] parser = newFileData.Split('\t');
        if (parser.Length < 5)
        {
            return false;
        }

        string strID = parser[parser.Length - 5];
        string strName = parser[parser.Length - 3];
        string strStatus = parser[parser.Length - 1];

        // Show the name without its extension, or "S" for suspect rows.
        HtmlTableCell control = (HtmlTableCell)this.FindControl(ControlId);
        if (strStatus.Trim().ToLower() != "suspect")
        {
            control.InnerHtml = strName.Split('.')[0];
        }
        else
        {
            control.InnerHtml = "S";
        }

        return true;
    }

Threading:

    protected void Page_Load(object sender, EventArgs e)
    {
        new Task(() => this.TEst("controlid1", "file1.tsv", 261454)).Start();
        new Task(() => this.TEst1("controlid2", "file2.tsv", 261454)).Start();
    }
peterh
James
  • Just an idea: some FTP servers have a resume-download function. Did you try exploiting it to fake a "seek" and then get the last 'x' KB of data? http://msdn.microsoft.com/en-us/library/system.net.ftpwebrequest.contentoffset.aspx – Jeremy D Sep 21 '13 at 18:19
  • Your solution is good, +1 for it, but the file on the remote server grows every time it is updated, so I cannot use a fixed offset. – James Sep 23 '13 at 11:53
  • I'm pretty sure that you can list the folder contents with the size of each file. – Jeremy D Sep 23 '13 at 13:31
  • Yes, I can, but for that I have to write two methods: one to get the file size and another to set the offset based on that size. – James Sep 24 '13 at 05:59
  • I used your method with threading, but it is not able to write the content to the control; it throws a file-not-found exception. See my edit for the threading code. – James Sep 24 '13 at 08:44
  • See [this post](http://stackoverflow.com/questions/3507770/write-to-a-file-from-multiple-threads-asynchronously-c-sharp) for help – TombMedia Sep 24 '13 at 12:43

4 Answers

2

FTP has no command for seeking to the end of a file and reading only its last few lines. Reference: FTP Commands. You'll have to coordinate with the developers and owners of the remote FTP server and ask them to make an additional file containing the data you need.

Example: ask the owners of the remote FTP server to create, for each file, a [filename]_lastrow file that contains that file's last row. Your program would then operate on the [filename]_lastrow files. You'll probably be pleasantly surprised with an accommodating answer of "OK, we can do that for you."

If the FTP server can't be changed, ask for a database connection.

danny117
  • The file is updated every 45 seconds, and I have already asked for this. Still no luck. – James Sep 21 '13 at 15:10
  • FTP is older than the interweb. It just doesn't support seek. 3.4 Gbit of constant data would choke my home office connection. – danny117 Sep 22 '13 at 22:19
1

You can also download all your files in parallel and push each one onto a queue for parsing as soon as it is done, rather than running the whole process synchronously. If the FTP server can handle more connections, use as many as is reasonable for the scenario. Parsing can be done in parallel too.

More reading: System.Threading.Tasks
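
As a rough illustration of the parallel approach, here is a minimal sketch that runs one download per file while capping the number of simultaneous connections. It assumes .NET 4.5 or later (for `async`/`await`); the names `ParallelFetch`, `RunAsync`, `fetch` and `maxParallel` are made up for this example, and `fetch` stands in for whatever method actually downloads and parses one file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original code.
public static class ParallelFetch
{
    // Runs fetch(name) for every name with at most maxParallel calls in
    // flight, and returns the results in the same order as the input names.
    public static async Task<TResult[]> RunAsync<TResult>(
        IEnumerable<string> names,
        Func<string, Task<TResult>> fetch,
        int maxParallel)
    {
        using (var gate = new SemaphoreSlim(maxParallel))
        {
            // ToList() forces every task to be created (and started) now.
            var tasks = names.Select(async name =>
            {
                await gate.WaitAsync();   // wait for a free connection slot
                try
                {
                    return await fetch(name);
                }
                finally
                {
                    gate.Release();       // free the slot for the next file
                }
            }).ToList();

            // Completes once every download has finished.
            return await Task.WhenAll(tasks);
        }
    }
}
```

One possible use is to await `RunAsync` with the list of file names and only afterwards assign the results to the page controls, so that no background task touches a control directly. A suitable value for `maxParallel` depends on how many connections the FTP server allows.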

It's kind of buried, but I left a comment on your original question. That SO question leads to a blog post which has some awesome code you can draw from.

TombMedia
  • I used threading in the function. The idea is pretty good and it improves performance, but the data does not appear in the output; I cannot set the InnerHtml of the control from it. Any idea? – James Sep 24 '13 at 08:24
  • It's going to be more complicated than that. You won't be able to write to the main UI thread from a background thread. Async/await is probably the easiest way to do this, but depending on concurrency you'll need to download/process in the background and return your data back to the main aspx page. Threading is a whole other can of worms. – TombMedia Sep 24 '13 at 12:36
  • Probably easiest is to make your own file in the background with those last lines like others have suggested your provider do for you. Create the file by downloading FTP in parallel and [then append your data to a master file](http://msdn.microsoft.com/en-us/library/kztecsys.aspx). Refresh your page to check some flag that things are done and when they are use your new master file on the main UI thread. Refresh on a 2s interval and your user even gets an update :) – TombMedia Sep 24 '13 at 12:40
  • You may be right, but all the time is spent fetching data from the files. Threading is causing a problem: it gives a file-not-found error. Reading the files one by one will be just as slow. What can I do now? – James Sep 27 '13 at 11:41
0

Rather than using your while loop, you can skip directly to the end of the Stream by using Seek. You then want to work your way backwards through the stream until you find the first newline character. This post should give you everything you need to know.

Get last 10 lines of very large text file > 10GB
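
The backward scan described above could look roughly like the sketch below. It only works on a stream whose `CanSeek` is `true` (a `FileStream` or `MemoryStream`, for example); a network response stream normally is not seekable, so this assumes the data is already available locally. `TailReader` and `ReadLastLine` are names invented for this example.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the original code.
public static class TailReader
{
    // Returns the last non-empty line of a seekable stream by scanning
    // backwards from the end until a '\n' is found. Searching for the
    // newline byte is safe for ASCII and UTF-8 encoded text.
    public static string ReadLastLine(Stream stream, Encoding encoding)
    {
        if (!stream.CanSeek)
            throw new NotSupportedException("Stream does not support seeking.");

        long end = stream.Length;

        // Skip trailing '\r' and '\n' so a final newline does not
        // produce an empty last line.
        while (end > 0)
        {
            stream.Seek(end - 1, SeekOrigin.Begin);
            int b = stream.ReadByte();
            if (b != '\n' && b != '\r') break;
            end--;
        }

        // Walk backwards to the newline that precedes the last line.
        long start = end;
        while (start > 0)
        {
            stream.Seek(start - 1, SeekOrigin.Begin);
            if (stream.ReadByte() == '\n') break;
            start--;
        }

        // Read the bytes between start and end and decode them.
        var buffer = new byte[end - start];
        stream.Seek(start, SeekOrigin.Begin);
        int read = 0;
        while (read < buffer.Length)
        {
            int n = stream.Read(buffer, read, buffer.Length - read);
            if (n == 0) break;
            read += n;
        }
        return encoding.GetString(buffer, 0, read);
    }
}
```

Seeking one byte at a time keeps the sketch short; for very large files, reading backwards in blocks of a few kilobytes would be noticeably faster.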

pingoo
  • Rather than using StreamReader, work with the response stream itself. I admit I've never tried to seek a response stream, but having read SO it looks like it's possible? – pingoo Sep 18 '13 at 16:02
  • Can you give me some suggestions on this? – James Sep 20 '13 at 06:56
0

FtpWebRequest includes the ContentOffset property. Find or choose a way to keep the offset of the last line, either locally or remotely (e.g. by uploading a 4-byte file to the FTP server). This is the fastest way to do it and the one that uses the least network traffic.

More information about FtpWebRequest can be found on MSDN.
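
When no stored offset is available, one alternative is to ask the server for the file size first and derive the offset from it, as discussed in the comments on the question. The sketch below is only an outline under a few assumptions: the server must support both the SIZE and the restart (REST) commands, `tailBytes` must be larger than the longest line, and `FtpTail`, `TailOffset`, `LastLine` and `ReadLastLine` are names invented here.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper, not part of the original code.
public static class FtpTail
{
    // Offset to start at so that at most tailBytes are transferred.
    public static long TailOffset(long fileSize, int tailBytes)
    {
        return Math.Max(0L, fileSize - tailBytes);
    }

    // Last non-empty line of the downloaded tail chunk.
    public static string LastLine(string chunk)
    {
        string[] lines = chunk.Split(
            new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        return lines.Length == 0 ? string.Empty : lines[lines.Length - 1];
    }

    public static string ReadLastLine(Uri fileUri, ICredentials credentials, int tailBytes)
    {
        // Step 1: ask the server for the current file size.
        var sizeRequest = (FtpWebRequest)WebRequest.Create(fileUri);
        sizeRequest.Credentials = credentials;
        sizeRequest.Method = WebRequestMethods.Ftp.GetFileSize;
        long size;
        using (var sizeResponse = (FtpWebResponse)sizeRequest.GetResponse())
        {
            size = sizeResponse.ContentLength;
        }

        // Step 2: download only the tail, starting at the computed offset.
        var request = (FtpWebRequest)WebRequest.Create(fileUri);
        request.Credentials = credentials;
        request.Method = WebRequestMethods.Ftp.DownloadFile;
        request.ContentOffset = TailOffset(size, tailBytes);
        using (var response = (FtpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return LastLine(reader.ReadToEnd());
        }
    }
}
```

Because the size is queried per file on every call, a file that grows between updates does not need a fixed offset. The cost is one extra round trip per file, which is small compared with downloading 2 to 3 MB.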

  • I have already done that. Now I am stuck at accessing multiple files over FTP using threading; see my updated code for details. – James Sep 28 '13 at 06:31
  • Changing your question to a new direction??? Your updated code is not complete: you call Test passing 3 parameters, but Test takes only two. If the 3rd parameter is the offset into the file, then keep in mind that this has to be per file. As I wrote in my answer, you may upload a second file holding the last-line offset (i.e. for file1.tsv, the file file1.llo). This way you first read file1.llo and then read file1.tsv starting from the offset defined in file1.llo. This does not cause any problem for threads the way you use them. –  Sep 28 '13 at 17:06
  • James, this seems a partial update. I don't see how you keep the offset per file (locally or on ftp with a supplementary file). Beyond that do you still have problems with multiple threads? –  Oct 03 '13 at 08:51