
I need to periodically transfer webserver-log-like files from Windows production servers in the US to Linux servers here in India. The files are ~4 MB each and I get about one file per minute. I can tolerate about a 5-minute lag between the files being written on Windows and them becoming available on the Linux machines. I am a bit confused between the various options here as I am quite inexperienced in such design:

  1. I am thinking of writing a C#.NET service which will periodically archive, compress and send them over to the Linux machines. These files are quite compressible: WinRAR can pack 32 MB of them into a 1.2 MB archive, so that should solve the network transfer speed issue. But then how exactly do I transfer the files to Linux? I could mount a Linux drive on the Windows server using Samba, or I could create an FTP server, or send each file serialized as a POST request. Which one would be good? Also, I have to minimize the load on the Windows server.

  2. Mount the Windows drive on Linux instead. I could use the mount command or I could use Samba here (what are the pros and cons of these two?). I can then do the compressing and copying on the Linux side itself.

I don't trust the internet connection to be very stable, so there should be a good retry mechanism and failure protection too. What are the potential gotchas in these situations, and what other points should I be worried about?

Thanks, Hari

Hari Menon

2 Answers


RAR is a bad choice. Stick to 7-Zip or bzip2. Transfer the files over SSH, probably with rsync, since it can tolerate link failures.
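As a minimal sketch of what that could look like on the Linux side (the host name, user, and paths below are made-up placeholders, not from the answer): a small retry wrapper around rsync-over-SSH, with `--partial` so a retry resumes a half-sent file instead of restarting it.

```shell
#!/bin/sh
# retry N DELAY CMD...: run CMD up to N times, sleeping DELAY seconds
# between attempts; returns non-zero only if every attempt fails.
retry() {
    n=$1; delay=$2; shift 2
    i=0
    until "$@"; do
        i=$((i + 1))
        [ "$i" -ge "$n" ] && return 1
        sleep "$delay"
    done
}

# Hypothetical usage: compress pending logs with bzip2, then push them
# over SSH.  --partial keeps partially transferred files for resuming;
# -z adds rsync's own on-the-wire compression.
#
#   bzip2 -9 /var/logs/outgoing/*.log
#   retry 5 30 rsync -az --partial --timeout=60 \
#       /var/logs/outgoing/*.bz2 backup@logs.example.in:/data/weblogs/
```

The retry loop is the "good retry mechanism" the question asks for; rsync's `--partial` plus `--timeout` handle the flaky-link case so a stalled transfer fails fast and is retried rather than hanging.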

Ignacio Vazquez-Abrams
  • With the 7z format, you can actually specify the algorithm used. PPMd is a very fast and effective algorithm for compressing plain text files. This is ideal for large collections of log files. – Dean Taylor Nov 25 '10 at 17:29
  • RAR is generally Linux-unfriendly, and since you know the content type you can pick a more efficient specific algorithm (which unfortunately is not gzip/deflate). – Ignacio Vazquez-Abrams Nov 25 '10 at 17:38
  • @Dean.. Thanks for the tip :). But our Windows code is all in C# and it supports GZip natively, so I'll use gz if I go with option 1. For option 2, I'll use Perl to create the archive; in that case I'll use 7z. – Hari Menon Nov 25 '10 at 17:39
  • Related: http://stackoverflow.com/questions/222030/how-do-i-create-7-zip-archives-with-net – Ignacio Vazquez-Abrams Nov 25 '10 at 17:41
  • Thanks for the link Ignacio... ok, I can use 7z in .NET with the SDK. But what about my main question? Is there any clear-cut winner between options 1 and 2? :$ – Hari Menon Nov 25 '10 at 17:51
  • No, they're both terrible ideas. http://stackoverflow.com/questions/2253259/native-net-version-of-rsync-for-commercial-application-available – Ignacio Vazquez-Abrams Nov 25 '10 at 17:54
  • FYI: 916 MB of SQL backups compressed to gzip=7.41 MB vs. 7z(PPMd)=0.26 MB. If over-the-wire speed is your issue, consider how you are compressing things. – Dean Taylor Nov 25 '10 at 18:04
  • Wow! That's some compression! But even if I use rsync, I'll have to do some kind of mounting, right? – Hari Menon Nov 25 '10 at 18:17
  • Nope. rsync is perfectly capable of running over SSH. – Ignacio Vazquez-Abrams Nov 25 '10 at 18:22

WinSCP can help you transfer files from Windows to Linux in batch with a script. Then configure the Windows Task Scheduler to run the script periodically.

I learnt the steps from this post: https://techglimpse.com/batch-script-automate-file-transfer-winscp/
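For reference, a WinSCP batch script typically looks something like this (the host, user, paths, and host key below are made-up placeholders; see the linked post for the full setup):

```
# upload.txt -- run with:  winscp.com /script=upload.txt /log=upload.log
open sftp://loguser@linuxhost.example.in/ -hostkey="ssh-rsa 2048 xx:xx:..."
lcd C:\logs\outgoing
cd /data/weblogs
put *.gz
exit
```

Scheduling the `winscp.com /script=upload.txt` command from Task Scheduler gives you the periodic run; winscp.com exits with a non-zero code when a transfer fails, so the scheduled task can detect failures and retry.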

Ray.H