I have a requirement in which large zipped files (several GB each) arrive in a directory on a Unix server (say, server1). I have to write an application that polls that directory and copies the files to another Unix server (say, server2) as they arrive. I can tell when a file has been completely written to the directory: a corresponding metadata file appears only once the copy of that single file is complete. Since there are hundreds of files, we don't want to wait until all of them have arrived before starting the transfer. Once the files are on server2, I have to unzip them and run some validations before landing them in my final repository.
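To make the flow concrete, here is a rough sketch of one polling pass, written in Python purely for illustration. Everything named in it is an assumption rather than part of the real setup: the `.meta` suffix for the metadata file, the helper names `find_ready_files`, `transfer` and `poll_once`, and the local `shutil.copy2` call, which only stands in for the actual copy to server2.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Assumption: "data.zip" is announced by a marker file named "data.zip.meta".
META_SUFFIX = ".meta"


def find_ready_files(incoming: Path, already_sent: set) -> list:
    """Return data files whose metadata marker exists and that were not sent yet."""
    ready = []
    for meta in sorted(incoming.glob("*" + META_SUFFIX)):
        # Strip the marker suffix to get the data file it announces.
        data = meta.with_name(meta.name[: -len(META_SUFFIX)])
        if data.is_file() and data.name not in already_sent:
            ready.append(data)
    return ready


def transfer(data: Path, target: Path) -> str:
    """Hypothetical stand-in for the real copy to server2 (a local copy here)."""
    shutil.copy2(data, target / data.name)
    return data.name


def poll_once(incoming: Path, target: Path, already_sent: set, workers: int = 4) -> list:
    """One polling pass: transfer every ready file, up to `workers` at a time."""
    ready = find_ready_files(incoming, already_sent)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sent = list(pool.map(lambda p: transfer(p, target), ready))
    already_sent.update(sent)
    return sent
```

In the real setup, `transfer` would be replaced by whatever actually moves a file to server2, and `poll_once` would run in a loop with a pause between passes. Files without a metadata marker are skipped until a later pass.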
Questions
- What would be the appropriate technology for this scenario in terms of speed: shell scripting, Java, or something else?
- Since we will be transferring file by file, how do we achieve parallelism (other than multithreading, if we use Java)?
- Is there any existing library, package, or tool that fits this scenario?