
We have one zip file which may contain multiple PDF/JPG/TIFF files.

We have logic that moves this large zip file from one folder to another folder (possibly on another server).

In this process, we break the zip into chunks of 1 MB and then move those chunks one by one.

For some unknown reason, one or more chunks sometimes get corrupted in this process, and a few files inside the zip end up corrupted.

When unzipping (programmatically or manually), we get errors like 'Headers Error'.

Here is the code we are using for moving the zip file:

while (inputZip.Length > nAmountWritten)
{
    buffer = reader.ReadBytes(10 * 1024 * 1024); // reader: BinaryReader over the source zip
    bw.Write(buffer);                            // bw: BinaryWriter for the destination file
    nAmountWritten += buffer.Length;             // running total of bytes copied so far
}

What approach can I take to avoid writing corrupted chunks to the destination?

  • Why not use `File.Move`? – TheGeneral Jul 19 '19 at 06:41
  • Calculate the checksum/hash of the data to transmit, transmit the data and the checksum/hash, recalculate the checksum/hash of the data upon receiving, compare the received with the recalculated checksum/hash, and if they differ, request a retransmit (a sketch of this appears after these comments). – ckuri Jul 19 '19 at 06:43
  • @TheGeneral Because the zip is huge. Sometimes it is more than 5 GB. – Arpit Gupta Jul 19 '19 at 06:43
  • @ckuri There is no program maintained on the receiving side. It's just a move. So how can I calculate a hash at the receiving end? – Arpit Gupta Jul 19 '19 at 06:45
  • "*we are moving this large zip file from one folder to another folder*" - no, you are not - that's a very misleading statement. What you are doing is transferring bytes over the network the manual way. No surprise it's not working reliably - it's quite hard to get right. – Marc.2377 Jul 19 '19 at 06:47
  • What's the advantage over calling a process like a robust file copy? – xdtTransform Jul 19 '19 at 06:48
  • @ArpitGupta Then reread the data you have just written. Although if your blocksize is just 1MB you could forego the hashing and just directly compare the arrays of the data to be written and the reread data. – ckuri Jul 19 '19 at 06:49
  • @Marc.2377 - See, the scenario is: we receive large files from a client in the form of a zip. We receive it on one of our servers, and at that point we use hash checking to verify that the file arrived properly. From that server, we need to move the zip to another available server for further processing. In that process, we don't have any program on those servers to check whether the file is intact. Is there any way to deal with this issue and get the file transferred correctly? – Arpit Gupta Jul 19 '19 at 06:51
  • Are you multithreading this by any chance? – TheGeneral Jul 19 '19 at 06:52
  • @ArpitGupta still, why not use `File.Move`? I don't see why the reason you gave for that question is any more valid than it is for the process you are currently employing. I mean, if `File.Move` fails because the file is very large, your strategy of transferring in chunks of 1MB without checking which chunks were successfully transferred simply **cannot** be any more reliable. – Marc.2377 Jul 19 '19 at 06:52
  • @xdtTransform Can you please explain? – Arpit Gupta Jul 19 '19 at 06:52
  • Is this using the `TcpClient` or the `Socket` classes from `System.Net.Sockets`? – Marc.2377 Jul 19 '19 at 07:02
  • @Marc.2377 - So, according to you, is it okay to use File.Move or File.Copy for large file transfers between servers? – Arpit Gupta Jul 19 '19 at 07:02
  • I was using RoboCopy as an example because it's the most famous one and is built into Windows 10 now. When facing problems like this, I ask myself: "How does the system (OS) do that kind of thing?", and then: "Can I call that process too?" (see the sketch after these comments). https://superuser.com/questions/773090/is-there-a-way-to-copy-with-verification-or-just-verify-copied-data – xdtTransform Jul 19 '19 at 07:03
  • Well, I can guarantee it's at least as OK as what you are currently doing. Also, note that [a `Move` is only a *real* move if it's in the same volume](https://stackoverflow.com/q/30022001/3258851). – Marc.2377 Jul 19 '19 at 07:03
  • @Marc.2377 Is there any chance that File.Copy will occupy the bandwidth, as I will be transferring from one server to another? – Arpit Gupta Jul 19 '19 at 13:34
  • @ArpitGupta Yes, I'm afraid that's quite likely. – Marc.2377 Jul 19 '19 at 19:36
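
Following ckuri's suggestion in the comments, a minimal copy-then-verify sketch might look like the following. `CopyWithVerification`, `HashOf` and the retry count are illustrative names and values, not an existing API; it assumes the copying process can read both the source and the destination paths:

using System;
using System.IO;
using System.Security.Cryptography;

static class VerifiedCopy
{
    // Hypothetical helper: copy sourcePath to destPath in 1 MB chunks, then
    // compare SHA-256 hashes of both files; retry the whole copy if they differ.
    public static void CopyWithVerification(string sourcePath, string destPath, int maxRetries = 3)
    {
        for (int attempt = 1; attempt <= maxRetries; attempt++)
        {
            using (var source = File.OpenRead(sourcePath))
            using (var dest = File.Create(destPath))
            {
                var buffer = new byte[1024 * 1024];               // 1 MB chunks, as in the question
                int read;
                while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
                    dest.Write(buffer, 0, read);                  // write only the bytes actually read
            }

            if (HashOf(sourcePath) == HashOf(destPath))
                return;                                           // hashes match: the copy is verified
        }
        throw new IOException("Copy of '" + sourcePath + "' could not be verified after retries.");
    }

    static string HashOf(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
            return BitConverter.ToString(sha.ComputeHash(stream));
    }
}

ckuri's lighter variant from the comments (rereading each block right after writing it and comparing the byte arrays) avoids hashing a multi-gigabyte file twice.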
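
As TheGeneral, Marc.2377 and xdtTransform suggest, the hand-rolled chunking can also be avoided entirely by handing the transfer to `File.Copy` or to robocopy. A rough sketch, assuming both servers expose the folders as UNC shares (all paths and names below are placeholders):

using System.Diagnostics;
using System.IO;

class MoveZipBetweenServers
{
    static void Main()
    {
        const string sourceDir = @"\\server1\incoming";     // placeholder share
        const string destDir = @"\\server2\processing";     // placeholder share
        const string fileName = "data.zip";                 // placeholder file name

        // Option 1: a single File.Copy call, letting the framework/OS handle the transfer.
        File.Copy(Path.Combine(sourceDir, fileName), Path.Combine(destDir, fileName), overwrite: true);

        // Option 2: shell out to robocopy, which retries failed copies on its own.
        var psi = new ProcessStartInfo
        {
            FileName = "robocopy",
            Arguments = sourceDir + " " + destDir + " " + fileName + " /R:3 /W:5",
            UseShellExecute = false
        };
        using (var proc = Process.Start(psi))
        {
            proc.WaitForExit();   // robocopy exit codes below 8 indicate success
        }
    }
}

Per the question Marc.2377 links above, a `Move` across volumes is effectively a copy followed by a delete, so either option ends up copying bytes over the network in any case.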

0 Answers