
I have this code (more or less):

resp, err := http.Get(url)
if err != nil {
    // handle error
}

if resp.StatusCode != http.StatusOK {
    // handle error
}

out, err := os.Create(filepath)
if err != nil {
    return err
}

// Write the body to file
_, err = io.Copy(out, resp.Body)
resp.Body.Close()
out.Close()

My issue is that if I immediately try to do something (e.g. take the hash of this file), then I see that it is still copying for a while.

At first I was deferring the out.Close(). Then I thought I needed to call out.Close() right after the io.Copy, which I assumed would block until the copy is done. This didn't work and I still have the same issue.

How do I block or wait for the io.Copy operation to finish?

Thanks!

Roy Ca
  • +1 for what @JimB says. If that does not solve your problem, perhaps you can try calling [`Sync`](https://pkg.go.dev/os#File.Sync) on `out` to force a `fsync` syscall. – Emile Pels Jul 21 '21 at 19:26
  • `How do I block or wait for the io.Copy operation to finish?` You write instructions to synchronize access, or you structure the code so that the call blocks until its work is done before the next function runs. I feel like you have not told the whole story. –  Jul 21 '21 at 19:40
  • Sometimes we don't need to synchronize; we can just do everything in one pass. https://stackoverflow.com/questions/62319053/using-io-copy-with-io-teereader/62321039#62321039 –  Jul 21 '21 at 19:41
  • The thing is that if I wait, it works. If I sample the file in a loop (e.g. take the hash in a loop), I see it change until it stabilizes (finished). It still failed, so I added Sync and it seems to work (it's hard to be sure, this isn't 100% reproducible). Is there any reason Sync should be called? Why would I need to force it? It seems like it's expected to happen anyway. @EmilePels - you can write it as an answer and I'll accept it – Roy Ca Jul 21 '21 at 19:45
  • @RoyCa glad it helped. I have submitted an answer with a slightly more elaborate explanation, and I have also referenced some docs there. – Emile Pels Jul 21 '21 at 19:52

1 Answer


You are likely hitting a disk buffer or cache: your OS or disk device keeps some data in memory before actually persisting the write to disk.

Calling

out.Sync()

forces an fsync syscall, which instructs the OS to flush its buffers and write the data to disk. I suggest calling out.Sync() after the io.Copy call returns and before out.Close().

Related docs you may find interesting:

Emile Pels
  • I'm *really* surprised `out.Close()` wouldn't implicitly do a sync if there's unwritten buffered data. This seems to expose internals that a user of `io.Copy` etc. should not have to worry about. – colm.anseo Jul 21 '21 at 21:04
  • @colm.anseo: something I had forgotten myself over the years, https://stackoverflow.com/questions/705454/does-linux-guarantee-the-contents-of-a-file-is-flushed-to-disc-after-close – JimB Jul 21 '21 at 21:54