I am writing a benchmark for file I/O operations in Go.

I want to avoid any OS-level asynchrony or caching while performing simple I/O operations such as creating, opening, reading, writing, and deleting a file.

From the SO question "When to flush a file in Go?" I learned that a plain write is not enough, because it does not guarantee that the actual disk operation has completed.

So I changed my code to:

start := time.Now()
_, e = file.Write(b)
if e == nil {
    e = file.Sync()
}
elapsed := time.Since(start)

I am assuming this takes care of the write operation.
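For reference, the fragment above can be turned into a complete program along these lines. This is only a sketch: the helper name `timedWriteSync`, the scratch file created with `os.CreateTemp`, and the 4096-byte buffer are assumptions made for illustration, not part of the original code.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// timedWriteSync (hypothetical helper) writes b to a fresh scratch file and
// calls Sync before stopping the clock, so the measured interval covers both
// the write and the flush request to the OS.
func timedWriteSync(b []byte) (time.Duration, error) {
	f, err := os.CreateTemp("", "bench-write-*.dat")
	if err != nil {
		return 0, err
	}
	defer os.Remove(f.Name()) // runs last: delete the scratch file
	defer f.Close()           // runs first: release the descriptor

	start := time.Now()
	_, err = f.Write(b)
	if err == nil {
		err = f.Sync() // asks the OS to commit the file's data to stable storage
	}
	elapsed := time.Since(start)
	return elapsed, err
}

func main() {
	elapsed, err := timedWriteSync(make([]byte, 4096))
	if err != nil {
		fmt.Fprintln(os.Stderr, "write+sync failed:", err)
		os.Exit(1)
	}
	fmt.Println("write+sync took", elapsed)
}
```

Creating the file outside the timed region keeps the create cost out of the write measurement.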

For the create, read, and open operations, I am going with my intuition that if the call returns without an error, the actual disk operation has been performed. Please correct my understanding if it is wrong.

My concern is closing and deleting the file.

I looked at "Safely close a file descriptor in golang", but could not make much sense of it in my context.

Currently, my file-closing code looks like this:

start := time.Now()
e = file.Close()
elapsed := time.Since(start)
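If the goal is to see how much of the cost belongs to the flush and how much to the close itself, the two calls can be timed separately. The following is a rough sketch under that assumption; `timedSyncThenClose` and the scratch-file setup are hypothetical names introduced here, not code from my benchmark.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// timedSyncThenClose (hypothetical helper) writes b to a scratch file, then
// times Sync and Close separately so the two costs are not mixed together.
func timedSyncThenClose(b []byte) (syncTime, closeTime time.Duration, err error) {
	f, err := os.CreateTemp("", "bench-close-*.dat")
	if err != nil {
		return 0, 0, err
	}
	defer os.Remove(f.Name()) // delete the scratch file when done

	if _, err = f.Write(b); err != nil {
		f.Close()
		return 0, 0, err
	}

	start := time.Now()
	err = f.Sync()
	syncTime = time.Since(start)
	if err != nil {
		f.Close()
		return syncTime, 0, err
	}

	start = time.Now()
	err = f.Close() // the error is still worth checking even after Sync
	closeTime = time.Since(start)
	return syncTime, closeTime, err
}

func main() {
	s, c, err := timedSyncThenClose(make([]byte, 4096))
	if err != nil {
		fmt.Fprintln(os.Stderr, "benchmark failed:", err)
		os.Exit(1)
	}
	fmt.Println("sync:", s, "close:", c)
}
```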

And I am using os.Remove(fileName) to delete the file.

I can imagine OS-level asynchrony for close and delete as well: when the program asks to close or delete a file, the file may only be marked as closed or deleted, with the actual disk operation done whenever the OS decides.

Can the Sync method help when closing the file?

What is a better way to guarantee that the file has actually been deleted, without adding much overhead (such as looping on a file-exists check right after deleting)?

Amit
  • 4
    If you Sync between the last Write and Close, then Close is not a disk operation at all. It simply releases some kernel resources and makes the file descriptor available for re-use. Whether Sync actually commits to hardware depends on the hardware. Some disks have write caches, for instance. You can probably find some good information on this in the database space. These folks have to worry about this sort of stuff a lot. – Peter Jun 13 '23 at 11:01
  • @Peter Does that mean that Close alone (without a prior write operation) doesn't need any mechanism such as Sync to make sure it completes synchronously? Also, do you have a better idea for the delete operation? – Amit Jun 13 '23 at 11:04
  • 2
    There is nothing to synchronize in a Close call. It just tells the kernel "this process is done with this file descriptor". Delete is inherently asynchronous (on *nix, that is; I don't know enough about Windows to speak to that). It doesn't do anything on disk if there are other file descriptors, for instance. But you can be sure that no new file descriptors can be created after Remove returns no error. – Peter Jun 13 '23 at 11:25
  • @CharlieTumahai Yes, you are correct, thanks for pointing that out. I have corrected that. – Amit Jun 13 '23 at 12:56
  • 1
    «I want to avoid any OS level asynchronization or caching while performing simple IO operations» I might have misinterpreted your demand (looks like English is not your native language) but if you want to defeat all caching done by the OS, you have to use the so-called "direct" file access, which requires opening the file with `O_DIRECT` flag, and doing I/O using page-aligned and sized chunks. That is, it appears your initial premises are not quite correct. – kostix Jun 13 '23 at 12:59
  • @Peter In my experience with `.NET` and `System.IO`, deletion happens asynchronously at the OS level. However, there is no direct way to make sure that the file is actually deleted before moving on to the next line of code. – Amit Jun 13 '23 at 13:07
  • 1
    Please note that file deletion is not really different from reads and writes when a filesystem layer is involved. If you want to be sure deletion _on the FS level_ has happened, you have to `fsync` the file descriptor of the opened _directory_ (when we're talking about Linux; not sure about Windows and NTFS). Still, this does not guarantee the operation has actually changed the relevant bits on the medium underlying the filesystem. The kernel simply has no standardized way to figure this out. – kostix Jun 13 '23 at 13:50
  • 1
    "I am writing the benchmarking for file IO operation in Go." This is an _incredibly_ complicated task. It was nontrivial in the 80s, hard in the 90s, and almost impossible today. "Impossible" in the sense of: "impossible to get actionable numbers and learnings out of such a measurement with too many moving parts". – Volker Jun 13 '23 at 14:47
  • @Peter, regarding the «a Close call. It just tells the kernel "this process is done with this file descriptor"» point–while true, it may have subtle semantics with regard to diagnosing possible _prior_ errors, which is detailed in the «Dealing with error returns from close()» in the [Linux manual page](https://manpages.debian.org/2/close). – kostix Jun 14 '23 at 15:12
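
The directory-`fsync` suggestion from the comments can be sketched roughly as follows, assuming Linux (calling `Sync` on a directory handle is not expected to work on Windows). The helper names `removeAndSyncDir` and `removeScratchFile` are made up for illustration. As noted in the comments, this only asks the filesystem to persist the directory change; it does not prove that the underlying medium has been updated.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"time"
)

// removeAndSyncDir (hypothetical helper) removes path and then fsyncs the
// parent directory, timing both steps together. Assumes Linux semantics.
func removeAndSyncDir(path string) (time.Duration, error) {
	start := time.Now()
	if err := os.Remove(path); err != nil {
		return 0, err
	}
	dir, err := os.Open(filepath.Dir(path)) // open the directory itself
	if err != nil {
		return 0, err
	}
	defer dir.Close()
	if err := dir.Sync(); err != nil { // flush the directory entry change
		return 0, err
	}
	return time.Since(start), nil
}

// removeScratchFile (hypothetical helper) creates a scratch file, removes it
// with removeAndSyncDir, and checks once that the name is no longer visible.
func removeScratchFile() (time.Duration, error) {
	f, err := os.CreateTemp("", "bench-remove-*.dat")
	if err != nil {
		return 0, err
	}
	name := f.Name()
	if err := f.Close(); err != nil {
		return 0, err
	}
	elapsed, err := removeAndSyncDir(name)
	if err != nil {
		return 0, err
	}
	if _, err := os.Stat(name); !errors.Is(err, fs.ErrNotExist) {
		return 0, fmt.Errorf("%s still visible after remove: %v", name, err)
	}
	return elapsed, nil
}

func main() {
	elapsed, err := removeScratchFile()
	if err != nil {
		fmt.Fprintln(os.Stderr, "remove benchmark failed:", err)
		os.Exit(1)
	}
	fmt.Println("remove+dir sync took", elapsed)
}
```

The single `os.Stat` check is only a sanity test for the sketch, not a polling loop; it sits outside the timed region.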

0 Answers