Just wondering: what would be the most efficient way to write the same data to two files on Linux, in C/C++?

For example, this is the most trivial way.

while (1) {
    /* ... get data from somewhere ... */

    write(fd1, data, datalen);
    write(fd2, data, datalen);
}

However, the disadvantage is that the kernel has to copy the data from user space twice, even though the data is the same.
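For reference, a minimal runnable sketch of the trivial approach. The helper names `write_all` and `write_both` are hypothetical, introduced only for illustration; the sketch assumes both descriptors are already open for writing and simply retries on short writes and `EINTR`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes to fd.
 * Retries on short writes and on EINTR; returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing anything: retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: the "trivial way", one full write per descriptor.
 * The same user buffer is handed to the kernel twice. */
int write_both(int fd1, int fd2, const void *buf, size_t len)
{
    if (write_all(fd1, buf, len) < 0)
        return -1;
    return write_all(fd2, buf, len);
}
```

Inside the loop, `write_both(fd1, fd2, data, datalen)` would replace the two bare `write()` calls.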

Any thoughts?

gsamaras
mesibo
  • 3
    There is no C/C++. Choose *one* language you are working in. – Baum mit Augen Oct 12 '17 at 11:07
  • 4
    Sounds like [premature optimization](https://xkcd.com/1691/). Do you have any proof that this is a relevant bottleneck in your code? I don't think so. As an answer would heavily depend on the OS, the filesystem and their inner workings, just stick to the obvious solution. Side note: You'd probably do it differently in C++, so don't double tag languages... –  Oct 12 '17 at 11:07
  • The answer would probably depend also on the *filesystem* you are using. – Pac0 Oct 12 '17 at 11:07
  • 2
    @yumoji, why are you duplicating the data writing anyway? In addition to looking like premature optimization, as mentioned by Felix, it also smells like an [X/Y problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). – Pac0 Oct 12 '17 at 11:09
  • 1
    Just write to one file. When done copy it to the second one. – zdf Oct 12 '17 at 11:19
  • 2
    How would the kernel *not* write data *twice* if you need to write it at two different locations ? – Pac0 Oct 12 '17 at 11:19
  • 1
    A hard link or even symbolic link comes into my mind if the same data has to appear as two files... – Scheff's Cat Oct 12 '17 at 11:21
  • [`man 2 link`](http://man7.org/linux/man-pages/man2/link.2.html) – ceving Oct 12 '17 at 11:52
  • If it's really an issue, check your OS to find out what APIs it has for delivering user buffers directly to the drivers (you will have to handle an I/O completion callback later). OTOH, if you are using C++, the buffer has probably already been value-copied 10 times, so another copy won't make much difference. – Martin James Oct 12 '17 at 12:44
  • How about write to one file, then copy the file with a system call? – Paul Ogilvie Oct 12 '17 at 12:46

1 Answer

what would be the most efficient way to write same data to two files

  1. Write the data to one file only.
  2. Copy that file to the other one. Use an OS call to do that efficiently (see the question "Copy a file in a sane, safe and efficient way").
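As a rough sketch of step 2 on Linux, the copy can be done inside the kernel with `sendfile()`, which is also the call suggested in the comments. The function name `copy_file` is hypothetical, and the sketch assumes a regular source file and a kernel recent enough (2.6.33 or later) to accept a regular file as the output descriptor. It does not retry on `EINTR`.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: copy src_path to dst_path without moving the data
 * through a user-space buffer. Returns 0 on success, -1 on error. */
int copy_file(const char *src_path, const char *dst_path)
{
    assert(src_path != NULL && dst_path != NULL);

    int in = open(src_path, O_RDONLY);
    if (in < 0)
        return -1;

    struct stat st;
    if (fstat(in, &st) < 0) {
        close(in);
        return -1;
    }

    int out = open(dst_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }

    int rc = 0;
    off_t remaining = st.st_size;
    while (remaining > 0) {
        /* With a NULL offset, sendfile() advances the file position of `in`. */
        ssize_t n = sendfile(out, in, NULL, (size_t)remaining);
        if (n <= 0) {           /* error, or the source shrank underneath us */
            rc = -1;
            break;
        }
        remaining -= n;
    }

    if (close(out) < 0)
        rc = -1;
    close(in);
    return rc;
}
```

A single `sendfile()` call may transfer fewer bytes than requested, which is why the call sits in a loop.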

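Another way to do step 2 is to create a hard link (see link()): both names then refer to the same file on disk, so no data is copied. This only works when both paths are on the same filesystem, and later changes made through either name show up under both.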
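A minimal sketch of the hard-link variant, assuming the two files are meant to stay identical. The names `write_then_link` and `same_inode` are hypothetical. `link()` fails with `EEXIST` if the second name already exists and with `EXDEV` if the two paths are on different filesystems.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: write the data once to `first`, then make `second`
 * a second name for the same file. Returns 0 on success, -1 on error. */
int write_then_link(const char *first, const char *second,
                    const void *data, size_t len)
{
    assert(first != NULL && second != NULL && (data != NULL || len == 0));

    int fd = open(first, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* For brevity, a short write is treated as a failure in this sketch. */
    ssize_t n = write(fd, data, len);
    if (close(fd) < 0 || n != (ssize_t)len)
        return -1;

    return link(first, second);
}

/* Hypothetical helper: returns 1 if both paths name the same inode,
 * 0 if they do not, -1 on error. */
int same_inode(const char *a, const char *b)
{
    struct stat sa, sb;
    if (stat(a, &sa) < 0 || stat(b, &sb) < 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

After a successful call there is only one copy of the data on disk, reachable under two names.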


However, take care not to become a victim of premature optimization. If this is not the bottleneck in your program, just use the trivial, easy-to-read approach.

gsamaras
  • 1
    Or create a hard link. But without any more specifics from the OP, it is hard to tell whether that suits their scenario. – Pac0 Oct 12 '17 at 11:29
  • 1
    "Write same data to two files" doesn't mean that the files will have the same content at the end. – Jean-Baptiste Yunès Oct 12 '17 at 12:03
  • I see your point @Jean-BaptisteYunès, but the OP gave me this impression. Let's see what he will have to say for my answer. – gsamaras Oct 12 '17 at 12:08
  • By the way, that's a bad way to copy a file on Linux; use sendfile. I have special needs and wanted suggestions. Personally, I would prefer that people stay focused on the point rather than jumping the gun about premature optimization. Sorry, but I am disappointed with the quality of the comments here. – mesibo Oct 12 '17 at 13:19
  • 2
    @yumoji how is `sendfile()` a bad way on linux (except for non-portability)? If anything is disappointing, it's the quality of the question -- if you have "*special needs*", you should explain them, so you could get useful suggestions. –  Oct 12 '17 at 14:07
  • 1
    @yumoji I feel that I answered your question. You asked how to do it, and I suggested my approach. The premature optimization comment was just a side note. – gsamaras Oct 13 '17 at 05:24
  • Felix, I said the copy approach is bad - not the sendfile. – mesibo Oct 13 '17 at 05:38