29

I have some code where I frequently copy a large block of memory, often after making only very small changes to it.

I have implemented a system which tracks the changes, but I thought it might be nice, if possible, to tell the OS to do a 'copy-on-write' of the memory and let it deal with copying only those parts which change. However, while Linux does copy-on-write, for example when fork()ing, I can't find a way of controlling it and doing it myself.

Chris Jefferson

6 Answers

19

Your best chance is probably to mmap() the original data to a file, and then mmap() the same file again using MAP_PRIVATE.

MSalters
  • Note that you need to create two `MAP_PRIVATE` mappings - COW semantics require all users to have COW copies, with no-one using a "master" copy. Unfortunately the file itself seems to be necessary. – caf Oct 14 '09 at 12:11
  • Why? Assume the master is `AA`, and the reason for COW is that you want a copy which you can change to `AB`. There's no reason the original `AA` needs to be a private mapping, as nobody is planning to write to it. It's merely a template. – MSalters Oct 14 '09 at 12:16
  • 1
    My comment was based on the possibility that the "original" copy may also be written to, in which case it would be unspecified if those changes get reflected in the COW copy or not. As an aside, it is a pity that `mmap` does not provide inherent support for this - I might play around with adding support to `mmap` for duplicating existing mappings and see how it goes. – caf Oct 14 '09 at 22:20
  • I'm with MSalters: there's no "standard" set of COW semantics. Having one mapping be the "real" file and one be a private copy seems perfectly reasonable. Obviously some apps need writable snapshots or whatnot, but not all. – Andy Ross Oct 15 '09 at 23:07
  • [memfd_create](http://man7.org/linux/man-pages/man2/memfd_create.2.html) can be used to work around the need to create a file, but you still need to allocate the original data in memfd-backed memory to COW it. – the8472 Jul 04 '18 at 20:31
4

Depending on what exactly it is that you are copying, a persistent data structure might be a solution for your problem.

Jørgen Fogh
2

The copy-on-write mechanism employed e.g. by fork() relies on the MMU (Memory Management Unit), which the kernel uses to handle memory paging. Accessing the MMU is a privileged operation, i.e. it cannot be done by a userspace application. I am not aware of any copy-on-write API exported to user-space, either.

(Then again, I am not exactly a guru on the Linux API, so others might point out relevant API calls I have missed.)

Edit: And lo, MSalters rises to the occasion. ;-)

DevSolar
2

It's easier to implement copy-on-write in an object-oriented language like C++. For example, most of the container classes in Qt are copy-on-write.

But of course you can do that in C too; it's just some more work. When you want to assign your data to a new data block, you don't make a copy. Instead, you just copy a pointer in a wrapper struct around your data. You need to keep track of the status of the data in your data blocks. If you now change something in your new data block, you make a "real" copy and change the status. Of course, you can no longer use simple operators like "=" for assignment; you need functions instead (in C++ you would just do operator overloading).

A more robust implementation should use reference counters instead of a simple flag. I leave that up to you.

A quick and dirty example: if you have a

struct big {
    // lots of data
    int data[BIG_NUMBER];
};

you have to implement the assign functions and getters/setters yourself.

// assume you want to implement COW for a struct big of some kind
// now instead of
struct big a, b;
a = b;
a.data[12345] = 6789;

// you need to use
struct cow_big a, b;
assign(&a, b);   // only pointers get copied
set_some_data(&a, 12345, 6789); // now the stuff gets really copied


// the basic implementation could look like this (needs <stdlib.h>)
struct cow_big {
    struct big *data;
    int needs_copy;
};

// shallow copy, only sets a pointer.
void assign(struct cow_big *dst, struct cow_big src) {
    dst->data = src.data;
    dst->needs_copy = 1;
}

// change some data in struct big. if it hasn't made a deep copy yet, do it here.
void set_some_data(struct cow_big *dst, int index, int data) {
    if (dst->needs_copy) {
        struct big *src = dst->data;
        dst->data = malloc(sizeof(struct big));
        if (dst->data == NULL) abort();   // out of memory
        *dst->data = *src;   // now here is the deep copy
        dst->needs_copy = 0;
    }
    dst->data->data[index] = data;
}

You need to write constructors and destructors as well. I really recommend C++ for this.

Gunther Piez
  • 2
    That doesn't give the COW semantics that I want. If the OS did it, it would only copy the 4k page (on Mac OS X at least) which was changed, leaving the rest of the data structure (the other 10s or 100s of MB) still COW. Of course, I could implement, and have implemented, what I actually want, but it would be nice if I could get the OS to do it for me. – Chris Jefferson Oct 15 '09 at 10:29
  • 2
    A newer version of the Linux kernel may do it automagically: kernels 2.6.32+ have code for replacing duplicate pages with copy-on-write links (http://lwn.net/Articles/353501/). However, the ksm subsystem is not very mature and so far works the other way around: the pages are scanned after they have been copied, and replaced if identical. If you want to control it from userspace, you may want to look at linux/mm/ksm.c and make the changes you need. – Gunther Piez Oct 15 '09 at 21:53
  • 4
    The posted solution really isn't "CoW" at all, it's a software emulation thereof that forces all "write" operations through an indirection layer. I believe Chris was asking specifically for a memory-level solution using the MMU hardware. And FWIW: you don't need a new version of the Linux kernel. BSD mmap() has supported MAP_PRIVATE for decades now -- it's been part of POSIX since the beginning. – Andy Ross Oct 15 '09 at 23:10
1

Here's a working example:

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

#define SIZE 4096

int main(void) {
  int fd = shm_open("/tmpmem", O_RDWR | O_CREAT, 0666);
  if (fd < 0) { perror("shm_open"); return EXIT_FAILURE; }
  int r = ftruncate(fd, SIZE);
  printf("fd: %i, r: %i\n", fd, r);
  char *buf = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
      MAP_SHARED, fd, 0);
  if (buf == MAP_FAILED) { perror("mmap shared"); return EXIT_FAILURE; }
  printf("debug 0\n");
  buf[SIZE - 2] = 41;
  buf[SIZE - 1] = 42;
  printf("debug 1\n");

  // don't know why this is needed, or working
  //r = mmap(buf, SIZE, PROT_READ | PROT_WRITE,
  //  MAP_FIXED, fd, 0);
  //printf("r: %i\n", r);

  char *buf2 = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
    MAP_PRIVATE, fd, 0);
  if (buf2 == MAP_FAILED) { perror("mmap private"); return EXIT_FAILURE; }
  printf("buf2: %p\n", (void *)buf2);
  buf2[SIZE - 1] = 43;  // private copy of the page is made here
  buf[SIZE - 2] = 40;
  printf("buf[-2]: %i, buf[-1]: %i, buf2[-2]: %i, buf2[-1]: %i\n",
      buf[SIZE - 2],
      buf[SIZE - 1],
      buf2[SIZE - 2],
      buf2[SIZE - 1]);

  // shm_unlink() takes the object's name, not the file descriptor
  shm_unlink("/tmpmem");
  return EXIT_SUCCESS;
}

I'm a little unsure of whether I need to enable the commented out section, for safety.

fadedbee
  • For me this crashes on the second call to mmap. I'd be interested to know if you subsequently used this code, or an improved version of it, as I have a similar requirement for copy-on-write in C code. (P.S. Note that the call to unlink looks wrong: unlink takes a string, not a file descriptor.) – Paul R Mar 12 '15 at 14:15
1

You should be able to open your own memory via /proc/$PID/mem and then mmap() the interesting part of it with MAP_PRIVATE to some other place.

user175104
  • 3
    This will not work as /proc.../mem does not support mmap. See also [here](http://stackoverflow.com/questions/5216326/mmap-on-proc-pid-mem). – coltox Sep 27 '13 at 14:51