
I am working on a program to stack astronomical images. It will have to hold hundreds of 16-bit RGB images in memory for processing. I cannot just handle the images one at a time, since in some cases I want to take the median value for each pixel. I decided to create temporary files and mmap my images to them, so that my image API works the same whether an image lives in normal memory or is mmapped; the kernel should handle accessing the necessary parts of the file behind the scenes. I am posting this to make sure I am doing it correctly. Will this approach make it possible to load (hypothetically) 50 GB of images if I only have 16 GB of RAM plus my 8 GB swap partition?

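#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

//pixel_t is defined elsewhere (not shown)
typedef struct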
{
    size_t w;
    size_t h;
    pixel_t px[];
} image_t;

//allocate in normal memory
image_t* image_new(size_t w, size_t h)
{
    assert(w && h);
    image_t* img = malloc(sizeof(image_t) + sizeof(pixel_t) * w * h);
    if(!img)
        return NULL;

    img->w = w;
    img->h = h;

    memset(img->px, 0, sizeof(pixel_t) * w * h);

    return img;
}

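//copy an existing image into a temporary file and map that file
image_t* image_mmap_new(image_t* img)
{
    FILE* tmp = tmpfile();
    if(tmp == NULL)
        return NULL;
    size_t wsize = sizeof(image_t) + sizeof(pixel_t) * img->w * img->h;
    //fflush so the buffered data actually reaches the file before it is mapped
    if(fwrite(img, 1, wsize, tmp) != wsize || fflush(tmp) != 0)
    {
        fclose(tmp);
        return NULL;
    }

    image_t* nimg = mmap(NULL, wsize, PROT_WRITE | PROT_READ, MAP_SHARED, fileno(tmp), 0);
    //the mapping remains valid after the file is closed
    fclose(tmp);
    //mmap signals failure with MAP_FAILED, not NULL
    if(nimg == MAP_FAILED)
        return NULL;
    return nimg;
}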
chasep255
  • Did you try it? I used `mmap` only with `open` and `close` and never `close`d the file descriptor before the `munmap`. – mch Nov 09 '15 at 20:57
  • This seems to be an "embarrassingly parallel" problem. Consider an approach that splits your stack into a grid of smaller subarrays. You could use a thread or queue approach to do median pixel calculations on the smaller subarrays in a manner such that the stack of subarrays fits within memory. – Alex Reynolds Nov 09 '15 at 20:58
  • A major source of overhead in this program is converting the raw images using dcraw. I only want to do the conversion once. – chasep255 Nov 09 '15 at 21:00

1 Answer

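As far as I know, the only limit is the virtual address space. Once mapped, a file is accessed like a memory region, and only the portions actually used are loaded into memory. Because a MAP_SHARED mapping is backed by the file itself, modified pages are written back to the file rather than to swap, so on a 64-bit system the mapped files can together exceed RAM plus swap, provided the temporary files live on a disk-backed filesystem with enough free space.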
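A minimal sketch of that idea, assuming a POSIX system: it creates a temporary file, grows it to the requested size, and maps it. The helper names `map_temp_bytes` and `unmap_temp_bytes` are hypothetical and not part of the question's code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create an unnamed temporary file of `size` bytes
 * and map it read/write. Returns NULL on failure. */
void *map_temp_bytes(size_t size)
{
    assert(size > 0);

    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return NULL;

    int fd = fileno(tmp);

    /* Grow the file to its full size first: touching mapped pages that lie
     * past the end of the file raises SIGBUS. The new bytes read as zero. */
    if (ftruncate(fd, (off_t)size) != 0) {
        fclose(tmp);
        return NULL;
    }

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* Closing the descriptor does not remove the mapping. */
    fclose(tmp);

    /* mmap reports failure with MAP_FAILED, not NULL. */
    return p == MAP_FAILED ? NULL : p;
}

/* Release a mapping made by map_temp_bytes. Returns 0 on success. */
int unmap_temp_bytes(void *p, size_t size)
{
    return munmap(p, size);
}
```

Calling `map_temp_bytes(n)` returns a zero-filled region of `n` bytes that can be used like ordinary memory and released with `unmap_temp_bytes`. Sizing the file with `ftruncate` instead of writing a buffer through stdio avoids holding a second full copy of the image in RAM.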

More information is in the (old but still valid) GNU C Library documentation: http://www.gnu.org/software/libc/manual/html_node/index.html#toc-Introduction-1

Actually, it seems I was too conservative; you can map even larger files. See "How big can a memory-mapped file be?"
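For the per-pixel median mentioned in the question, here is a rough sketch of how a stack of mapped images might be reduced. It assumes each frame is exposed as a flat array of 16-bit samples (the question's `pixel_t` is not shown, so `uint16_t` is an assumption), and `stack_median` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Ascending comparison of two 16-bit samples for qsort. */
static int cmp_u16(const void *a, const void *b)
{
    uint16_t x = *(const uint16_t *)a;
    uint16_t y = *(const uint16_t *)b;
    return (x > y) - (x < y);
}

/* Hypothetical sketch: write the per-sample median of `n_frames` frames,
 * each holding `n_samples` 16-bit samples, into `out`.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int stack_median(const uint16_t *const *frames, size_t n_frames,
                 size_t n_samples, uint16_t *out)
{
    assert(frames != NULL && out != NULL && n_frames > 0);

    uint16_t *scratch = malloc(n_frames * sizeof *scratch);
    if (scratch == NULL)
        return -1;

    for (size_t i = 0; i < n_samples; i++) {
        /* Gather sample i from every frame. */
        for (size_t f = 0; f < n_frames; f++)
            scratch[f] = frames[f][i];

        qsort(scratch, n_frames, sizeof *scratch, cmp_u16);

        if (n_frames % 2 == 1) {
            out[i] = scratch[n_frames / 2];
        } else {
            /* Even count: average the two middle values in 32 bits. */
            uint32_t lo = scratch[n_frames / 2 - 1];
            uint32_t hi = scratch[n_frames / 2];
            out[i] = (uint16_t)((lo + hi) / 2);
        }
    }

    free(scratch);
    return 0;
}
```

Each frame is read front to back at the same offset, so at any moment only a small window of every mapped file needs to be resident; the kernel can evict pages that have already been consumed.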

terence hill