I'm working with some older code that mallocs a chunk of RAM then loads a binary file into it. The binary file is a series of 8-bit greyscale image planes X by Y size, Z planes deep. The files are usually between 500 megabytes and 10 gigabytes.

The existing code uses a complex pointer arrangement to access individual planes, in either of XY, XZ or YZ planes.

What I'd like to do is replace the pointer with a single vector of vectors, where each sub-vector is an XY plane in the data. The aim is to gain some of the safety and bounds checking that vectors provide and raw pointer access does not.

Based on this previous question (Is it possible to initialize std::vector over already allocated memory?), I have the following code:

//preallocate.h
template <typename T>
class PreAllocator
{
private:
T* memory_ptr;
std::size_t memory_size;

public:
typedef std::size_t     size_type;
typedef T*              pointer;
typedef T               value_type;

PreAllocator(T* memory_ptr, std::size_t memory_size) : memory_ptr(memory_ptr), memory_size(memory_size) {}

PreAllocator(const PreAllocator& other) throw() : memory_ptr(other.memory_ptr), memory_size(other.memory_size) {};

template<typename U>
PreAllocator(const PreAllocator<U>& other) throw() : memory_ptr(other.memory_ptr), memory_size(other.memory_size) {};

template<typename U>
PreAllocator& operator = (const PreAllocator<U>& other) { return *this; }
PreAllocator<T>& operator = (const PreAllocator& other) { return *this; }
~PreAllocator() {}


pointer allocate(size_type n, const void* hint = 0) {return memory_ptr;}
void deallocate(T* ptr, size_type n) {}

size_type max_size() const {return memory_size;}
};

The simplified main function looks like this:

TOMhead header;
uint8_t* TOMvolume;
size_t volumeBytes = 0;

int main(int argc, char *argv[])
{
    std::fstream TOMfile;
    std::ios_base::iostate exceptionMask = TOMfile.exceptions() | std::ios::failbit| std::ifstream::badbit;
    TOMfile.exceptions(exceptionMask);

    try {
        TOMfile.open(argv[1],std::ios::in|std::ios::binary);
    }
    catch (std::system_error& error) {
        std::cerr << error.code().message() << std::endl;
        return ERROR;
    }

    TOMfile.read((char*) &header, sizeof(header));
    if (!TOMfile)
    {
        std::cout << "Error reading file into memory, expected to read " << sizeof(header) << " but only read " << TOMfile.gcount() << " bytes" << std::endl;
        return ERROR;
    }


    TOMfile.seekg(std::ios_base::beg);      // rewind to beginning of the file
    TOMfile.seekg(sizeof(header));          // seek to data beyond the header

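    // Widen before multiplying, in case the header fields are narrower than size_t (files can reach 10 GB)
    volumeBytes = static_cast<size_t>(header.xsize) * header.ysize * header.zsize;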

    std::cout << "Trying to malloc " << volumeBytes << " bytes of RAM" << std::endl;

    TOMvolume = (uint8_t*) malloc(volumeBytes);
    if (TOMvolume == NULL)
    {
        std::cout << "Error allocating RAM for the data" << std::endl;
        return ERROR;
    }
    TOMfile.read((char*) TOMvolume,volumeBytes);

I then tried using the pre-allocator to create a vector that holds this malloc'ed data:

std::vector<uint8_t, PreAllocator<uint8_t>> v_TOMvolume(0, PreAllocator<uint8_t>(&TOMvolume[0], volumeBytes));
v_TOMvolume.push_back(volumeBytes);

but any attempt to read the size of the vector or any of its data fails. The data is correct in memory when I view it in the debugger; it just isn't getting associated with the vector as I'd like.

Any thoughts? Is what I'm trying to do possible?

David
  • It sounds to me like you are trying to use this allocator scheme to read the preallocated memory. It's just meant to reuse already allocated memory, not read it. Every time you `push_back(volumeBytes)` you will instead overwrite 1 byte of your mapped memory with whatever value `volumeBytes` has. If your goal was to read the memory in `TOMvolume` then this approach will not work. – François Andrieux Mar 17 '20 at 19:39
  • Instead of `malloc`ing memory and later trying to map it to a `std::vector` why not just create a properly sized `std::vector` and read directly into it? – François Andrieux Mar 17 '20 at 19:41
  • Or how about using Boost.MultiArray's `boost::multi_array_ref` (whether you really want a multidimensional array or just 1-D slices). – aschepler Mar 17 '20 at 19:51
  • @aschepler I'm considering moving over to boost for this, but when I last looked, there was no obvious way to read a file directly into a multi array unlike standard vectors. – David Mar 18 '20 at 16:25
  • @FrançoisAndrieux I'm aiming to have a chunk of already filled in memory look like and be accessible by a vector. I tried several ways, but can't make the vector realise it already points to allocated data, so it thinks it has zero size. – David Mar 18 '20 at 16:27
  • @David `std::vector` is not compatible with what you want to do. Read from the file directly into an existing `std::vector` instead. – François Andrieux Mar 18 '20 at 16:41

1 Answer

It is not possible to allocate memory to a vector while keeping the previous content of the memory.

A working approach:

  • Don't use malloc at all.
  • Create a vector with default allocator, with necessary size.
  • Load the binary file directly into the vector.
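
The three steps above could be sketched roughly as follows. This is a minimal illustration, not the asker's actual loader: `loadVolume`, its parameters, and the error messages are hypothetical names, and the caller is assumed to have already worked out the header size and the volume size in bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reads `byteCount` bytes, starting `offset` bytes into
// the file at `path`, straight into storage owned by a std::vector.
std::vector<std::uint8_t> loadVolume(const std::string& path,
                                     std::streamoff offset,
                                     std::size_t byteCount)
{
    std::ifstream file(path, std::ios::in | std::ios::binary);
    if (!file) {
        throw std::runtime_error("cannot open " + path);
    }
    file.seekg(offset);  // skip past the header

    // The vector allocates and owns the buffer, so there is no malloc/free
    // pairing to get wrong.
    std::vector<std::uint8_t> volume(byteCount);
    file.read(reinterpret_cast<char*>(volume.data()),
              static_cast<std::streamsize>(byteCount));
    if (static_cast<std::size_t>(file.gcount()) != byteCount) {
        throw std::runtime_error("short read from " + path);
    }
    return volume;
}
```

With the question's variables, a call might look like `auto volume = loadVolume(argv[1], sizeof(header), volumeBytes);`, after which `volume.size()` and `volume.at(i)` behave as expected.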

In a hypothetical case where you cannot touch the allocation part because it is somewhere deep in a library: just don't use a vector. You already have a dynamic array. Iterator-based algorithms work just fine with pointers. For range-based algorithms, you need something like std::span (C++20) or similar.
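
As one possible reading of "or similar", here is a small non-owning view that adds bounds checking on top of an existing buffer without copying it. `VolumeView` and its member names are hypothetical, and the layout (Z planes, each stored as Y rows of X bytes) is an assumption based on the question's description. With C++20 available, `std::span` would cover the per-plane part directly.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical non-owning view over a buffer laid out as zs planes,
// each holding ys rows of xs bytes (row-major layout is assumed).
class VolumeView
{
public:
    VolumeView(const std::uint8_t* data,
               std::size_t xs, std::size_t ys, std::size_t zs)
        : data_(data), xs_(xs), ys_(ys), zs_(zs) {}

    std::size_t size() const { return xs_ * ys_ * zs_; }

    // Bounds-checked voxel access, in the spirit of std::vector::at().
    std::uint8_t at(std::size_t x, std::size_t y, std::size_t z) const
    {
        if (x >= xs_ || y >= ys_ || z >= zs_) {
            throw std::out_of_range("VolumeView::at");
        }
        return data_[(z * ys_ + y) * xs_ + x];
    }

    // Pointer range covering one contiguous XY plane.
    const std::uint8_t* planeBegin(std::size_t z) const
    {
        if (z >= zs_) {
            throw std::out_of_range("VolumeView::planeBegin");
        }
        return data_ + z * xs_ * ys_;
    }
    const std::uint8_t* planeEnd(std::size_t z) const
    {
        return planeBegin(z) + xs_ * ys_;
    }

private:
    const std::uint8_t* data_;  // not owned; the caller keeps it alive
    std::size_t xs_, ys_, zs_;
};
```

The pointer pair plugs into iterator-based algorithms, for example `std::max_element(view.planeBegin(0), view.planeEnd(0))`. XZ and YZ planes are not contiguous under this layout, so they would go through `at()` or a strided iterator instead.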

But using vector for the allocation would be safer and thus better.

If your files are up to 10 GB, then I would suggest trying out memory mapping the file instead. Mapped memory also cannot be used as storage of a vector, so the approach of not using a vector should be taken. Unfortunately though, there is no standard way to memory map files.
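
For illustration only, a read-only mapping on a POSIX system (macOS, Linux) might look like the sketch below. `MappedFile` is a hypothetical wrapper name; the calls it uses (`open`, `fstat`, `mmap`, `munmap`) are POSIX rather than standard C++. Windows would need its own code path, typically built on `CreateFileMapping` and `MapViewOfFile`, which is not shown here.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

#include <fcntl.h>     // open
#include <sys/mman.h>  // mmap, munmap
#include <sys/stat.h>  // fstat
#include <unistd.h>    // close

// Hypothetical RAII wrapper around a read-only POSIX file mapping.
class MappedFile
{
public:
    explicit MappedFile(const std::string& path)
    {
        const int fd = ::open(path.c_str(), O_RDONLY);
        if (fd == -1) {
            throw std::runtime_error("cannot open " + path);
        }

        struct stat info;
        if (::fstat(fd, &info) == -1) {
            ::close(fd);
            throw std::runtime_error("cannot stat " + path);
        }
        size_ = static_cast<std::size_t>(info.st_size);

        // Pages are loaded lazily by the OS as they are touched.
        // Note: mapping an empty file fails and ends up in the throw below.
        void* addr = ::mmap(nullptr, size_, PROT_READ, MAP_PRIVATE, fd, 0);
        ::close(fd);  // the mapping stays valid after the descriptor is closed
        if (addr == MAP_FAILED) {
            throw std::runtime_error("cannot map " + path);
        }
        data_ = static_cast<const std::uint8_t*>(addr);
    }

    ~MappedFile() { ::munmap(const_cast<std::uint8_t*>(data_), size_); }

    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    const std::uint8_t* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    const std::uint8_t* data_ = nullptr;
    std::size_t size_ = 0;
};
```

The voxel data would then start at `mapped.data() + sizeof(header)`, and it can be accessed through a plain pointer or a span-like view rather than a vector.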

eerorika
  • Thank you. I think memory mapped files will be the way this ends up going, especially as newer data files are tending towards the larger sizes. I'll need to investigate how compatible this is - ideally I need to target windows and macos. – David Mar 18 '20 at 16:23