This is very similar to this question, but the answers don't really answer this, so I thought I'd ask again:

Sometimes I interact with functions that return variable-length structures; for example, FSCTL_GET_RETRIEVAL_POINTERS in Windows returns a variably-sized RETRIEVAL_POINTERS_BUFFER structure.

Using malloc/free is discouraged in C++, and so I was wondering:
What is the "proper" way to allocate variable-length buffers in standard C++ (i.e. no Boost, etc.)?

vector<char> is type-unsafe (and doesn't guarantee anything about alignment, if I understand correctly), new doesn't work with custom-sized allocations, and I can't think of a good substitute. Any ideas?

user541686

6 Answers

I would use std::vector<char> buffer(n). There's really no such thing as a variably sized structure in C++, so you have to fake it; throw type safety out the window.
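
A minimal sketch of this approach, under stated assumptions: `RecordHeader`, `fill_record`, and `read_entry` are hypothetical names invented for illustration, standing in for a C-style API that writes a header followed by a variable number of entries. Fields are copied out with `std::memcpy`, which avoids relying on the buffer's alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical fixed-size header that precedes a variable number of entries.
struct RecordHeader {
    unsigned count;
};

// Hypothetical stand-in for a C-style API: writes a header followed by
// `count` ints into caller-supplied storage. Returns the number of bytes
// written, or 0 if the buffer is too small.
inline std::size_t fill_record(void* out, std::size_t capacity, unsigned count) {
    const std::size_t needed = sizeof(RecordHeader) + count * sizeof(int);
    if (capacity < needed) return 0;
    char* bytes = static_cast<char*>(out);
    RecordHeader header = {count};
    std::memcpy(bytes, &header, sizeof header);
    for (unsigned i = 0; i < count; ++i) {
        const int value = static_cast<int>(i) * 10;
        std::memcpy(bytes + sizeof header + i * sizeof value, &value, sizeof value);
    }
    return needed;
}

// Copies entry `index` out of the raw bytes instead of casting the pointer.
inline int read_entry(const std::vector<char>& buffer, unsigned index) {
    int value = 0;
    std::memcpy(&value, buffer.data() + sizeof(RecordHeader) + index * sizeof value,
                sizeof value);
    return value;
}
```

The vector owns the bytes, so nothing needs to be freed by hand; the cost is that every typed access goes through a copy.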

Mark Ransom
  • But then it's not even guaranteed to be e.g. properly aligned, is it? It seems so much hackier than `malloc`... – user541686 Sep 14 '11 at 19:48
  • @Mehrdad, I don't know if `vector` is guaranteed to align its internal buffer but in practice I believe it does on common compilers. You can always use a custom allocator if you need guarantees. – Mark Ransom Sep 14 '11 at 19:51
  • @Mehrdad You're going to be responsible for alignment regardless. C++11 has more tools to help enforce it, though. – Tom Kerr Sep 14 '11 at 19:51
  • @TomK: Uhm, what? `malloc` guarantees that its alignment is correct for any built-in type. `vector` doesn't. I'm not sure what you mean by being responsible for it regardless... I certainly pay no attention to it when using `malloc`. – user541686 Sep 14 '11 at 19:52
  • @Mehrdad: It's aligned correctly. Dynamic allocations are guaranteed to be suitably aligned for any type. – GManNickG Sep 14 '11 at 19:56
  • @GMan: Wait, so `new char[sizeof(long long)]` is guaranteed to be properly aligned for `long long`? *Mind blown* if that's the case... – user541686 Sep 14 '11 at 20:00
  • @Mehrdad, why do you think `new` would offer fewer guarantees than `malloc`? – Mark Ransom Sep 14 '11 at 20:05
  • @Mehrdad: Correct. See jpalecek's answer. (Note that the internal buffer of the vector was allocated with `operator new`, so the alignment requirement still stands.) – GManNickG Sep 14 '11 at 20:08
  • @GMan: the default allocator provides an aligned buffer, because `new` does, but is `vector` guaranteed to put the zeroth element at the start of the buffer it gets from the allocator? What about other containers, for example is it guaranteed that an element of a `list` is `long long`-aligned? That's allocated with `new` too, albeit there's typically an extra `rebind` in there compared with `vector`. Not that you could fit a `long long` in that list element, of course, I'm just wondering what part of the container/allocator interaction guarantees it. – Steve Jessop Sep 14 '11 at 21:00
  • ... and although a `long long` doesn't fit in the `char`, it might fit in the list node as a whole, and what about the elements of a list of some struct of size >= sizeof(long long), are they guaranteed aligned for long long? That is, is it all containers that guarantee their first element's address is offset from an allocated buffer's address by a suitably-aligned distance, or just `vector`? – Steve Jessop Sep 14 '11 at 21:10
  • @Steve: Good point, I can try to remember to do some investigating tonight. – GManNickG Sep 15 '11 at 01:10

I don't see any reason why you can't use std::vector<char>:

{
    std::vector<char> raii(memory_size); // owns the storage; assume memory_size >= sizeof(A)
    char* memory = &raii[0];

    // Now use `memory` wherever you want.
    // For example, construct an object in it with placement new:

    A* pA = new (memory) A(/*...*/);
    pA->fun();
    pA->~A(); // call the destructor explicitly once done!

} // <--- the memory is deallocated here, automatically, when `raii` goes out of scope

Alright, I understand your alignment problem. It's not that complicated. You can do this:

A *pA = new (&memory[i]) A();
//choose `i` such that `&memory[i]` is a multiple of four, or whatever the alignment of `A` requires
//read the comments..
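
One way to make the alignment concern concrete is to test the address before handing it to placement `new`. The sketch below is only an illustration: `A` here is a hypothetical aggregate, and `is_aligned_for` and `construct_in` are invented helper names. It assumes the vector uses the default allocator, whose storage is suitably aligned for fundamental types in practice.

```cpp
#include <cassert>
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical payload type standing in for `A`.
struct A {
    int data[3];
    int sum() const { return data[0] + data[1] + data[2]; }
};

// True when `p` satisfies the alignment requirement of T.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Constructs an A at the start of `storage`, or returns nullptr when the
// storage is too small or not aligned for A.
inline A* construct_in(std::vector<char>& storage) {
    if (storage.size() < sizeof(A) || !is_aligned_for<A>(storage.data())) {
        return nullptr;
    }
    return new (storage.data()) A{{1, 2, 3}};
}
```

Checking first turns a silent misalignment into a detectable failure; the caller still has to invoke `~A()` before the vector goes away.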
Nawaz
  • What are the alignment guarantees? Is the address guaranteed to be aligned correctly? – user541686 Sep 14 '11 at 19:51
  • @Mehrdad: That is the job of placement `new`, and how you want to use it. – Nawaz Sep 14 '11 at 19:51
  • Right, but how do I allocate the memory for placement `new` in the first place? – user541686 Sep 14 '11 at 19:53
  • I'm confused... how exactly does placement `new` handle alignment issues? – user541686 Sep 14 '11 at 19:56
  • @Mehrdad: How does `new` handle? – Nawaz Sep 14 '11 at 19:57
  • `new` is *allocating* the memory, so obviously it can get an aligned address if it wants to. Placement `new` obviously can't... – user541686 Sep 14 '11 at 19:59
  • +1 so long as `raii` stays in scope for duration of use through `memory` – AJG85 Sep 14 '11 at 20:01
  • @Mehrdad: What does `can get an aligned address` mean? Doesn't `sizeof(A)` return the amount of memory required to construct an object of type `A`? Doesn't `sizeof()` handle alignment and all? – Nawaz Sep 14 '11 at 20:01
  • @Nawaz: Argh... if `new` is responsible for allocating memory, then obviously it has the control to do so in an aligned fashion. I don't see how placement `new` could do anything similar... – user541686 Sep 14 '11 at 20:02
  • @Mehrdad: How does `new` know how much memory needs to be allocated? – Nawaz Sep 14 '11 at 20:03
  • @Nawaz: I think you're misunderstanding the issue. The issue isn't about the *internal alignment* of the structure, but of the structure itself. – user541686 Sep 14 '11 at 20:04
  • @Mehrdad: placement new does not allocate; it *places* things in previously allocated memory, which the vector provides for you. – AJG85 Sep 14 '11 at 20:04
  • @Mehrdad: Yes, it seems I'm not getting what you're saying. Can you explain your point? I mean, if I have memory of size >= `sizeof(A)`, then what problem might `new` face while constructing the object? – Nawaz Sep 14 '11 at 20:07
  • @Nawaz: `struct A {int data[3];};` must be aligned on a multiple of `sizeof(int)`, not `sizeof(A)`. If you allocate sizeof(A) bytes, but it doesn't begin on a (for 32 bit machines) byte multiple of four, everything dies. (UB I think) – Mooing Duck Sep 14 '11 at 20:10
  • @Mooing: Why can't we make it begin on a byte multiple of four? Can't we do `new (&memory[i]) A()`, where we choose `i` such that `&memory[i]` is multiple of four? – Nawaz Sep 14 '11 at 20:15
  • @Nawaz: Then you have to keep track of the offset as well, for deletion. Major pain in the rear. – user541686 Sep 14 '11 at 20:56
  • @AJG85: Exactly, so it can't manage the alignment of the memory for you, requiring you to worry about it. That's **exactly** the problem with it. – user541686 Sep 14 '11 at 21:06
  • @Mehrdad: But if you give it an address that's already aligned there's no issue, which is the case here. – GManNickG Sep 15 '11 at 01:10
  • @GMan: The entire point was to avoid manual/ugly work. Obviously, I can call the system functions directly too, but is that a good idea? – user541686 Sep 15 '11 at 03:01
  • @Mehrdad: I'm not sure what you mean by call system functions directly. – GManNickG Sep 15 '11 at 06:27
  • @GMan: Forget it, my point is that this solution isn't practical. – user541686 Sep 15 '11 at 06:34

If you like malloc()/free(), you can use

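char* raw = new char[...appropriate size...];
RETRIEVAL_POINTERS_BUFFER* ptr = reinterpret_cast<RETRIEVAL_POINTERS_BUFFER*>(raw);

... do stuff ...

delete[] raw; // release through the char* that new[] returned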

Quotation from the standard regarding alignment (expr.new/10):

For arrays of char and unsigned char, the difference between the result of the new-expression and the address returned by the allocation function shall be an integral multiple of the strictest fundamental alignment requirement (3.11) of any object type whose size is no greater than the size of the array being created. [ Note: Because allocation functions are assumed to return pointers to storage that is appropriately aligned for objects of any type with fundamental alignment, this constraint on array allocation overhead permits the common idiom of allocating character arrays into which objects of other types will later be placed. — end note ]
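
A rough sketch of the idiom the note describes, with ownership handed to `std::unique_ptr<char[]>` so that `delete[]` runs on the original `char*` automatically. `Header`, `Owned`, and `make_record` are hypothetical names used only for illustration, and the sketch assumes a C++11 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical fixed-size header of a variable-length record.
struct Header {
    unsigned count;
};

// Owns the raw bytes and exposes a typed view of the header at their start.
struct Owned {
    std::unique_ptr<char[]> bytes;  // freed with delete[] on the char array
    Header* header;                 // points into `bytes`; not separately owned
};

// Allocates room for a header plus `count` trailing ints as a char array,
// then constructs the header at the start of that array.
inline Owned make_record(unsigned count) {
    const std::size_t total = sizeof(Header) + count * sizeof(int);
    Owned record;
    record.bytes.reset(new char[total]);
    record.header = new (record.bytes.get()) Header{count};
    return record;
}
```

Keeping the `char*` as the owning pointer avoids deleting the storage through a pointer of a different type.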

jpalecek
  • Ooooh... wow, this is really cool. So the result of `new` is *always* properly aligned for primitive types, no matter what type you allocated it as? – user541686 Sep 14 '11 at 20:01
  • Only for `new char[n]` (or `unsigned char`). – jpalecek Sep 14 '11 at 20:05
  • @Mehrdad: Not just primitive types, *all* types. – GManNickG Sep 14 '11 at 20:08
  • @Mehrdad: that makes the alignment always properly aligned for _all_ types. A class aligns on the same alignment as its member with the strictest alignment; eventually that's a primitive type. – Mooing Duck Sep 14 '11 at 20:08
  • @GMan, Mooing Duck: It can't be all types, because compiler extensions can make alignments of like 64, which it doesn't handle. But yeah, if you disregard that then it works I guess. – user541686 Sep 14 '11 at 20:57
  • @jpalecek: Ah I see; I didn't know that, that's a great solution. Thanks! – user541686 Sep 14 '11 at 20:57
  • @Mehrdad: that's a fudge by implementers, though. The standard quite clearly says "any object type", not "any object type excluding object types that are, or have members, of types provided as compiler extensions". But to avoid wasting a bunch of memory by (for example) 64-aligning everything, implementations just say (when challenged) that all bets are off as soon as your program uses any extension, and their SIMD extensions are implementation-defined to require special measures to ensure alignment. – Steve Jessop Sep 14 '11 at 21:08

You may consider using a memory pool and, in the specific case of the RETRIEVAL_POINTERS_BUFFER structure, allocate pool memory amounts in accordance with its definition:

sizeof(DWORD) + sizeof(LARGE_INTEGER)

plus

ExtentCount * sizeof(Extents)

(I am sure you are more familiar with this data structure than I am -- the above is mostly for future readers of your question).
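
One caveat when turning that sum into code: the compiler may insert padding between members, so adding up `sizeof` of the individual fields can come out smaller than the real offset of the trailing array. A hedged sketch using `offsetof` follows. `RetrievalPointersMirror` and `Extent` are hypothetical stand-ins that mirror the documented layout so the example compiles without `<windows.h>`; real code would apply the same arithmetic to `RETRIEVAL_POINTERS_BUFFER` itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of one extent entry (two 64-bit values).
struct Extent {
    std::int64_t NextVcn;
    std::int64_t Lcn;
};

// Hypothetical mirror of the variable-length structure: a count, a starting
// value, and an array declared with one element but extended at run time.
struct RetrievalPointersMirror {
    std::uint32_t ExtentCount;
    std::int64_t StartingVcn;
    Extent Extents[1];
};

// Bytes needed to hold `extent_count` extents, including any padding that
// precedes the Extents array.
inline std::size_t retrieval_buffer_size(std::size_t extent_count) {
    return offsetof(RetrievalPointersMirror, Extents) + extent_count * sizeof(Extent);
}
```

The result can be passed to whichever allocation strategy is chosen, whether a pool, `new char[]`, or a `std::vector<char>`.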

A memory pool boils down to "allocate a bunch of memory, then hand that memory out in small pieces using your own fast allocator". You can build your own memory pool, but it may be worth looking at Boost's memory pool, which is a header-only library (no DLLs!). Please note that I have not used the Boost memory pool library, but you did mention Boost, so I thought I'd bring it up.

Charles Burns

std::vector<char> is just fine. Typically you can call your low-level c-function with a zero-size argument, so you know how much is needed. Then you solve your alignment problem: just allocate more than you need, and offset the start pointer:

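Say you want the buffer aligned to 4 bytes: allocate the needed size + 4 and advance the start pointer by 4 - (reinterpret_cast<std::uintptr_t>(&my_vect[0]) & 0x3) bytes. (Converting the pointer to an integer avoids subtracting a null pointer, which is undefined behavior.)

Then call your C function with the requested size and the offset pointer.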
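
The over-allocate-and-offset idea can also be sketched with `std::align` from C++11's `<memory>`, which performs the pointer adjustment. This is a swapped-in technique rather than the exact arithmetic above, and `aligned_region` is a hypothetical helper name; `alignment` is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Resizes `storage` so that it can hold `size` bytes at the requested
// alignment, then returns a pointer to the first suitably aligned byte.
// Returns nullptr if std::align cannot fit the region (not expected here,
// because `alignment` spare bytes are reserved).
inline void* aligned_region(std::vector<char>& storage, std::size_t size,
                            std::size_t alignment) {
    storage.resize(size + alignment);
    void* start = storage.data();
    std::size_t space = storage.size();
    return std::align(alignment, size, start, space);
}
```

The returned pointer is only valid until the vector is resized or destroyed, so the vector has to outlive every use of the aligned region.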

Jan

OK, let's start from the beginning. The ideal way to return a variable-length buffer would be:

MyStruct my_func(int a) { MyStruct s; /* magic here */ return s; }

Unfortunately, this does not work, because sizeof(MyStruct) is calculated at compile time. Anything variable-length simply does not fit inside a buffer whose size is fixed at compile time. Note that this applies to every variable and type in C++, since they all support sizeof. C++ has just one thing that can handle buffer sizes chosen at run time:

MyStruct *ptr = new MyStruct[count];

So anything that solves this problem is necessarily going to use the array version of new. This includes std::vector and the other solutions proposed earlier. Notice that tricks like placement new into a char array have exactly the same problem with sizeof. Variable-length buffers simply need the heap and arrays. There is no way around that restriction if you want to stay within C++. Furthermore, it requires more than one object! This is important. You cannot make a single variable-length object in C++. It's just impossible.

The nearest thing to a variable-length object that C++ provides is "jumping from type to type". Not every object needs to be of the same type, and you can manipulate objects of different types at run time. But each part and each complete object still supports sizeof, and their sizes are determined at compile time. The only thing left to the programmer is choosing which type to use.

So what's our solution to the problem? How do you create variable-length objects? std::string provides the answer. It needs to hold more than one character and uses the array form of heap allocation, but this is all handled by the standard library, so the programmer does not need to care. Then you'll have a class that manipulates those std::strings. std::string can do it because it actually uses two separate memory areas. sizeof(std::string) gives the size of a memory block that can be calculated at compile time, but the actual variable-length data lives in a separate memory block allocated by the array version of new.

The array version of new has some restrictions of its own: sizeof(a[0]) == sizeof(a[1]), and so on. Allocating an array first and then using placement new to construct several objects of different types in it gets around this limitation.
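
As a rough illustration of that last point, the sketch below places a header and then several items of a different type into one char buffer. `Header`, `Item`, `Layout`, `items_offset`, and `build` are hypothetical names; the sketch assumes the vector's default allocator returns storage aligned for both types, and it owns the bytes through `std::vector<char>` rather than a raw `new[]`.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Two hypothetical types of different sizes sharing one allocation.
struct Header {
    unsigned count;
};
struct Item {
    double value;
};

// Offset of the first Item: sizeof(Header) rounded up to Item's alignment.
inline std::size_t items_offset() {
    return (sizeof(Header) + alignof(Item) - 1) / alignof(Item) * alignof(Item);
}

// Typed views into the shared buffer; the buffer itself stays the owner.
struct Layout {
    Header* header;
    Item* items;
};

// Sizes `storage` for one Header plus `count` Items and constructs each
// object in place with placement new.
inline Layout build(std::vector<char>& storage, unsigned count) {
    storage.resize(items_offset() + count * sizeof(Item));
    Layout out;
    out.header = new (storage.data()) Header{count};
    char* first = storage.data() + items_offset();
    out.items = reinterpret_cast<Item*>(first);
    for (unsigned i = 0; i < count; ++i) {
        new (first + i * sizeof(Item)) Item{i * 1.5};
    }
    return out;
}
```

Both types here are trivially destructible, so no explicit destructor calls are needed; types with non-trivial destructors would have to be destroyed by hand before the buffer is released.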

tp1