3

My goal is to allocate a single chunk of memory and then partition it into smaller arrays of different types. I have a few questions about the code I've written here:

#include <iostream>
#include <cstdint>
#include <cstdlib>

int main() {
    constexpr std::size_t array_count   = 5;
    constexpr std::size_t elements_size = sizeof(std::uint32_t) + sizeof(std::uint16_t);

    void* const pv = std::calloc(array_count, elements_size);

    //Partition the memory. p32 array starts at pv, p16 array starts after the 20 byte buffer for the p32 array.
    std::uint32_t* const p32 = (std::uint32_t *) pv;
    std::uint16_t* const p16 = (std::uint16_t *)((char *) pv + sizeof(std::uint32_t) * array_count);

    //Initialize values.
    for(std::size_t i = 0; i < array_count; ++i) {
        p32[i] = i;
        p16[i] = i * 2;
    }

    //Read them back.
    for(std::size_t i = 0; i < array_count; ++i) {
        std::cout << p32[i] << std::endl;
        std::cout << p16[i] << std::endl;
        std::cout << std::endl;
    }

    std::free(pv);
}
  1. Does this code violate c++'s strict aliasing rules? I'm having trouble finding resources on aliasing when casting pointers from a malloc or calloc call. The p32 and p16 pointers should never overlap.
  2. If I reverse the positioning of the two arrays where p16 started at pv, and p32 had a 10 byte offset from pv this could cause a segfault because uint32_t is aligned to the 4 byte boundary pv + 10 could be on the 2 byte boundary, right?
  3. Is this program unsafe, or introduce any undefined behavior that I'm missing in general? I get the expected output on my local machine, but of course that doesn't mean my code is correct.
kocica
  • 6,412
  • 2
  • 14
  • 35
Brian
  • 249
  • 2
  • 10

4 Answers4

4

Yes, the program is UB. When you do this:

for(std::size_t i = 0; i < array_count; ++i) {
    p32[i] = i;
    p16[i] = i * 2;
}

There are no uint32_t or uint16_t objects that p32 or p16 point to. calloc just gives you bytes, not objects. You can't just reinterpret_cast objects into existence. On top of that, indexing is only defined for arrays, and p32 does not point to an array.

To make it well defined, you'd have to create an array object. However, placement-new for arrays is broken, so you're left with manually initializing a bunch of uint32_ts like:

auto p32 = reinterpret_cast<uint32_t*>(pv);
for (int i = 0; i < array_count; ++i) {
    new (p32+i) uint32_t; // NB: this does no initialization, but it does satisfy
                          // [intro.object] in actually creating an object
}

This would then run into a separate issue: CWG 2182. Now we have array_count uint32_ts, but we don't have a uint32_t[array_count] so indexing is still UB. Basically, there's just no way in purely-by-the-letter-of-the-standard C++ to write this. See also my similar question on the topic.


That said, the amount of code that does this in the wild is tremendous and every implementation will allow you to do it.

Barry
  • 286,269
  • 29
  • 621
  • 977
  • After you get `n` bytes from `calloc` you can reinterpret to whatever you want. Its the normal handling of dynamicall memory i think. While you arent touching behind borders of that allocated memory you cant get UB. – kocica Aug 10 '17 at 15:10
  • @FilipKočica "Normal handling" doesn't mean it's not UB. It's still UB. There's no object. – Barry Aug 10 '17 at 15:23
  • @Deduplicator Forgot it was a void pointer, fixed that part. For the rest - well, look, like I said in the answer, every implementation will allow you to do it and allow it to just work. But it's still UB. The fact that we don't have a good way of making it non-UB is a separate problem. – Barry Aug 10 '17 at 15:25
  • @Deduplicator I don't think the __Type aliasing__ section in here allows it: http://en.cppreference.com/w/cpp/language/reinterpret_cast – Richard Critten Aug 10 '17 at 15:36
  • @Deduplicator He doesn't have *an object* at all. The entire C++ object model is based on objects - aliasing is defined based on objects. [basic.lval/8](http://eel.is/c++draft/basic.lval#8) defines that we can read through a bunch of types that are all based on the type of the object - we don't have an object, there is no such type. – Barry Aug 10 '17 at 15:48
  • 2
    Yup. Quoting `C++14 basic.life`: *"The lifetime of an object of type T begins when: — storage with the proper alignment and size for type T is obtained, and — if the object has non-trivial initialization, its initialization is complete."*. Thus to me it seems that OP's code is well-defined. – HolyBlackCat Aug 10 '17 at 16:18
  • @HolyBlackCat See above as well. We don't even have an object, so we can't even talk about when its lifetime begins. – Barry Aug 10 '17 at 16:24
  • Do you know what array-placement-new does? https://stackoverflow.com/questions/15254/can-placement-new-for-arrays-be-used-in-a-portable-way – Deduplicator Aug 10 '17 at 16:35
  • @Deduplicator There's really no conflict. Lifetime is a property of an object, you need an object to have lifetime. – Barry Aug 10 '17 at 16:36
  • @Deduplicator Otherwise, `std::aligned_storage_t<100> x;` would begin the lifetime of an infinite amount of objects. You *need* one of the other things in intro.object to happen first. Anyway, you can accept this or not but this is what the standard says. You can also see [this thread](https://groups.google.com/a/isocpp.org/d/msg/std-discussion/p4BXNhTHY7U/Qm5Z4_K-AgAJ). I'm done here tho. – Barry Aug 10 '17 at 16:45
  • Withdrawing my objection and removed comments, but the possible bookkeeping of placement-array-new invalidating your fix is still not handled: [Can placement new for arrays be used in a portable way?](https://stackoverflow.com/questions/15254/can-placement-new-for-arrays-be-used-in-a-portable-way) – Deduplicator Aug 10 '17 at 17:24
  • 1
    @Deduplicator Yep, handled that, and then added yet another ridiculous case that this runs into. – Barry Aug 10 '17 at 17:29
  • 1
    It's turtles all the way down. Argh! – Deduplicator Aug 10 '17 at 17:39
  • EWG direction is that non-allocating placement array-new should not have overhead see (EWG90/CWG476). Looks like it has fallen through the cracks since then. – T.C. Aug 10 '17 at 18:04
  • The array-placement-new isn't *portable*, but on a given system it can be *defined* how much extra space it requires as part of the traits of the type the array is over. In the case of MSVC, it allocates a count before the array if the object has a non-trivial destructor if I remember rightly, and doesn't if not. This may be preferred to doing UB. – Yakk - Adam Nevraumont Aug 10 '17 at 23:27
  • @Deduplicator: Someone needs to bonk the C and C++ Committees and compiler writers on the head until they recognize that if a simple non-optimizing compiler would naturally support a useful construct, and many have consistently done so, that construct should be officially recognized, and should not be deprecated unless or until alternatives exist that are at least as good. Unfortunately, C and C++ both seem to have gotten stuck in a rut where the fact that many implementations supported behaviors without the Standard mandating such support meant there was no need to add such support... – supercat Feb 07 '18 at 21:15
  • ...to the Standard, but the fact that the Standard never mandated such support means compiler writers feel free to unilaterally withdraw it without having to provide any usable alternative. – supercat Feb 07 '18 at 21:18
  • @supercat That somebody could be you? Anyway, this is a [relevant proposal](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0593r1.html). – Barry Feb 07 '18 at 21:58
2

I am only going to address Strict Aliasing part of the question.

C++ standard talks very little about malloc - mostly mentions it has semantic defined in C. In strict reading of C++ standard, there is no aliasing rule violation because there is no object which is aliased - in C++, lifetime of the object begins after it has been constructed, and no object has been constructed by malloc call.

As a result, this is something which is simply unspecified by Standard (as opposed by undefined).

SergeyA
  • 61,605
  • 5
  • 78
  • 137
0
  1. No it's just normal cast of malloc return value and handling allocated memory. You can reinterpret these bytes to whatever datatype you want. Until you touch memory behind borders of allocated block.

  2. No

  3. I know you are just asking and trying, but you should allocate memory typically.

    std::uint32_t* const p32 = std::calloc(array_count, sizeof(std::uint32_t));

    std::uint16_t* const p16 = std::calloc(array_count, sizeof(std::uint16_t));

kocica
  • 6,412
  • 2
  • 14
  • 35
  • I don't follow your answer to point 2. Why should he touch 2B behind allocated block? And calling every use of pointers unsafe is kindof overkill. Also, coalescing allocations can be an important aspect of writing efficient code, as allocation is expensive. – Deduplicator Aug 10 '17 at 15:23
  • 1
    No, he said swap the order of the two sub-blocks. – Deduplicator Aug 10 '17 at 15:26
0

Is this program unsafe?

In your example, I think there is no problem with exceptions at all, but I suppose you intend to use it in another context. If your code throws an exception, you probably are going to leak some memory.

Do you really need to use calloc/free? In the cppcore guidelines you can find some guidelines about resource management in C++: https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#S-resource

For instance, R10 and R11 say "not to call explicitly malloc/calloc/free, new/delete": https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rr-mallocfree

Only if you are doing something that really needs very low level code, you need to call new and delete explicitly, but the C++ way is to encapsulate all this calls inside a class, so the user of the class doesn't need to call explicitly new and delete.

But if you are not doing some low level stuff, have you consider to use std::array<>? or std::vector?

Antonio
  • 579
  • 1
  • 3
  • 12