1

I have a data array with 2*N ints, representing pairs, that is for even i=0,2,4,...,2*N (pairs[i], pairs[i+1]) is such a pair. The data is formatted like this because I use Matlab's mex library. I do:

int N=5;
int data[10] = {1,2,3,4,5,6,7,8,9,10};
struct Pair { int first; int second; };
Pair * pairs = (Pair *)data;

but the problem would be that there is no way to guarantee that Pair aligns as two sizeof(ints) in order first, second. See: Is the member field order of a class "stable"?

I don't want to process and copy all data into a new array, since it should not be necessary, and I need (as far as I can see) to use

typedef int Pair[2];

to be sure that it aligns correctly (no trailing garbage bytes, etc). if I then want to sort the pairs according to the first element, I could do:

#include <iostream>
#include <algorithm>

typedef int Pair[2];

int compare(Pair n1, Pair n2) { return n1[0] < n2[0]; }

int main() {
    int N=5;
    int data[10] = {1,2, 7,8, 13,14, 4,5, 10,11};
    Pair *pairs  = (Pair *)((void *)data); 

    std::cout << "unsorted" << std::endl;
    for(int i=0; i<N;++i) std::cout << i << ": (" << pairs[i][0] << ", " << pairs[i][1] << ")" << std::endl;

    std::sort(data, data+N, compare);

    std::cout << "sorted" << std::endl;
    for(int i=0; i<N;++i) std::cout << i << ": (" << pairs[i][0] << ", " << pairs[i][1] << ")" << std::endl;

    return 0;
}

see: http://ideone.com/VyBUvc

I could summarize the error message as error: array must be initialized with a brace-enclosed initializer, see below for the complete message. It is caused by the std::sort call.

I wrapped the Pair typedef in a union here ( http://ideone.com/TVmEeZ ), and that seems to work. Why does c++ (or std::sort) not see int[2] in a similar way as a union?

Complete compiler output:

In file included from /usr/include/c++/4.8/bits/stl_pair.h:59:0,
                                from /usr/include/c++/4.8/bits/stl_algobase.h:64,
                                from /usr/include/c++/4.8/bits/char_traits.h:39,
                                from /usr/include/c++/4.8/ios:40,
                                from /usr/include/c++/4.8/ostream:38,
                                from /usr/include/c++/4.8/iostream:39,
                                from prog.cpp:1:
/usr/include/c++/4.8/bits/stl_algo.h: In instantiation of ‘void std::__insertion_sort(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = int (*)[2]; _Compare = bool (*)(int*, int*)]’:
/usr/include/c++/4.8/bits/stl_algo.h:2250:70:   required from ‘void std::__final_insertion_sort(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = int (*)[2]; _Compare = bool (*)(int*, int*)]’
/usr/include/c++/4.8/bits/stl_algo.h:5514:55:   required from ‘void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = int (*)[2]; _Compare = bool (*)(int*, int*)]’
prog.cpp:16:35:   required from here
/usr/include/c++/4.8/bits/stl_algo.h:2186:11: error: array must be initialized with a brace-enclosed initializer
    __val = _GLIBCXX_MOVE(*__i);
                    ^
In file included from /usr/include/c++/4.8/algorithm:62:0,
                                from prog.cpp:2:
/usr/include/c++/4.8/bits/stl_algo.h:2188:17: error: invalid array assignment
                *__first = _GLIBCXX_MOVE(__val);
                                ^
/usr/include/c++/4.8/bits/stl_algo.h: In instantiation of ‘_RandomAccessIterator std::__unguarded_partition(_RandomAccessIterator, _RandomAccessIterator, const _Tp&, _Compare) [with _RandomAccessIterator = int (*)[2]; _Tp = int [2]; _Compare = bool (*)(int*, int*)]’:
/usr/include/c++/4.8/bits/stl_algo.h:2319:78:   required from ‘_RandomAccessIterator std::__unguarded_partition_pivot(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = int (*)[2]; _Compare = bool (*)(int*, int*)]’
/usr/include/c++/4.8/bits/stl_algo.h:2360:62:   required from ‘void std::__introsort_loop(_RandomAccessIterator, _RandomAccessIterator, _Size, _Compare) [with _RandomAccessIterator = int (*)[2]; _Size = int; _Compare = bool (*)(int*, int*)]’
/usr/include/c++/4.8/bits/stl_algo.h:5513:44:   required from ‘void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = int (*)[2]; _Compare = bool (*)(int*, int*)]’
prog.cpp:16:35:   required from here
/usr/include/c++/4.8/bits/stl_algo.h:2287:35: error: invalid conversion from ‘const int*’ to ‘int*’ [-fpermissive]
        while (__comp(*__first, __pivot))
                                                                    ^
/usr/include/c++/4.8/bits/stl_algo.h:2290:34: error: invalid conversion from ‘const int*’ to ‘int*’ [-fpermissive]
        while (__comp(__pivot, *__last))
                                                                    ^
In file included from /usr/include/c++/4.8/bits/stl_pair.h:59:0,
                                from /usr/include/c++/4.8/bits/stl_algobase.h:64,
                                from /usr/include/c++/4.8/bits/char_traits.h:39,
                                from /usr/include/c++/4.8/ios:40,
                                from /usr/include/c++/4.8/ostream:38,
                                from /usr/include/c++/4.8/iostream:39,
                                from prog.cpp:1:
/usr/include/c++/4.8/bits/stl_heap.h: In instantiation of ‘void std::make_heap(_RAIter, _RAIter, _Compare) [with _RAIter = int (*)[2]; _Compare = bool (*)(int*, int*)]’:
/usr/include/c++/4.8/bits/stl_algo.h:1970:47:   required from ‘void std::__heap_select(_RandomAccessIterator, _RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = int (*)[2]; _Compare = bool (*)(int*, int*)]’
/usr/include/c++/4.8/bits/stl_algo.h:5363:59:   required from ‘void std::partial_sort(_RAIter, _RAIter, _RAIter, _Compare) [with _RAIter = int (*)[2]; _Compare = bool (*)(int*, int*)]’
/usr/include/c++/4.8/bits/stl_algo.h:2355:68:   required from ‘void std::__introsort_loop(_RandomAccessIterator, _RandomAccessIterator, _Size, _Compare) [with _RandomAccessIterator = int (*)[2]; _Size = int; _Compare = bool (*)(int*, int*)]’
/usr/include/c++/4.8/bits/stl_algo.h:5513:44:   required from ‘void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = int (*)[2]; _Compare = bool (*)(int*, int*)]’
prog.cpp:16:35:   required from here
/usr/include/c++/4.8/bits/stl_heap.h:446:25: error: array must be initialized with a brace-enclosed initializer
        _ValueType __value = _GLIBCXX_MOVE(*(__first + __parent));
                                                ^
/usr/include/c++/4.8/bits/stl_heap.h: In instantiation of ‘void std::__pop_heap(_RandomAccessIterator, _RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = int (*)[2]; _Compare = bool (*)(int*, int*)]’:
/usr/include/c++/4.8/bits/stl_algo.h:1973:50:   required from ‘void std::__heap_select(_RandomAccessIterator, _RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = int (*)[2]; _Compare = bool (*)(int*, int*)]’
/usr/include/c++/4.8/bits/stl_algo.h:5363:59:   required from ‘void std::partial_sort(_RAIter, _RAIter, _RAIter, _Compare) [with _RAIter = int (*)[2]; _Compare = bool (*)(int*, int*)]’
/usr/include/c++/4.8/bits/stl_algo.h:2355:68:   required from ‘void std::__introsort_loop(_RandomAccessIterator, _RandomAccessIterator, _Size, _Compare) [with _RandomAccessIterator = int (*)[2]; _Size = int; _Compare = bool (*)(int*, int*)]’
/usr/include/c++/4.8/bits/stl_algo.h:5513:44:   required from ‘void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = int (*)[2]; _Compare = bool (*)(int*, int*)]’
prog.cpp:16:35:   required from here
/usr/include/c++/4.8/bits/stl_heap.h:339:28: error: array must be initialized with a brace-enclosed initializer
            _ValueType __value = _GLIBCXX_MOVE(*__result);
                                                        ^
In file included from /usr/include/c++/4.8/bits/stl_algo.h:61:0,
                                from /usr/include/c++/4.8/algorithm:62,
                                from prog.cpp:2:
/usr/include/c++/4.8/bits/stl_heap.h:340:17: error: invalid array assignment
            *__result = _GLIBCXX_MOVE(*__first);
                                ^
Community
  • 1
  • 1
Herbert
  • 5,279
  • 5
  • 44
  • 69
  • Logically, you should be able to do this using a facade pattern for either the iterator or the vector. Practically, however, the standard seems to still impose enough restrictions that you'll end up with undefined behavior (even though it will almost certainly work). – James Kanze May 06 '14 at 10:37
  • `struct Pair { int first; int second; };` is guaranteed to have items in that order and no initial padding, IDK why you think otherwise. To check that there is no padding, test that `sizeof(Pair) == 2*sizeof(int);`, which should be true on any sane implementation. (If it's not then you can cross that bridge when you come to it). – M.M May 06 '14 at 10:48
  • @MattMcNabb, No, not really. As far as I know, compilers typically do align. For example struct{char, int} would typically allocate 4 bytes for the char, since RAM memory can be transported to CPU memory in modulo-4 byte addresses on most machines. Members are thus aligned to me at modulo-4 addresses, to prevent the need of two RAM-CPU operations to transport one int. Of course, ints are typically 4 bytes and hence chances are that 2 ints are aligned in a sensible way. Still, there is no guarantee :) – Herbert May 06 '14 at 10:55
  • @Herbert we are talking about two ints, not a char and an int. I'm saying that you can do a `sizeof` check to see if your compiler inserted padding or not (and other tests to check that your int array is correctly aligned for Pair, I guess) and if it didn't then your code will work. – M.M May 06 '14 at 10:58
  • And if the checks do find misalignment, what should I do in that branch? – Herbert May 07 '14 at 18:54

2 Answers2

1
std::sort(data, data+N, compare);

You are sorting data, not pairs. That said, your new approach is still undefined behaviour, and thus not guaranteed to work1. You are essentially trying to fit a square peg into a round hole. If you want to use std::sort, present valid data – which means copying in your case, or writing a custom iterator which treats an array as a collection of consecutive pairs.


1 That’s a humungous understatement. – Do not do this.

Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214
  • 1
    @Herbert Because it’s simply invalid C++. The cast is illegal, it violates the [strict aliasing rule](http://stackoverflow.com/q/98650/1968). – Konrad Rudolph May 06 '14 at 10:13
  • Sorry, where does it do that? Typedefs do not introduce new types; all access is still via expressions of type `int`. And you may cast from any type to `void*` and back. – MSalters May 06 '14 at 10:38
  • @MSalters Are `int[2]` and `int[10]` not distinct types for the sake of strict aliasing? I had assumed so, although I admit that it would make sense if that weren’t the case. – Konrad Rudolph May 06 '14 at 10:41
  • @MSalters If it's not undefined behavior, it certainly doesn't do what the OP requested. If `sort` only used `iter_swap`, you could design an iterator which would do the trick, but I don't think that this is guaranteed. – James Kanze May 06 '14 at 10:43
  • @KonradRudolph: You "alias" an `int[10]` with a `int[1]` all the time. – MSalters May 06 '14 at 10:50
  • @JamesKanze: I agree, the problem I see is that there's no `std::swap`. – MSalters May 06 '14 at 10:51
  • @MSalters Do I? Where? Are you talking about array to pointer decay? Because that’s very different … – Konrad Rudolph May 06 '14 at 11:02
  • The idea was that int[2] aligns well with int[2*N] and that union has to have the size of it's largest member, hence not allowing it to do alignment magic. But I'm not sure whether sizeof(struct) == max_{m<-members} sizeof(m) . – Herbert May 06 '14 at 11:06
  • @Herbert I still think that having a custom iterator is the way to go. It’s valid, and it won’t incur any unnecessary copies if written carefully. Unfortunately, implementing such a random iterator is also quite a bit of work. C++ does not help you one bit here. :-( – Konrad Rudolph May 06 '14 at 11:35
  • @MSalters The problem I see is that `std::sort` will not even access the final elements of `data`; and the fact that it will do things like `compare( data, data + 1 )`, which results in overlapping `pair` may cause problems as well. A more interesting possibility would be to use a facade iterator, with a value type of `std::tuple`. But the way the standard is written, facade iterators like that will result in undefined behavior. – James Kanze May 06 '14 at 12:17
  • @KonradRudolph It especially doesn't help in that it requires `*iter` to return an actual reference to the element in the sequence. – James Kanze May 06 '14 at 12:18
  • @James Where does UB come in here? `facade a{data}, b{a + size}; sort(a, b);` should work, you just need to provide a proxy `facade::value_type` which transparently refers to consecutive elements in the underlying array. – Konrad Rudolph May 06 '14 at 13:09
  • @KonradRudolph I wish it were. The standard says that the functions which return a reference must refer to an element in the underlying sequence. An algorithm which does `&*iter`, and stores the results in a raw pointer, is legal. (I can't imagine `std::sort` doing this, and I suspect that with a sufficiently astute `facade::value_type`, it would work; it may be that the results of `std::tie` is sufficiently astute, but I've not had time to verify it.) – James Kanze May 06 '14 at 14:46
  • @James This seems a “bug” in the standard to me: Why else would `iterator_traits<>::reference` exist? Do you have a link to some kind of explanation for this? – Konrad Rudolph May 06 '14 at 15:13
  • @KonradRudolph I would tend to agree with you. I'd actually like to see iterators allowed to use proxy types instead of the `value_type&` itself. And as for `iterator_traits<>::reference`: that's another can of worms; since you can't overload the `.` operator, you can't make a smart reference, like you can a smart pointer. – James Kanze May 06 '14 at 17:05
  • The question remaining, is a single-member union allowed to append garbage bytes or align its member strangely? – Herbert May 07 '14 at 13:10
-1

Exchanging your array of two int for a std::pair<int,int> did the trick for me (live at ideone):

#include <iostream>
#include <algorithm>
#include <memory>

typedef std::pair<int,int> Pair;

bool compare(const Pair& i, const Pair& j) { return i.first < j.first; }

int main() {
    const int N=5;
    Pair pairs[N] = {{1,2}, {7,8}, {13,14}, {4,5}, {10,11}};

    std::cout << "unsorted" << std::endl;
    for(int i=0; i<N;++i) std::cout << i << ": (" << pairs[i].first << ", " << pairs[i].second << ")" << std::endl;

    std::sort(pairs, pairs+N, compare);

    std::cout << "sorted" << std::endl;
    for(int i=0; i<N;++i) std::cout << i << ": (" << pairs[i].first << ", " << pairs[i].second << ")" << std::endl;
}

An alternative would be encapsulating the array of two int inside a struct. The problem in your code is that std::sort need an array of comparable (you fixed it with your compare function) and copy-or-move-assignable items (arrays are neither)

Maybe even better (less changes to your code) would be using std::array:

#include <iostream>
#include <algorithm>
#include <memory>

typedef std::array<int, 2> Pair;

bool compare(const Pair& i, const Pair& j) { return i[0] < j[0]; }

int main() {
    const int N=5;
    Pair pairs[N] = {1,2, 7,8, 13,14, 4,5, 10,11};

    std::cout << "unsorted" << std::endl;
    for(int i=0; i<N;++i) std::cout << i << ": (" << pairs[i][0] << ", " << pairs[i][1] << ")" << std::endl;

    std::sort(pairs, pairs+N, compare);

    std::cout << "sorted" << std::endl;
    for(int i=0; i<N;++i) std::cout << i << ": (" << pairs[i][0] << ", " << pairs[i][1] << ")" << std::endl;
}
Massa
  • 8,647
  • 2
  • 25
  • 26
  • Yes, that is the problem. std::array and std::pair do not guarantee to align the same way as a plain int array, since compilers are very liberated to align data as they see fit. for example we could have an array |0: 4b|1: 4b| ...| of ints allocated using 4 bytes and an array of pairs allocated like |0: 9b|1: 9b...|, since the compiler decides to introduce a garbage byte (often this speeds up RAM-CPU instructions). Hence both will not align ;), although 4byte is a very common word size and your suggestion *might* just work 99% of the time. – Herbert May 06 '14 at 11:00
  • I don't get it. What exactly is undefined behaviour in my solution? (I just threw away with the `data` array for two, well-defined, different solutions...) – Massa May 06 '14 at 11:40
  • and, some time later, I got that the OP did want to type-erase the `data` vector... – Massa May 06 '14 at 13:34
  • The problem is that there is no way of telling that the compiler correctly aligns (in memory) the std::pair's and std::array's with an integer-array. – Herbert May 07 '14 at 12:14
  • @Herbert I would bet it will more often than not, but if I had to do it, I would stap a big `/* THIS IS A DESPERATE OPTIMIZATION */` comment on it. :D – Massa May 07 '14 at 12:16
  • @Massa Agreed, what annoys me however is that C++ is a low level language where forced alignment should be allowed, at least, imho ;) – Herbert May 07 '14 at 13:08