9
int* a = new int[5] - 1;

This line by itself invokes undefined behavior according to the C++ standard, because a points neither to an element of the array nor one past its end. At the same time it is a zero-overhead way of making a 1-based array (the first element is a[1]), which I need for a project of mine.
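
For concreteness, here is a sketch of the intended usage (the delete[] call is illustrative of how the allocation would be released; it is not stated above):

int* a = new int[5] - 1;   // a[1] .. a[5] name the five elements
a[1] = 10;                 // first element
a[5] = 50;                 // last element
delete[] (a + 1);          // delete[] must receive the pointer that new[] actually returned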

I'm wondering if this is something that I need to avoid or if the C++ standard is just being conservative to support some bizarre architectures that my code is never going to run on anyway. So the question is, on what architectures will this be a problem? Are any of those widespread?

Edit: To see that the line above does indeed invoke undefined behavior, take a look at this question.

Edit: Dennis Zickefoose points out that compilers are allowed to do anything when undefined behavior is invoked, so both the compiler and the CPU have to offer guarantees beyond the C++ standard for code like this to work. I'm expanding the question to whether any modern C++ compilers have this issue.

Bjarke H. Roune
  • Calculating is never unsafe. Dereferencing can be. – Ignacio Vazquez-Abrams Jul 03 '11 at 01:45
  • @Ignacio Vazquez-Abrams Not true. It is for example allowed for the CPU to have special pointer registers that will issue errors if you load certain invalid pointer values into them. – Bjarke H. Roune Jul 03 '11 at 01:50
  • Ignacio's comment should be posted as an answer and accepted. – David Grayson Jul 03 '11 at 01:50
  • Bjarke: If you tell us what architectures you are talking about then that would make a good answer to the question. – David Grayson Jul 03 '11 at 01:51
  • You already took care of the "is never going to run on anyway" clause. Nobody can really guess what the next 30 years have in store. Obligatory xkcd link: http://xkcd.com/292/ – Hans Passant Jul 03 '11 at 01:52
  • I only know that this can be a problem somewhere because I've heard it as a rationale for having the above code invoke undefined behavior. I don't know on what systems it occurs and I don't know how widespread those systems are. That is why I'm asking. :) – Bjarke H. Roune Jul 03 '11 at 01:55
  • @Mitch Wheat What I'm talking about is any issue that would make the code line I posted blow up due to calculating an invalid pointer. The only reason for that happening that I know of is pointer registers that object to certain values, but for all I know there could be other reasons too. – Bjarke H. Roune Jul 03 '11 at 02:03
  • Technically, as undefined behavior, even if the hardware won't error, the compiler is allowed to generate incorrect code if it notices you doing it. And some compilers do consider undefined behavior in their analysis for optimization purposes. Given your specific case, I'm not sure that's possible [`new T[5] - 1` could very well be a previously allocated `T` object, in which case you're okay], but in other cases it could blow up on you that way without hardware support. – Dennis Zickefoose Jul 03 '11 at 04:15
  • I would profile your heap with 0-based and your 1-based array scheme and see if there's any measurable difference. If there isn't (and surely there isn't, since memory-access will likely dominate), then the whole question of depending on undefined behavior isn't even worth considering. –  Jul 03 '11 at 09:33
  • @Ignacio @David: Please see my answer to this question for the relevant section of the standard. This behavior really is undefined. – Nemo Jul 04 '11 at 03:18

6 Answers

5

Either way, a well-defined, zero-overhead way of creating a one-based array is the following:

int* a = new int[6];
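// a[0] is simply never used; a[1] .. a[5] serve as the one-based elements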

There, problem solved. ;-) (But interesting question still.)

Konrad Rudolph
  • Zero-overhead? I think not. You've allocated **more** memory than really needed. – valdo Jul 03 '11 at 08:45
  • @valdo One element. So what? Bjarke’s answer also needs to compute one additional address once, so from your perspective is that also not zero-overhead? – Konrad Rudolph Jul 03 '11 at 08:47
  • Of course. Moreover, it depends on the exact problem definition: do you need to minimize your memory utilization or CPU cycles, and what do you mean by the *overhead*. But before knowing this you **may not** claim that wasting memory is *zero overhead*. Plus, this does not answer the question that the author asked (what are the exact architectures...). – valdo Jul 03 '11 at 13:03
  • Wouldn't that be a -1 based array? The first element is at a[-1]. If you leave out the second line, then you've got a 5 element 1-based array if you just don't use the first entry. I think that's what you meant - it is a good work-around. I actually asked this question because I wanted to avoid doing that, but it seems to be the better way. – Bjarke H. Roune Jul 04 '11 at 05:07
4

The hardware for doing the checks is present in all x86 processors; we are just not using it at the moment in the most popular operating systems.

If you use a segmented memory architecture, which we did for 16-bit systems, an allocation may well return the address segment:0. In that case you just cannot subtract anything from that address!

Here is a starting point for reading about segmented memory and why loading an invalid segment is not possible:

http://en.wikipedia.org/wiki/Segment_descriptor

You have to decide if this is unlikely to happen for your code, or if you can perhaps define an overloaded operator[] that handles the offset for you.
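
As a rough sketch of that second option (the class name and the use of std::vector here are illustrative, not something given in this answer):

#include <cstddef>
#include <vector>

// One-based wrapper: operator[] applies the offset internally, so no
// out-of-range pointer is ever formed.
template <typename T>
class OneBasedVector {
public:
    explicit OneBasedVector(std::size_t n) : data_(n) {}
    T& operator[](std::size_t i) { return data_[i - 1]; }              // valid i: 1 .. n
    const T& operator[](std::size_t i) const { return data_[i - 1]; }
    std::size_t size() const { return data_.size(); }
private:
    std::vector<T> data_;
};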

Bo Persson
  • Not quite agree with this. IMHO there's a chance that subtraction from `segment:0` will work consistently with the consequent additions. I.e. the offset will wrap-around in both subtraction and addition. Plus I've never heard of checking the pointer validity implicitly (without dereferencing or other special instructions like `probe`) on x86. – valdo Jul 03 '11 at 08:43
  • It's just an example, but from current hardware. If I had brought up the 68000 someone would have said that they are not used anymore. The language says that the implementation *can* test the validity, and the hardware to do it is present. If we have *slightly undefined behaviour* :-) it can become *implementation defined*, and then it is ok. If not, at least there is a potential portability problem. – Bo Persson Jul 03 '11 at 09:16
  • Well that settles it, if the range of CPUs that *could* do this includes "all x86 CPUs" then I'm just going to have to use an ordinary 0-based array and not use the first entry. Thanks. – Bjarke H. Roune Jul 04 '11 at 05:01
  • @valdo : the subtraction results in `segment-1:14` (`int` is 2 bytes on 16 bit systems). A later addition will result in `segment-1:16`, which would be the same linear address. However, the problem is that `segment-1` might not be a valid segment. – MSalters Jul 04 '11 at 11:35
3

Note: I am not going to answer your question. But judging from the comments, several experienced SO members do not seem to know that this behavior actually is undefined... So somewhere in here we ought to quote chapter and verse in some standard.

So...

From the C++0x standard (draft), section 5.7 (5), discussing the behavior of pointer addition (emphasis mine):

When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integral expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i + n-th and i − n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; **otherwise, the behavior is undefined**.

Similar language appears in every version of the C and C++ standards. Pointer arithmetic producing a result outside of the bounds of the array (plus one) is undefined behavior, even if you never dereference the pointer, and it always has been.

...

Now, here is my own response to your question: You are asking the wrong question. Even if this never creates a problem on any current machine and compiler, you should be worried about future machines and compilers, because 90% of the cost of all software is maintenance. By coding rigidly to the spec, you guarantee that your code will work on any system that complies with the standard, whether today or 50 years from now. The risk of introducing subtle bugs for someone trying to maintain your code in the future is almost never outweighed by any benefit today.

Also, I am skeptical of the real-world performance difference. (I am not saying you are wrong; I am just saying I am skeptical.) While I admire your attempt to create "the world's fastest binary heap", I suspect you are underestimating the effects of the memory hierarchy. Sure, I believe that "multiply by two" is twice as fast as "multiply by two and add one". But I also believe that fetching a cache line from main memory is hundreds of times slower than either.
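
For context, the arithmetic in question is the implicit binary-heap index computation, where a 1-based layout turns "multiply by two and add one" into a plain "multiply by two". A sketch of the two variants (my illustration, assuming the usual array layout of a binary heap):

#include <cstddef>

std::size_t left_child_1based(std::size_t i) { return 2 * i; }      // a single shift
std::size_t left_child_0based(std::size_t i) { return 2 * i + 1; }  // a shift plus an add
std::size_t parent_1based(std::size_t i)     { return i / 2; }
std::size_t parent_0based(std::size_t i)     { return (i - 1) / 2; }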

If you benchmark your heap by performing a small set of operations on it over and over, without doing anything else, you are probably operating entirely out of the L1 cache. But that is completely unrealistic. In real life, a heap is just a small piece of a large algorithm; the odds that the whole thing will always be sitting there in the L1 cache are very low unless that algorithm is totally trivial. So the cost of memory access is likely to dominate in any real-world scenario.

Nemo
  • Thanks for nailing down that it really is undefined. I'm doing this for the application of polynomial division, so I'm not working off a micro benchmark. Like you I don't yet know how big a difference this technique makes to my application so I'm implementing it to find out. My main goal is to figure out how well suited heaps are for polynomial division. I can't give an informed answer to that question without a good heap implementation, and that requires me to understand what a good heap implementation is. I would rather just learn that from an article on the topic, but I haven't found one. – Bjarke H. Roune Jul 04 '11 at 04:52
1

I think this could possibly be unsafe on some old 16-bit x86 systems. The address is "split" between an address register and a segment register, and I would guess that this could result in an invalid value being loaded into a segment register, which would cause an exception.

Probably not an issue as it's not a common architecture these days.

jcoder
0

People have already mentioned (a) strict language-lawyer interpretations of the standard, and (b) x86 segments, usually in 16-bit code (but also on some older 32-bit OSes).

Let me mention another example, close to my heart:

Andy Glew. 1990. An empirical investigation of OR indexing. SIGMETRICS Perform. Eval. Rev. 17, 2 (January 1990), 41-49. DOI=10.1145/378893.378896 http://doi.acm.org/10.1145/378893.378896

That's me.

In this paper, I proposed an instruction set that used OR rather than ADD as its memory addressing mode.

Now, since C programmers can always do

int* a = new int[5] + 1;

compilers must handle this sort of thing correctly.

But they may be less efficient when doing so.
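
To see why: base|offset only equals base+offset when the offset bits fall entirely into zero bits of the base, i.e. when the base is suitably aligned. A small self-contained sketch of that condition (my illustration, not taken from the paper):

#include <cassert>
#include <cstdint>

int main() {
    std::uintptr_t base   = 0x1000;   // 4 KiB-aligned: the low 12 bits are zero
    std::uintptr_t offset = 0x14;     // fits entirely into those zero bits
    assert((base | offset) == (base + offset));      // OR addressing agrees with ADD

    std::uintptr_t nudged = base - 4; // a base that has been adjusted off its alignment
    assert((nudged | offset) != (nudged + offset));  // OR no longer forms the right address
    return 0;
}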

It turns out that I am not the only one who thought of this. Some real shipping computers have used this technique, although I do not have references at hand.

Anyway, that's just an example.


Overall, though

a) what you suggest will work on most machines (of course, most machines are x86s running Windows or Linux, or ARMs running UNIX derivatives like iOS and Android).

b) but it is arguably illegal. It may break some compiler optimizations, be broken by some compiler optimizations, etc.

By the way, on x86 1-based arrays cost little more to code and almost nothing in machine code. If you say something like

typedef unsigned uint;  // the snippet assumes such a typedef exists

template<typename T, uint size>
class One_Based_Array {
private:
    T array[size];
public:
    T& operator[](uint i) { return array[i - 1]; }
};

used like

One_Based_Array<int,100> a;
int tmp = a[i];

the machine code will look something like

MOV EAX, a.array-4(ebx)

i.e. 1-based stuff can usually be folded into the x86's basereg+indexreg+offset addressing mode. On some machines this usually costs nothing, although the code may be a bit larger.

In fact, Intel's compilers often emit code that looks like

ebx = -i
MOV EAX, address_of_end_of_array[ebx]

i.e. using subtractive indexing rather than additive, or adding in a negative number, because testing against 0 is more efficient than testing against +N.
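
In C++ terms the transformation amounts to something like the following (a sketch of the idiom, not actual compiler output):

#include <cstddef>

long sum(const int* a, std::size_t n) {
    const int* end = a + n;    // one past the last element, which is well-defined
    long s = 0;
    for (std::ptrdiff_t j = -static_cast<std::ptrdiff_t>(n); j != 0; ++j)
        s += end[j];           // negative index; the loop test compares against zero
    return s;
}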

Krazy Glew
0

You asked several questions here. One was "is this something I need to avoid". My answer to that is yes. You are creating a pointer that points to memory you don't own. You should avoid that. Using a[0], a perfectly legal-looking statement, results in undefined behavior. What does your corresponding delete[] look like?

The other question depends on what you consider bizarre. Does hardware with dedicated registers for pointers that are checked for validity qualify as bizarre? The real question seems to be: do you want portable code? If so, then avoid this. If not, then you still might want to avoid this. Using it depends on your attitude towards code maintenance, because this strikes me as something that can easily cause grief for the person who inherits your code.

Tod
  • I think the real question is what the poster asked: what existing architectures do the sort of pointer-validation that would cause errors even if the invalid pointers are never dereferenced? Are there any? (whether having such pointers is a good idea or not is a different question) – Jeremy Friesner Jul 03 '11 at 05:16
  • Both the array and its indexes are encapsulated in a class for implicit binary trees. Clients of that class only see opaque index tokens that have a private index field, so it's not possible to ask for a[0] from outside the class. If client code somehow managed to construct an index token with index 0 inside it, that would result in an assertion error on use of that token. Someone who would modify the class itself without understanding that the array is 1-based is going to make many other errors than just trying to access a[0]. This is all off-topic from the question being asked though. – Bjarke H. Roune Jul 03 '11 at 07:18
  • @Jeremy, I understand you feel the "real" question was the title of the post, but other questions **were** in the body. I should have worded my response differently; "need to avoid" sounds too much like an absolute. I just wanted to point out that portability was being sacrificed and that maintainability difficulty will increase (for a future maintainer that isn't Bjarke). Bjarke appears to have considered these issues and finds the trade-offs worthwhile, good enough for me. Sorry everyone feels I wasted bandwidth, but SO questions can live long and I felt the point was worth making. – Tod Jul 05 '11 at 05:06
  • @Tod I don't actually think about it as a trade-off in that way. My desire to encapsulate the increased complexity of using the array led me to split my heap code into separate heap and "complete binary tree" classes. I consider the net effect to be an increase in code quality - I was even able to reuse the tree class elsewhere in my project. It doesn't always go that way, but I often find that the extra attention I pay to code that I'm optimizing causes the net code quality to improve. So my trade-off is often time versus performance+quality rather than quality versus performance. – Bjarke H. Roune Jul 07 '11 at 01:59