
This may be a very basic question that has already been asked, but I was not sure whether the answer to "Casting an int pointer to a char ptr and vice versa" applies in my case. Essentially, I have something like the following:

void* head = sbrk(1024);  //allocate 1024 bytes in heap   

*((int*)(head+size)) = value;   //value and size are ints with values between 1 and 1023

If the above does not work for an arbitrary value of size, what are the restrictions on size? Does it have to be divisible by 4?

as3rdaccount
  • I'm shocked this even compiles. Pointer arithmetic requires a *typed* pointer to determine the element size when adjusting the address. GNU has an extension to allow it, so I'm sure others do too, but I'm still taken aback by it. Even with the extensions allowing it to compile, you may well hit alignment faults on non-x86 platforms if size is *not* a multiple of `sizeof(int)`. I can't speak for the GNU extension directly, but chances are it performs `void*` arithmetic by treating the pointer as a `char*`. – WhozCraig Mar 25 '13 at 07:17
  • Messing with sbrk() to allocate memory has been deprecated for a long time in favor of malloc() and friends. Why are you using sbrk? – Jens Mar 25 '13 at 14:26

3 Answers


First of all, you can't do pointer arithmetic on void pointers. That code should not even compile.

For the sake of discussion, let us assume that you have a char pointer instead. Then formally, such a cast followed by an access is undefined behavior. In the real world, however, your code will work if you manually ensure alignment: the address where you write must be a suitably aligned memory position for an int, or there are no guarantees that the code will work.
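As a rough illustration of "manually ensuring alignment", one option is to round the byte offset up to the next multiple of the alignment of int before storing. This is only a sketch: the helper names `align_up` and `store_int_aligned` are hypothetical, `_Alignof` assumes a C11 compiler, and the base address is assumed to be int-aligned already and to point at memory large enough for the store.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of alignment.
 * Assumes alignment is a nonzero power of two. */
static size_t align_up(size_t n, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (n + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: store value at the first int-aligned offset
 * that is >= size, and return the offset actually used.
 * Assumes base itself is suitably aligned for int. */
static size_t store_int_aligned(void *base, size_t size, int value)
{
    size_t offset = align_up(size, _Alignof(int));
    *(int *)((char *)base + offset) = value;
    return offset;
}
```

With a 4-byte-aligned int, a requested offset of 5 would be moved to 8, while an offset of 8 would be left unchanged.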


EDIT: relevant quotes from the ISO 9899:2011 standard on why pointer arithmetic on a void pointer is undefined behavior:

6.3.2.2 void

The (nonexistent) value of a void expression (an expression that has type void) shall not be used in any way, and implicit or explicit conversions (except to void) shall not be applied to such an expression.

.

6.5.6 Additive operators

/--/

For addition, either both operands shall have arithmetic type, or one operand shall be a pointer to a complete object type and the other shall have integer type. (Incrementing is equivalent to adding 1.)

.

4 Conformance

If a "shall" or "shall not" requirement that appears outside of a constraint or runtime-constraint is violated, the behavior is undefined. Undefined behavior is otherwise indicated in this International Standard by the words "undefined behavior" or by the omission of any explicit definition of behavior. There is no difference in emphasis among these three; they all describe "behavior that is undefined".

Whether code violating normative text in the standard "should compile" or not can certainly be debated, but I don't think that discussion is of benefit to the OP. Simply don't write code relying on undefined behavior, ever.

Lundin
  • Is “That code should not even compile” based on a rule in the C standard, or is it your expression of what “good” behavior for a compiler is? – Eric Postpischil Mar 25 '13 at 11:36
  • @Eric: C11 draft: "For addition, either both operands shall have arithmetic type, or one operand shall be a pointer to a *complete object type* and the other shall have integer type." (6.5.6/2) defines the types of pointers that can be added to, and "The void type comprises an empty set of values; it is an *incomplete object type that cannot be completed*." (6.2.5/19) defines, and in doing so disqualifies, `void`. – cHao Mar 25 '13 at 12:46
  • @CHao: Those statements define what the programmer must do to create a conforming program and what the compiler must accept. They do not specify what the compiler must do when the requirements are not met. That is, there is no requirement that the compiler must refuse to compile a program containing arithmetic with a pointer to void. – Eric Postpischil Mar 25 '13 at 13:08
  • @EricPostpischil The chapter 6.5.6/2 quoted by @cHao is relevant, it is normative text where we can read: `For addition, either both operands shall have arithmetic type, or one operand shall be a pointer to a complete object type and the other shall have integer type. (Incrementing is equivalent to adding 1.)`. A violation of "shall" in normative text is undefined behavior per clause 4 of the standard. And as usual, it is not really constructive to debate what happens when we get UB, especially not when it is completely out of scope from the C language. Just keep the program clear of UB. – Lundin Mar 25 '13 at 13:49
  • The reason I brought it up is that you appeared to be specifying what happens when the behavior is undefined. You were stating that the program should not compile. – Eric Postpischil Mar 25 '13 at 14:17
  • @EricPostpischil Yes, since that's usually how compilers implement language violations. Try compiling `int main(){ DONALD DUCK }` on any C compiler. Daring as I am, I assume that all of the existing compilers on the market will give you a compiler error. Though of course GCC probably has an extension that translates DONALD DUCK into `while(fork());`. That's the risk you have to take when keeping GCC extensions enabled. – Lundin Mar 25 '13 at 14:26
  • (For the record, `gcc -std=c99 -pedantic-errors` will not allow void pointer arithmetic to slip through.) – Lundin Mar 25 '13 at 14:28
  • 'Cause I'm bored: `#define DONALD while` and `#define DUCK (fork());` – cHao Mar 25 '13 at 18:35

Use memcpy():

memcpy((char*)head + size, &value, sizeof(value));
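
To make the one-liner above concrete, here is a minimal sketch of a store and a matching load that both go through `memcpy`, so the offset does not need to be aligned for int. The helper names `store_int_at` and `load_int_at` and the `capacity` parameter are hypothetical additions for illustration; the bounds checks use `assert` only to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the bytes of value to base + offset.
 * Works for any offset, aligned or not, as long as it is in bounds. */
static void store_int_at(void *base, size_t capacity, size_t offset, int value)
{
    assert(offset <= capacity && capacity - offset >= sizeof value);
    memcpy((char *)base + offset, &value, sizeof value);
}

/* Read an int back from base + offset the same way. */
static int load_int_at(const void *base, size_t capacity, size_t offset)
{
    int value;
    assert(offset <= capacity && capacity - offset >= sizeof value);
    memcpy(&value, (const char *)base + offset, sizeof value);
    return value;
}
```

Compilers commonly turn a small fixed-size `memcpy` like this into a plain load or store where the target allows it, so the cost is usually low, though that depends on the compiler and platform.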
Alexey Frunze
  • It is the very same thing as the code already in place, save for the necessary cast to `char*`. There is no obvious benefit of using memcpy instead. – Lundin Mar 25 '13 at 07:44
  • @Lundin How about alignment? – Alexey Frunze Mar 25 '13 at 07:46
  • What about it? memcpy does what it is told, it has no built-in intelligence. If you give it a misaligned address, it will blindly accept it and attempt to write to that address. – Lundin Mar 25 '13 at 09:32
  • @Lundin But `memcpy()` will succeed copying from/to misaligned location, whereas `*(int*)misaligned_pointer` will fail on some architectures (e.g. MIPS, ARM). – Alexey Frunze Mar 25 '13 at 09:42
  • Other answers were good and informative, but memcpy solved the alignment problem that I think I was having on an i386 machine. – as3rdaccount Mar 29 '13 at 04:40
  • That's odd. x86 CPUs don't care much about alignment (except when performance or multiprocessor synchronization is at stake). It could be an issue with aliasing, though, which is more of a C than a CPU thing. – Alexey Frunze Mar 29 '13 at 06:09

On many systems, in this circumstance, it is required that size be a multiple of four (subject to additional conditions detailed below, including that the size of int be four bytes on your system). On systems that do not require this, it is usually preferred.

First, the type of head is void *, and the C standard does not define what happens when you do pointer arithmetic with void *.

Some compilers, notably GCC and its heirs, will treat this arithmetic as if the type were char *. I will proceed on this basis.
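
For code that should not depend on that extension, the same byte-wise arithmetic can be written portably by casting to `char *` first. A minimal sketch, where `byte_offset` is a hypothetical helper name and not something from the question:

```c
#include <assert.h>
#include <stddef.h>

/* Advance a void pointer by nbytes using char * arithmetic,
 * which standard C defines (unlike arithmetic on void *). */
static void *byte_offset(void *base, size_t nbytes)
{
    assert(base != NULL);
    return (char *)base + nbytes;
}
```

With this helper, the expression in the question would read `*(int *)byte_offset(head, size) = value;`, which compiles without the extension but is still subject to the alignment concerns discussed in this answer.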

Second, I am not aware of a guarantee that sbrk returns an address with any particular alignment.

Let us suppose that sbrk does return a well-aligned address, and that your C implementation does the plain thing to evaluate * (int *) (head + size) = value, which is to issue a store instruction to write the value of value (converted to an int) to the address head + size.

Then your question becomes: What does my computing platform do with an int store to this address?

As long as head + size is an address suitably aligned for int on your platform, the store will execute as expected. On most platforms, four-byte integers prefer four-byte alignment, and eight-byte integers prefer eight-byte alignment. As long as head is aligned to a multiple of this preference and size is a multiple of this preference, then the store will execute normally.
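
One way to test this condition before storing is to convert the pointer to an integer and check it against the alignment of int. The sketch below makes several assumptions: a C11 compiler (for `_Alignof`), an implementation that provides `uintptr_t`, and a platform where the converted integer reflects the address in the usual flat way. The names `is_aligned_for_int` and `store_int_checked` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if p is a multiple of the alignment the compiler reports for int.
 * Relies on the implementation-defined pointer-to-integer conversion. */
static int is_aligned_for_int(const void *p)
{
    return (uintptr_t)p % _Alignof(int) == 0;
}

/* Store through an int pointer only after checking alignment. */
static void store_int_checked(void *p, int value)
{
    assert(is_aligned_for_int(p));
    *(int *)p = value;
}
```

If the check fails for a given head + size, the options are to adjust size to a suitable multiple or to copy the bytes with `memcpy` instead of storing through an `int *`.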

Otherwise, what happens depends on your platform. On some platforms, the hardware executes the store but may do it more slowly than normal store instructions, because it breaks it into two separate writes to memory. (This also means that other processes sharing the same memory might be able to read memory while one part of the value has been stored but the other part has not. Again, this depends on the characteristics of your computing platform.)

On some platforms, the hardware signals an exception that interrupts program execution and transfers control to the operating system. Some operating systems fix up misaligned stores by analyzing the failing instruction and executing alternate instructions that perform the intended store (or the operating system relays the exception to special code in your program, possibly in automatically included libraries, that does this fix-up work). On these platforms, misaligned stores will be very slow; they can hugely degrade the performance of a program.

On some platforms, the hardware signals an exception, and the operating system does not fix up the misaligned store. Instead, the operating system either terminates your process or sends it a signal about the problem, which often results in your process terminating. (Other possibilities include triggering a debugger or entering special code you have included in your program to handle signals.)

Eric Postpischil
  • _"Some compilers, notably GCC and its heirs, will treat this arithmetic as if the type were char *"_ And other compilers may print a pink elephant on the screen. One should never rely on this, since it is undefined behavior. The OP seems interested in a general C programming case, rather than how a particular undefined behavior manifests itself on a certain compiler. – Lundin Mar 25 '13 at 14:10
  • @Lundin: First, my answer does not assert that a person should rely on this. Second, this is not undefined behavior. It is not defined by the C standard. But it **is defined** by GCC and other compilers. It is an [explicit part of the documentation](http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Pointer-Arith.html). Where do people get this notion that if the C standard does not define something, it is undefined? Nonsense. The question is tagged Unix and uses `sbrk`, so it is clearly using more than just the C standard. I stated the premise clearly. – Eric Postpischil Mar 25 '13 at 14:23
  • Undefined behavior == not defined by the standard. It's the same thing. And this is undefined behavior since it violates the normative text at 6.5.6. All violations of normative text are undefined behavior. See quotes from the standard added to my answer. – Lundin Mar 25 '13 at 14:31
  • @Lundin: If you want to use “undefined behavior” to mean not defined by the C standard, then your comment is irrelevant to this answer. Neither the question nor my answer are based solely on the C standard. I do not understand why people try to drag every question involving C back to the C standard **only**. We use many, many documents in software engineering: Hardware manuals, compiler documentation, operating system documentation. When somebody asks about C and Unix combined and shows indications of a language extension, it makes no sense to restrict the answer to C only. – Eric Postpischil Mar 25 '13 at 14:43
  • @EricPostpischil: It quite explicitly *does* make sense to restrict the *language* of the answer to C only, as the question is about C rather than GCC's superset thereof. GCC is not the only compiler in use on *nix systems (esp strictly Unix ones), and the "unix" tag does not automagically mean that all of GCC's nonstandard crap can be thrown in. The libraries can include stuff that's not in the standard (particularly POSIX stuff), but the language itself should be standards-conforming. – cHao Mar 25 '13 at 16:33
  • @cHao: Your statement about GCC being not the only compiler in use on Unix is a non sequitur as nothing in my answer or comments makes such an assumption. As my previous comment notes, the question contained indications of a language extension, so it was reasonable to figure that language extension was in use. Furthermore, I explicitly stated that requirement in the answer. – Eric Postpischil Mar 25 '13 at 18:01
  • @EricPostpischil: The question was written by someone who's confused. The assumption that the code is using GCCisms at all, let alone intentionally, is flawed. – cHao Mar 25 '13 at 18:17
  • @cHao: No, the question is not about syntax or about GCC extensions, it is about alignment. – Eric Postpischil Mar 25 '13 at 18:19