
I am going through the following article to understand memory alignment: https://www.ibm.com/developerworks/library/pa-dalign/#N10159. However, I am not able to understand the code snippet given below.

void Munge8( void *data, uint32_t size ) {
    uint8_t *data8 = (uint8_t*) data;
    uint8_t *data8End = data8 + size;

    while( data8 != data8End ) {
        *data8++ = -*data8;
    }
}

Here the intent is to increment the pointer, which could have been done with "data8 = data8 + 1", but the code in question uses "*data8++ = -*data8". Both of them work fine, i.e. they increment the pointer, but I am unable to understand the logic behind the latter. Is it better than "data8 = data8 + 1"?

During compilation I get a warning: "alignment_test1.c:44: warning: operation on ‘data8’ may be undefined".

The second part of the question is regarding the code snippet below (from the same link mentioned earlier).

Listing 2. Munging data two bytes at a time

void Munge16( void *data, uint32_t size ) {
    uint16_t *data16 = (uint16_t*) data;
    uint16_t *data16End = data16 + (size >> 1); /* Divide size by 2. */
    uint8_t *data8 = (uint8_t*) data16End;
    uint8_t *data8End = data8 + (size & 0x00000001); /* Strip upper 31 bits. */

    while( data16 != data16End ) {
        *data16++ = -*data16;
    }
    while( data8 != data8End ) {
        *data8++ = -*data8;
    }
}

What could be the reason behind the second 'while' loop? It seems to me that data8 and data8End are always going to be the same in this case.

melpomene
NeilB
  • `*data8++ = -*data8;` is undefined behaviour. – mch Jan 15 '18 at 09:34
  • Possible duplicate of [Why are these constructs (using ++) undefined behavior?](https://stackoverflow.com/questions/949433/why-are-these-constructs-using-undefined-behavior) – mch Jan 15 '18 at 09:34
  • @mch Yes, during compilation I saw the warning. I am trying to understand how the statement increments the pointer. – NeilB Jan 15 '18 at 09:38
  • `*data8++` is parsed as `*(data8++)`, so it dereferences `data8` and increments the pointer. `(*data8)++` would increment the value that the pointer points to. – mch Jan 15 '18 at 09:46
  • @mch Thank you. Any clue about the second part? – NeilB Jan 15 '18 at 09:56
  • @NeilB Summary: the person who wrote the article is incompetent and should not be writing technical articles about C programming. – Lundin Jan 15 '18 at 10:32
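
The parsing rule mentioned in the comments above can be checked with a small sketch. The helper names `read_and_advance` and `bump_value` are hypothetical and exist only for illustration; they are not part of the article's code.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* *p++ : fetch the value p points at, then advance p itself. */
static uint8_t read_and_advance(const uint8_t **pp) {
    assert(pp != NULL && *pp != NULL);
    const uint8_t *p = *pp;
    uint8_t value = *p++;   /* parsed as *(p++): the pointer moves, the data does not change */
    *pp = p;                /* hand the advanced pointer back to the caller */
    return value;
}

/* (*p)++ : increment the pointed-at value; p itself does not move. */
static void bump_value(uint8_t *p) {
    assert(p != NULL);
    (*p)++;
}
```

Calling `read_and_advance` on a pointer to the first element of an array returns that element and leaves the pointer at the second element, while `bump_value` changes only the stored byte.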

1 Answer


The code from that article is a buggy mess. Do not read or study it. Stop reading the article immediately.

`*data8++ = -*data8;` invokes undefined behavior, since the side effect of incrementing `data8` on the left-hand side is unsequenced relative to the read of `data8` on the right-hand side. See the most common C FAQ of all time: Why are these constructs (using ++) undefined behavior?
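
One way to avoid the problem is to split the statement so that the pointer is modified only after the store has finished. This is a minimal sketch, not the article's code; the name `munge8_fixed` is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Negate each byte in place. The load/store and the pointer increment
   are separate statements, so they are fully sequenced. */
static void munge8_fixed(void *data, uint32_t size) {
    assert(data != NULL);
    uint8_t *data8 = data;
    uint8_t *data8End = data8 + size;

    while (data8 != data8End) {
        *data8 = (uint8_t)-*data8;  /* read and write through the same, unmodified pointer */
        data8++;                    /* advance only after the store is complete */
    }
}
```

For a buffer holding 1, 2, 255 this leaves 255, 254, 1, since the negated value is reduced modulo 256 when converted back to `uint8_t`.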

Depending on the original type of the data, the second part is very likely to contain strict aliasing violations, another form of undefined behavior. You cannot convert pointer-to-something to pointer-to-uint16_t and then access the content as if it were a uint16_t, unless that was the original type of the object (its effective type). This leads us to another common C FAQ, What is the strict aliasing rule?
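
A common way to sidestep the aliasing problem is to copy each two-byte unit into a real `uint16_t` object with `memcpy` and copy the result back. The sketch below is an assumption about how the listing could be rewritten, not the article's method, and `munge16_memcpy` is a hypothetical name. It also keeps the listing's handling of the leftover byte: `size & 1` is 1 when `size` is odd, which appears to be what the second loop in the article is for.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Negate a buffer two bytes at a time without accessing it through a
   uint16_t pointer. memcpy moves each pair into a genuine uint16_t
   object and back, which is valid for any effective type and alignment. */
static void munge16_memcpy(void *data, uint32_t size) {
    assert(data != NULL);
    unsigned char *bytes = data;
    uint32_t pairs = size >> 1;      /* number of complete 2-byte units */

    for (uint32_t i = 0; i < pairs; i++) {
        uint16_t value;
        memcpy(&value, bytes + 2 * i, sizeof value);   /* load the pair */
        value = (uint16_t)-value;                      /* negate modulo 65536 */
        memcpy(bytes + 2 * i, &value, sizeof value);   /* store it back */
    }

    if (size & 1u) {                 /* odd size: one trailing byte remains */
        bytes[size - 1] = (unsigned char)-bytes[size - 1];
    }
}
```

Compilers typically turn these small fixed-size `memcpy` calls into plain loads and stores, so the approach usually costs nothing compared with the pointer cast.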

You can direct the author of that article to those links and tell them to go study basic C programming, before attempting to publish technical articles on the internet.

Lundin
  • Since this article is 13 years old, it's more than likely that someone already did point that out to the author of that article. – CookiePLMonster Jan 15 '18 at 10:48
  • @CookiePLMonster So? It is still published on the internet. The behavior of sequence points and strict aliasing has not changed since 2005. – Lundin Jan 15 '18 at 10:50
  • Are you really sure that `*data8++=-*data8;` is UB? AFAIK the char lvalue `*data8` is accessed and negated before the assignment, and the post-increment is sequenced after the assignment, because the new value of `*data8` shall be available before the increment: *The value computation of the result is sequenced before the side effect of updating the stored value of the operand.* (from 6.5.2.4 Postfix increment and decrement operators) – Serge Ballesta Jan 15 '18 at 12:53
  • @SergeBallesta The code is so poorly written that it is hard to see (which alone is a very bad thing). The sub-expression `data8++` has the side effect of updating the pointer, which is unsequenced in relation to the value computation (of the pointer) `data8` on the right-hand side of the expression. So the undefined behavior is on the pointer variable itself, not on the pointed-at data. C11 6.5/2. But it is true that there aren't two side effects on the same variable here, which fooled me at first. But that's not the only criterion for UB. – Lundin Jan 15 '18 at 13:40
  • After reading it twice, it is indeed UB, because 6.5.16 Assignment operators says: *The evaluations of the operands are unsequenced.* So the pointer increment could occur before the evaluation of the right-hand side. Note 84 of 6.5 Expressions says that `a[i++] = i` **is** UB, probably for the same reason. – Serge Ballesta Jan 15 '18 at 14:49
  • @SergeBallesta Yes, though notably it is not UB in C++11 and beyond, since the assignment operator has well-defined order of evaluation there. – Lundin Jan 15 '18 at 14:54