1

I've been getting warning 740 from Lint (see http://www.gimpel.com/html/pub/msg.txt), which tells me not to cast a pointer to a union to a pointer to an unsigned long. I knew I was casting between incompatible types, so I used reinterpret_cast, and I was surprised to still get the warning.

Example:

// bar.h
void writeDWordsToHwRegister(unsigned long* ptr, unsigned long size)
{
  // write double word by double word to HW registers 
  ...
}

// foo.cpp
#include "bar.h"

struct fooB
{
  ...
};

union A 
{
  unsigned long dword1;
  fooB b; // Each translation unit has unique content in the union (member name b is illustrative)
  ...
};

void foo()
{
  A a;
  a = ...; // Set value of a

  // Lint warning
  writeDWordsToHwRegister(reinterpret_cast<unsigned long*> (&a), sizeof(A));

  // My current workaround, but a bad one: someone (like me) in a future refactoring
  // might redefine union A to start with a dword0 member and forget
  // to change the statement below.
  writeDWordsToHwRegister(reinterpret_cast<unsigned long*> (&(a.dword1)), sizeof(A)); 
}

Leaving aside exactly why I was doing it and how best to solve it (void* in the interface and a cast to unsigned long* inside writeDWordsToHwRegister?), the Lint warning text explains that on some machines a pointer to char differs from a pointer to word. Could someone explain how that difference could manifest itself, and maybe give examples of processors that show it? Are we talking about alignment issues?

Since it's an embedded system, we do use exotic and in-house cores, so if bad things can happen, they probably will.

cafce25
  • 15,907
  • 4
  • 25
  • 31
tombjo
  • 191
  • 8

5 Answers

3

Generally, the difference between pointers refers to the fact that different types have different sizes: if you do p += 1, you get a different result depending on whether p is a pointer to char or a pointer to word.
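
To make the pointer arithmetic point concrete, here is a minimal sketch. The helper name strideInBytes is made up for illustration; it only measures how many bytes p + 1 moves for a given pointee type.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: byte distance between p and p + 1 for a pointer to T.
// p must point at a real object so that p + 1 is a valid one-past pointer.
template <typename T>
std::size_t strideInBytes(T* p)
{
    assert(p != nullptr);
    const char* first = reinterpret_cast<const char*>(p);
    const char* next  = reinterpret_cast<const char*>(p + 1);
    return static_cast<std::size_t>(next - first);
}
```

For a char* the stride is 1, and for an unsigned long* it is sizeof(unsigned long), whatever that is on the target.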

Chefire
  • 139
  • 1
  • 7
  • What you are talking about is a difference in pointer arithmetic at compile time. Nowhere am I doing pointer arithmetic, so that should not be applicable, should it? – tombjo Jul 20 '12 at 10:46
  • As far as I know, pointers contain an address into sequential memory. Since the OS keeps your program thinking that it has access to a contiguous block of bytes of memory, I cannot think why an address of a char would be any different from an address of a dword or anything else. Anyway, this is partly how I think computers should be designed, and I might be wrong. – Chefire Jul 20 '12 at 18:41
3

The compiler assumes that pointers to A and pointers to long (which are usually dwords, but might just be words in your case) do not point to the same area of memory. This makes a number of optimizations legal: for example, after a write through an A*, values previously loaded through a long* do not need to be reloaded. This is called aliasing, or in this case, the lack thereof. In your case, it means that the generated code might not work as expected.

To make this portable, you first have to copy your data through a char buffer, because char is the exception to the strict aliasing rule: char aliases with everything. So when it sees a char pointer, the compiler has to assume it can point to anything. For example, you could do this:

char buffer[sizeof(A)];
// char aliases with A
memcpy(buffer, reinterpret_cast<char*>(&a), sizeof(A));
// char also aliases with unsigned long
writeDWordsToHwRegister(reinterpret_cast<unsigned long*>(buffer), sizeof(A));

If you have any more questions, look up "strict aliasing" rules. It is actually a pretty well known issue by now.
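
A variation worth considering, as a sketch only: the char exception lets you inspect any object as bytes, but reading a char array through an unsigned long* is arguably the reverse direction and may also be misaligned. Copying straight into an array of unsigned long sidesteps both concerns. The union A below is a stand-in for the one in the question, and copyToDwords and kDwordCount are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in for the union in the question (its real members are not shown there).
union A
{
    unsigned long dword1;
    unsigned char bytes[sizeof(unsigned long)];
};

static_assert(sizeof(A) % sizeof(unsigned long) == 0,
              "A must be a whole number of dwords");

const std::size_t kDwordCount = sizeof(A) / sizeof(unsigned long);

// Copies the object representation of 'a' into a correctly typed and
// correctly aligned array, which can then be passed as unsigned long*.
void copyToDwords(const A& a, unsigned long (&out)[kDwordCount])
{
    std::memcpy(out, &a, sizeof(A));
}
```

The resulting array can be handed to writeDWordsToHwRegister without any cast. Whether the extra copy is acceptable depends on the target; many compilers can turn a small fixed-size memcpy into plain loads and stores.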

ltjax
  • 15,837
  • 3
  • 39
  • 62
  • Wow, I had never thought about how compilers treat aliasing. Thanks for the heads up. Knowing this was related to strict aliasing made me find the following great answer: http://stackoverflow.com/a/99010/124540 – tombjo Jul 20 '12 at 11:02
  • Yes, of course I am talking about double words and not words. Will correct question to clarify. – tombjo Jul 20 '12 at 11:06
  • Interestingly enough, this does not seem to be what Lint is complaining about, though. That is more in line with Tom Tanner's answer. – tombjo Jul 20 '12 at 11:11
  • Very interesting - I was pretty sure using char as an intermediate is portable. Why do you think that's not what Lint is complaining about? Warning 740 seems to indicate that aliasing is the issue... Do you get a new warning with those changes? – ltjax Jul 20 '12 at 11:55
  • No, no, sorry for the misunderstanding. I'm sure your example will work and be accepted. What I referred to was the Lint warning for my original code. Many other warnings talk about aliasing. This one talks about pointers not being compatible, to quote: "The main purpose of this message is to report possible problems for machines in which pointer to char is rendered differently from pointer to word." – tombjo Jul 20 '12 at 12:25
  • Also, your solution is probably not acceptable in my application, my embedded system is too performance critical for that (even though I understand some compilers might optimize away the memcpy). Right now I'm thinking about the solution given in the linked answer of wrapping A in another union like: 'union AWrapper{A, unsigned long aAlias[sizeof(A)/sizeof(unsigned long)]}' which I understand, though technically undefined, most often will work fine. – tombjo Jul 20 '12 at 12:34
2

I know that on some machines, pointers to char and pointers to word are actually different, as pointer to char needs extra bits due to the way memory is addressed.

There are some machines (mainly DSPs, but I think old DEC machines did this too) where this is the case.

This means that if you reinterpret_cast something to char* on one of these machines, the resulting bit pattern is not necessarily valid.

As a pointer to a union can in theory point to any member of it, a union pointer has to contain whatever is needed to let you successfully use it to point to a char or a word. That in turn means that reinterpret_casting it will end up with bits that mean something to the compiler being used as if they were part of a valid address.

For instance, suppose a pointer is 0xfffa, where the 'a' is some magic that the compiler uses to work out what to do when you say unionptr->charmember (perhaps nothing) and something different when you say unionptr->wordmember (perhaps convert it to 0x3ff before using it). When you reinterpret_cast it to long*, you still have 0xfffa, because reinterpret_cast does nothing to the bit pattern.

Now you have something the compiler thinks is a pointer to long, containing 0xfffa, whereas it should be (say) 0x3ff.

Which is likely to result in a nasty crash.
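
The mismatch can be modelled in ordinary code. This is purely a toy model, not the behaviour of any real compiler or core: it assumes a machine with 4 bytes per word where a byte pointer holds a byte address and a word pointer holds a word index, and it assumes (as above) that a reinterpret_cast leaves the bits unchanged. All names are invented for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption of the toy model: 4 bytes per machine word.
const std::uint32_t kBytesPerWord = 4;

// What a value-converting cast would have to do on such a machine:
// turn a byte address into a word index.
std::uint32_t convertByteToWordPointer(std::uint32_t bytePointerBits)
{
    assert(bytePointerBits % kBytesPerWord == 0);
    return bytePointerBits / kBytesPerWord;
}

// What a bit-preserving cast does in this model: nothing at all.
std::uint32_t reinterpretBits(std::uint32_t bits)
{
    return bits;
}

// Byte address the hardware ends up touching when the given bits are
// used as a word pointer.
std::uint32_t byteAddressOfWordPointer(std::uint32_t wordPointerBits)
{
    return wordPointerBits * kBytesPerWord;
}
```

In this model a byte pointer 0x1000 converted properly becomes word index 0x400 and still refers to byte address 0x1000, while the same bits reinterpreted as a word pointer refer to byte address 0x4000.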

Tom Tanner
  • 9,244
  • 3
  • 33
  • 61
1

A char* can be byte-aligned (anything!), whereas a long* generally needs to be aligned to the size of long (typically a 4-byte or 8-byte boundary) on many processors.

On bigger iron, you'll get a crash when you try to access a long at a misaligned address (say, SIGBUS on *nix). However, on some embedded systems you can just quietly get odd results, which makes detection difficult.

I've seen this happen on ARM7, and yes, it was hard to see what was going on.
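
A rough way to catch this before it bites, sketched under the assumption that converting a pointer to std::uintptr_t yields its numeric address (true on mainstream toolchains, but implementation-defined). Both function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical check: is p suitably aligned to be read as an unsigned long?
bool isAlignedForUnsignedLong(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(unsigned long) == 0;
}

// Reads an unsigned long from any address, aligned or not, by copying bytes
// instead of dereferencing an unsigned long*.
unsigned long loadUnsignedLong(const void* p)
{
    assert(p != nullptr);
    unsigned long value;
    std::memcpy(&value, p, sizeof value);
    return value;
}
```

An assert on isAlignedForUnsignedLong at the start of a register-writing routine turns a quiet wrong result into a loud failure during testing.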

Craig Mc
  • 179
  • 1
  • 2
0

I'm not sure why you think a pointer to char is involved - you're casting a pointer to union A to a pointer to long. The best fix would probably be to change:

void writeDWordsToHwRegister(unsigned long* ptr, unsigned long size)

to:

void writeDWordsToHwRegister(const void * ptr, unsigned long size)
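
One possible shape for the body, as a sketch only: the original function body is not shown in the question, so g_fakeRegisters and kRegisterCount below are invented stand-ins for the real memory-mapped registers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Invented stand-in for the hardware registers; the real target would use
// its own memory-mapped addresses here.
const unsigned long kRegisterCount = 4;
volatile unsigned long g_fakeRegisters[kRegisterCount];

// Callers pass any object without a cast; the function assembles each
// dword from bytes, so it makes no aliasing or alignment assumptions
// about what ptr points to.
void writeDWordsToHwRegister(const void* ptr, unsigned long size)
{
    assert(ptr != nullptr);
    const unsigned char* bytes = static_cast<const unsigned char*>(ptr);
    unsigned long dwordCount = size / sizeof(unsigned long);
    if (dwordCount > kRegisterCount)
    {
        dwordCount = kRegisterCount; // do not run past the stand-in registers
    }
    for (unsigned long i = 0; i < dwordCount; ++i)
    {
        unsigned long dword;
        std::memcpy(&dword, bytes + i * sizeof(unsigned long), sizeof dword);
        g_fakeRegisters[i] = dword;
    }
}
```

With this signature the call in the question becomes writeDWordsToHwRegister(&a, sizeof(A)), with no reinterpret_cast at the call site.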
Paul R
  • 208,748
  • 37
  • 389
  • 560