Ben Anderson wrote:
What is it about casting of pointers that is bad? Is it dereferencing
pointers to unaligned data elements?
Correct.
Generally speaking, if you have to cast a pointer then you must have lied to the
compiler at some point about what the object in question actually is. That's
almost always a bad idea, particularly so when writing portable code. Better to
clearly convey to the compiler what's going on, and let it and the rules of C
work to your advantage.
I don't mean to say that casting of any type in general is good. I am
just trying to get a firm grasp on what the issues are.
If you cast a char* to an int*, then you risk problems on machines where the
alignment restrictions are different for the two data types. x86 doesn't care,
so if your char wasn't word-aligned, nothing bad happens when you dereference
the cast pointer as an int. On ARM, however, you get an exception (*).
* - except on some of the newest ARM cores, apparently. I avoid the problem so
that I don't have to care which core I'm running on!
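For what it's worth, here is a minimal sketch of one way to avoid the problem
when you need an int out of a byte buffer: copy the bytes with memcpy() instead
of casting the pointer. The helper name read_int_unaligned is made up for
illustration, and the result still depends on the machine's byte order, so this
only addresses alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: read an int stored at any byte offset in a buffer.
 * memcpy() copies byte by byte, so p does not need to be int-aligned.
 * The bytes are interpreted in the machine's native byte order. */
static int read_int_unaligned(const unsigned char *p)
{
    int value;

    assert(p != NULL);
    memcpy(&value, p, sizeof value);
    return value;
}
```

Calling read_int_unaligned(buf + 1) is fine on any core, where *(int*)(buf + 1)
may trap.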
These kinds of problems can be nasty to test for, because they seem to travel
along with dynamically-allocated data structures, buffers, etc. that are very
sensitive to system state. So you might make several passes over the same code
successfully before the !kaboom! happens. Better to prove the code right by
inspection beforehand, which is only possible if you follow C's rules carefully.
I looked through some of Dan Saks's articles, and in one he does caution
against casting of pointers, but gives no real details.
I'm still having a hard time finding this info via Google. So if anyone
knows of some site that goes through the alignment/pointer issue, possibly
with some examples, let me know. It would be much appreciated.
Just don't cast pointers, and you should be fine. Here's a bad one I see from
time to time:
int i;
char *ibuf = (char*)&i;
char a, b, c, d;
/* break apart i into its four bytes */
a = ibuf[0];
b = ibuf[1];
c = ibuf[2];
d = ibuf[3];
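The values of a, b, c, and d depend on the byte order of the machine, so they
will differ between little-endian x86 and a big-endian ARM target (and the code
quietly assumes that an int is four bytes). Ditto if you go the opposite way: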
char cbuf[sizeof(int)];
int i;
i = cbuf[0];
i = (i << 8) + cbuf[1];
i = (i << 8) + cbuf[2];
i = (i << 8) + cbuf[3];
This code is actually portable, regardless of the endianness of the machine, if
you load up cbuf the right way each time (which can be tricky). BUT, you will
still get different values for i if chars are signed on one machine but
unsigned on another. BUT BUT, you won't see the problem until the
most-significant bit of a byte in cbuf is set.
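A rough sketch of how the signedness trap can be sidestepped, assuming the
buffer holds a 32-bit value with the most-significant byte first: use unsigned
char for the bytes and a fixed-width unsigned type for the result. The name
load_be32 is hypothetical. The casts here convert values, not pointers, so they
don't lie to the compiler about what anything is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: assemble a 32-bit value from four bytes,
 * most-significant byte first. Each byte is widened to uint32_t before
 * shifting, so a byte with its top bit set is never sign-extended. */
static uint32_t load_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}
```

With plain char, a byte like 0x80 can turn into a negative number on a
signed-char machine before it ever reaches the shift, which is exactly the
failure described above.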
The two examples above (splitting i into bytes, and rebuilding it from cbuf)
aren't casts per se, but they are definitely representation transformations of
the same kind that casts cause. So I lump them together.
Just watch out for stuff like that. You tend to know when you're in risky
territory, because the code starts to look very much like the above.
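For the break-apart case, one possible sketch that gives the same bytes on any
machine is to pull them out with shifts rather than by peeking at memory. The
name store_be32 is made up, and most-significant-byte-first is just the order
assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split a 32-bit value into four bytes,
 * most-significant byte first. Shifts operate on the value, not on its
 * in-memory layout, so the result does not depend on endianness. */
static void store_be32(unsigned char *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (unsigned char)((v >> 24) & 0xFFu);
    p[1] = (unsigned char)((v >> 16) & 0xFFu);
    p[2] = (unsigned char)((v >> 8) & 0xFFu);
    p[3] = (unsigned char)(v & 0xFFu);
}
```

No pointer cast, no dependence on byte order, and no surprise from signed
chars.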
b.g.