I'm curious about conventions for type-punning pointers/arrays in C++. Here's the use case I have at the moment:
Compute a simple 32-bit checksum over a binary blob of data by treating it as an array of 32-bit integers (we know its total length is a multiple of 4), and then summing up all values and ignoring overflow.
I would expect such an function to look like this:
uint32_t compute_checksum(const char *data, size_t size)
{
const uint32_t *udata = /* ??? */;
uint32_t checksum = 0;
for (size_t i = 0; i != size / 4; ++i)
checksum += udata[i];
return udata;
}
Now the question I have is, what do you consider the "best" way to convert data
to udata
?
C-style cast?
udata = (const uint32_t *)data
C++ cast that assumes all pointers are convertible?
udata = reinterpret_cast<const uint32_t *>(data)
C++ cast that between arbitrary pointer types using intermediate void*
?
udata = static_cast<const uint32_t *>(static_cast<const void *>(data))
Cast through a union?
union {
const uint32_t *udata;
const char *cdata;
};
cdata = data;
// now use udata
I fully realize that this will not be a 100% portable solution, but I am only expecting to use it on a small set of platforms where I know it works (namely unaligned memory accesses and compiler assumptions on pointer aliasing). What would you recommend?
-
As far as the C++ standard is concerned, litb's answer is completely correct and the most portable. Casting
const char *data
to aconst uint3_t *
, whether it be via a C-style cast,static_cast
, orreinterpret_cast
, breaks the strict aliasing rules (see Understanding Strict Aliasing). If you compile with full optimization, there's a good chance the code will not to the right thing.Casting through a union (such as litb's
my_reint
) is probably the best solution, although it does technically violate the rule that if you write to a union through one member and read it through another, it results in undefined behavior. However, practically all compilers support this, and it results in the the expected result. If you absolutely desire to conform to the standard 100%, go with the bit-shifting method. Otherwise, I'd recommend going with casting through a union, which is likely to give you better performance.Tom : litb's solutions is correct by the standard - but as I said, I'm looking at specific platforms already.Johannes Schaub - litb : i'm not sure why they downvote this:) but i'm also not sure that my use of the union is undefined behavior.i'm aware that writing to a member and reading from another member is undefined behavior.but in my case, i'm pointing to a member of it,which is assumed to have a valid value,and read it then.Steve Jessop : I don't think this particular example does break strict aliasing. char* is a special case under strict aliasing rules - a char* may never be assumed not an alias of a pointer to some other type. But in my answer I still play safe: it's just not worth doing char* differently from other similar cases.Johannes Schaub - litb : onebyone, type punning is not the issue, it's already solved by the union. but the issue is the reading from the union member even though we didnt write to it before. the standard does not seem to forbid it. but that's the question we are unsure about :/Johannes Schaub - litb : @Adam, i have decided to delete my answer, because i found i don't agree with most of its interpretations anymore. But i still agree with your stuff in this answer.Steve Jessop : @litb: What I meant is (if I can remember back that far): contrary to what Adam says, casting `const char*` to `const uint32_t*` *does not* break strict aliasing rules. Aliasing a `uint32_t*` with a `char*` is allowed, and safe, and optimisation doesn't change that. -
This looks like a case-book example of when to use
reinterpret_cast
, anything else will give you the same effect without the explicitness you get from using a language construct for its official use. -
Ignoring efficiency, for simplicity of code I'd do:
#include <numeric> #include <vector> #include <cstring> uint32_t compute_checksum(const char *data, size_t size) { std::vector<uint32_t> intdata(size/sizeof(uint32_t)); std::memcpy(&intdata[0], data, size); return std::accumulate(intdata.begin(), intdata.end(), 0); }
I also like litb's last answer, the one that shifts each char in turn, except that since char might be signed, I think it needs an extra mask:
checksum += ((data[i] && 0xFF) << shift[i % 4]);
When type punning is a potential issue, I prefer not to type pun rather than to try to do so safely. If you don't create any aliased pointers of distinct types in the first place, then you don't have to worry what the compiler might do with aliases, and neither does the maintenance programmer who sees your multiple static_casts through a union.
If you don't want to allocate so much extra memory, then:
uint32_t compute_checksum(const char *data, size_t size) { uint32_t total = 0; for (size_t i = 0; i < size; i += sizeof(uint32_t)) { uint32_t thisone; std::memcpy(&thisone, &data[i], sizeof(uint32_t)); total += thisone; } return total; }
Enough optimisation will get rid of the memcpy and the extra uint32_t variable entirely on gcc, and just read an integer value unaligned, in whatever the most efficient way to do that is on your platform, straight out of the source array. I'd hope the same is true of other "serious" compilers. But this code is now bigger than litb's, so there's not much to be said for it other than mine is easier to turn into a function template that will work just as well with uint64_t, and mine works as native endian-ness rather than picking little-endian.
This is of course not completely portable. It assumes that the storage representation of sizeof(uint32_t) chars corresponds to the storage representation of a uin32_t in the way we want. This is implied by the question, since it states that one can be "treated as" the other. Endian-ness, whether a char is 8 bits, and whether uint32_t uses all bits in its storage representation can obviously intrude, but the question implies that they won't.
Tom : I just tried your last example. GCC refuses to vectorize it, complaining about "unhandled data-ref".Steve Jessop : Fair enough, it's not the fastest possible on hardware supporting vector ops. Maybe GCC will get better in future. My extreme programming guru says I don't need to lose sleep over that. My first suggestion isn't as fast as possible on *any* hardware :-)Steve Jessop : But I agree that since I mentioned the memcpy being low-cost, the fact that it prevents vectorization could be a show-stopper for some applications.
0 comments:
Post a Comment