What is the purpose of the __builtin_offsetof operator (or _FOFF operator in Symbian) in C++?
In addition what does it return? Pointer? Number of bytes?
-
It's a builtin provided by the GCC compiler to implement the offsetof macro that is specified by the C and C++ Standard. It returns the offset in bytes that a member of a POD struct/union is at.
Sample:
    struct abc1 { int a, b, c; };
    union abc2 { int a, b, c; };
    struct abc3 { abc3() { } int a, b, c; }; // non-POD
    union abc4 { abc4() { } int a, b, c; };  // non-POD

    assert(offsetof(abc1, a) == 0); // always, because there's no padding before a.
    assert(offsetof(abc1, b) == 4); // here, on my system
    assert(offsetof(abc2, a) == offsetof(abc2, b)); // (members overlap)
    assert(offsetof(abc3, c) == 8); // undefined behavior. GCC outputs warnings
    assert(offsetof(abc4, a) == 0); // undefined behavior. GCC outputs warnings
@Jonathan provides a nice example of where you can use it. I remember having seen it used to implement intrusive lists (lists whose data items themselves include the next and prev pointers), but sadly I can't remember where exactly it was helpful in implementing them.
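One plausible role for offsetof in an intrusive list, sketched here in C as an assumption rather than as the code the answer had in mind: the links point at the embedded node, so getting back to the enclosing object means stepping back by the member's offset. The Linux kernel's container_of macro is a well-known instance of the idea; the names list_node, item, CONTAINER_OF, sum_items and demo_sum below are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical link node embedded ("intruded") in each list element. */
struct list_node {
    struct list_node *next;
    struct list_node *prev;
};

/* Hypothetical payload type; the link is deliberately not the first member. */
struct item {
    int value;
    struct list_node link;
};

/* Step back from a pointer to the embedded member to the enclosing object. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Sum the values of a NULL-terminated chain by walking the embedded links. */
static int sum_items(struct list_node *head)
{
    int total = 0;
    for (struct list_node *n = head; n != NULL; n = n->next) {
        struct item *it = CONTAINER_OF(n, struct item, link);
        total += it->value;
    }
    return total;
}

/* Build a small three-element chain on the stack and sum it. */
static int demo_sum(void)
{
    struct item a = { 1, { NULL, NULL } };
    struct item b = { 2, { NULL, NULL } };
    struct item c = { 3, { NULL, NULL } };
    a.link.next = &b.link; b.link.prev = &a.link;
    b.link.next = &c.link; c.link.prev = &b.link;
    /* The macro recovers the object that owns the embedded link. */
    assert(CONTAINER_OF(&b.link, struct item, link) == &b);
    return sum_items(&a.link);
}
```

Because the list code only ever sees struct list_node, the same traversal routines can serve any struct that embeds one; only the CONTAINER_OF call needs to know the enclosing type.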
Steve Jessop : I'd guess where it was useful was that the intruded nodes contain pointers to the node in the "next" object. When using the list, you need to get from the node to the base of the object, so you subtract offsetof(something) bytes from the pointer value and reinterpret_cast.
Steve Jessop : All very non-portable in C++, of course, but does the job in C.
-
As @litb said: the offset in bytes of a struct/class member. In C++ there are cases where it is undefined, in which case the compiler will complain. IIRC, one way to implement it (in C, at least) is to do
#define offsetof(type, member) (int)(&((type *)0)->member)
I'm sure there are problems with this, but I'll leave those to the interested reader to point out...
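For comparison, GCC and Clang headers avoid the null-pointer cast by forwarding offsetof to the builtin from the question. A minimal sketch in C, assuming a GCC-compatible compiler (other compilers fall through to the standard macro); MY_OFFSETOF, struct sample and offsets_agree are hypothetical names used only here:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct with padding between its members on most ABIs. */
struct sample {
    char   tag;
    double value;
};

/* Illustrative wrapper: use the builtin where it is known to exist,
 * otherwise fall back to the standard macro from <stddef.h>. */
#if defined(__GNUC__) || defined(__clang__)
#define MY_OFFSETOF(type, member) __builtin_offsetof(type, member)
#else
#define MY_OFFSETOF(type, member) offsetof(type, member)
#endif

/* Returns 1 when the wrapper and the standard macro give the same offsets. */
static int offsets_agree(void)
{
    return MY_OFFSETOF(struct sample, tag) == offsetof(struct sample, tag)
        && MY_OFFSETOF(struct sample, value) == offsetof(struct sample, value);
}
```

In application code there is rarely a reason to call the builtin directly; including <stddef.h> and using offsetof is the portable spelling.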
MSalters : Undefined behavior, even in C. Multiple reasons, even: redefining a std macro and deref of NULL. Common in stdlib, though, since that's bound by different rules.
Robert S. Barnes : @MSalters - JesperE is correct. See the definition in the stddef.h in the Linux Kernel source code: http://lxr.linux.no/#linux+v2.6.31/include/linux/stddef.h#L24
-
As @litb points out and @JesperE shows, offsetof() provides an integer offset in bytes (as a size_t value).
When might you use it?
One case where it might be relevant is a table-driven operation for reading an enormous number of diverse configuration parameters from a file and stuffing the values into an equally enormous data structure. Reducing enormous down to SO trivial (and ignoring a wide variety of necessary real-world practices, such as defining structure types in headers), I mean that some parameters could be integers and others strings, and the code might look faintly like:
    #include <stddef.h>

    typedef struct config_info config_info;
    struct config_info
    {
        int parameter1;
        int parameter2;
        int parameter3;
        char *string1;
        char *string2;
        char *string3;
        int parameter4;
    } main_configuration;

    typedef struct config_desc config_desc;
    static const struct config_desc
    {
        char *name;
        enum paramtype { PT_INT, PT_STR } type;
        size_t offset;
        int min_val;
        int max_val;
        int max_len;
    } desc_configuration[] =
    {
        { "GIZMOTRON_RATING", PT_INT, offsetof(config_info, parameter1), 0, 100, 0 },
        { "NECROSIS_FACTOR", PT_INT, offsetof(config_info, parameter2), -20, +20, 0 },
        { "GILLYWEED_LEAVES", PT_INT, offsetof(config_info, parameter3), 1, 3, 0 },
        { "INFLATION_FACTOR", PT_INT, offsetof(config_info, parameter4), 1000, 10000, 0 },
        { "EXTRA_CONFIG", PT_STR, offsetof(config_info, string1), 0, 0, 64 },
        { "USER_NAME", PT_STR, offsetof(config_info, string2), 0, 0, 16 },
        { "GIZMOTRON_LABEL", PT_STR, offsetof(config_info, string3), 0, 0, 32 },
    };
You can now write a general function that reads lines from the config file, discarding comments and blank lines. It then isolates the parameter name, and looks that up in the desc_configuration table (which you might sort so that you can do a binary search - multiple SO questions address that). When it finds the correct config_desc record, it can pass the value it found and the config_desc entry to one of two routines - one for processing strings, the other for processing integers.
The key part of those functions is:
    static int validate_set_int_config(const config_desc *desc, char *value)
    {
        int *data = (int *)((char *)&main_configuration + desc->offset);
        ...
        *data = atoi(value);
        ...
    }

    static int validate_set_str_config(const config_desc *desc, char *value)
    {
        char **data = (char **)((char *)&main_configuration + desc->offset);
        ...
        *data = strdup(value);
        ...
    }
This avoids having to write a separate function for each separate member of the structure.
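To make the idea concrete, here is a cut-down sketch in C that fills in the elided validation with one possible approach. It is an assumption about how such code might look, not the code described above: it folds both routines into a single hypothetical set_config function, takes the target structure as a parameter instead of using the global, uses strtol rather than atoi so that malformed numbers can be rejected, and copies strings with malloc/memcpy rather than strdup. It also assumes the structure starts out zero-initialised, so that freeing an old string pointer is safe.

```c
#include <assert.h>   /* assert(): handy when trying the function out */
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Cut-down, hypothetical version of the configuration structure. */
struct config_info {
    int   parameter1;
    char *string1;
};

enum paramtype { PT_INT, PT_STR };

struct config_desc {
    const char    *name;
    enum paramtype type;
    size_t         offset;   /* byte offset of the member in config_info */
    int            min_val;  /* PT_INT only */
    int            max_val;  /* PT_INT only */
    size_t         max_len;  /* PT_STR only */
};

static const struct config_desc desc_configuration[] = {
    { "GIZMOTRON_RATING", PT_INT, offsetof(struct config_info, parameter1), 0, 100, 0 },
    { "EXTRA_CONFIG",     PT_STR, offsetof(struct config_info, string1),    0, 0, 64 },
};

/* Look up `name`, validate `value`, and store it in `cfg`.
 * Returns 0 on success, -1 on an unknown name or a rejected value. */
static int set_config(struct config_info *cfg, const char *name, const char *value)
{
    size_t n = sizeof desc_configuration / sizeof desc_configuration[0];
    for (size_t i = 0; i < n; i++) {
        const struct config_desc *d = &desc_configuration[i];
        if (strcmp(d->name, name) != 0)
            continue;
        char *member = (char *)cfg + d->offset;  /* address of the target member */
        if (d->type == PT_INT) {
            char *end;
            long v = strtol(value, &end, 10);
            if (end == value || *end != '\0' || v < d->min_val || v > d->max_val)
                return -1;
            *(int *)member = (int)v;
        } else {
            size_t len = strlen(value);
            if (len > d->max_len)
                return -1;
            char *copy = malloc(len + 1);
            if (copy == NULL)
                return -1;
            memcpy(copy, value, len + 1);
            free(*(char **)member);              /* drop any previous value */
            *(char **)member = copy;
        }
        return 0;
    }
    return -1;
}
```

With a zero-initialised struct config_info cfg, a call such as set_config(&cfg, "GIZMOTRON_RATING", "42") stores 42 in cfg.parameter1, while the value "101" is rejected by the range check and leaves the member unchanged. Adding a parameter then means adding one table row rather than another function.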
Robert S. Barnes : If you wanted to get really evil you could use a hash table containing the parameter names and indexes into `desc_configuration`. Really amazing example by the way.
Jonathan Leffler : @Robert: this example is closely based on reading data from a configuration file into a big data structure, and reversing the process. I won't bother to explain how it is currently done: suffice to say, there are 300 parameters, about 4500 lines of code in the function that handles it all, and a lot of repetition. I am not in charge of the code - sadly.
Jonathan Leffler : See also: http://stackoverflow.com/questions/1445762/need-way-to-alter-common-fields-in-different-structs
-
The purpose of a built-in __offsetof operator is that the compiler vendor can continue to #define an offsetof() macro, yet have it work with classes that define unary operator&. The typical C macro definition of offsetof() only works when (&lvalue) returns the address of that lvalue, which an overloaded operator& does not guarantee. I.e.
    // C definition, not C++
    #define offsetof(type, member) (int)(&((type *)0)->member)

    struct CFoo {
        struct Evil {
            int operator&() { return 42; }
        };
        Evil foo;
    };

    ptrdiff_t t = offsetof(CFoo, foo); // Would call Evil::operator& and return 42