offsetof()

These notes add comments to the offsetof() wiki article to help understand its implementation.

The "traditional" implementation of the macro relied on the compiler obtaining the offset of a member by specifying a hypothetical structure that begins at address zero:

#define offsetof(st, m) \
    ((size_t)&(((st *)0)->m))

This can be understood as taking a null pointer of type structure st, and then obtaining the address of member m within said structure. While this implementation works correctly in many compilers, it has generated some debate over whether it is undefined behavior according to the C standard,[2] since it appears to involve a dereference of a null pointer (although, according to the standard, section 6.6 Constant Expressions, Paragraph 9, the value of the object is not accessed by the operation). It also tends to produce confusing compiler diagnostics if one of the arguments is misspelled.[citation needed]

Comment 1: Actually, there is no dereference at all. A dereference occurs when * or -> is applied to an address in order to fetch the value stored there. The only * above appears in a type name used for casting. The -> operator does appear, but it is not used to read the member's value; it is used only to form the member's address. See stackoverflow for more details.

Comment 2: Does &(((st *)0)->m) lead to undefined behavior (UB), given that (st *)0 does not point to any object and ((st *)0)->m is still used? As stackoverflow states, from a language-lawyer point of view the expression should be UB, since you cannot find a reading of the standard under which there is no UB. Because compilers themselves use this form, a compiler may accept the same form in user programs, or it may issue a warning when compiling them.
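To make the discussion concrete, here is a minimal sketch that compares the traditional form with the standard macro from <stddef.h>. The macro name MY_OFFSETOF, the structure sample, and the helper function are hypothetical names invented for this illustration. As Comment 2 notes, the traditional form is formally questionable, so this only shows what mainstream compilers are expected to do in practice.

```c
#include <stddef.h>  /* size_t and the standard offsetof */

/* Hypothetical name, chosen so it does not clash with the standard macro.
   Same body as the "traditional" implementation quoted above. */
#define MY_OFFSETOF(st, m) ((size_t)&(((st *)0)->m))

/* Hypothetical example structure. */
struct sample {
    char tag;
    int  value;
    char name[8];
};

/* Returns 1 if the traditional form agrees with the standard macro
   for every member of struct sample, 0 otherwise. */
int traditional_matches_standard(void)
{
    return MY_OFFSETOF(struct sample, tag)   == offsetof(struct sample, tag)
        && MY_OFFSETOF(struct sample, value) == offsetof(struct sample, value)
        && MY_OFFSETOF(struct sample, name)  == offsetof(struct sample, name);
}
```

Calling traditional_matches_standard() from main should return 1 on compilers where the traditional form works. The actual offsets of value and name depend on the padding the compiler inserts, so no particular numbers are assumed here.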

An alternative is:

#define offsetof(st, m) \
    ((size_t)((char *)&((st *)0)->m - (char *)0))

It may be specified this way because the standard does not specify that the internal representation of the null pointer is at address zero. Therefore the difference between the member address and the base address needs to be made. Again, since these are constant expressions it can be calculated at compile time and not necessarily at run-time.

Comment: The 0 in the offsetof() implementation is a null pointer constant, the same as NULL in the form shown below. The standard does not require a null pointer to be represented internally as address zero; it may be zero or some other value. If it is zero, the form below is fine. If it is not, say it is 0xffffffff, the form below yields 0xffffffff plus the offset of m, so the null pointer's value has to be subtracted.

#define offsetof(st, m) \
    ((size_t)&(((st *)NULL)->m))
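The subtraction idea can also be seen without any null pointer, by subtracting the base address of a real object from the address of one of its members. The sketch below assumes a hypothetical structure record and a hypothetical helper function. It is an illustration of "member address minus base address", not how any particular library defines the macro.

```c
#include <stddef.h>  /* size_t and the standard offsetof */

/* Hypothetical example structure. */
struct record {
    short  id;
    double weight;
    char   code;
};

/* Computes the offset of 'weight' as (member address - base address),
   using a real object so no null pointer is involved. */
size_t weight_offset_by_subtraction(void)
{
    struct record r;  /* only addresses are taken; no member is read */
    return (size_t)((char *)&r.weight - (char *)&r);
}
```

The result should equal offsetof(struct record, weight), whatever base address the object r happens to have. That is the point of taking the difference.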

Some modern compilers (such as GCC) define the macro using a special form (as a language extension) instead, e.g.[3]

#define offsetof(st, m) \
    __builtin_offsetof(st, m)

This builtin is especially useful with C++ classes or structs that declare a custom unary operator &.[4]

Comment: I am not familiar with this.
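As a small illustration of the builtin form in plain C, the sketch below wraps __builtin_offsetof in a hypothetical macro FIELD_OFFSET and falls back to the standard macro elsewhere. It assumes that any compiler defining __GNUC__ (GCC, and Clang in its GCC-compatible mode) provides the builtin. The structure packet and the helper function are made up for the example.

```c
#include <stddef.h>  /* size_t and the standard offsetof */

/* Hypothetical wrapper: use the builtin where it is assumed to exist,
   otherwise fall back to the standard macro. */
#if defined(__GNUC__)
#define FIELD_OFFSET(st, m) __builtin_offsetof(st, m)
#else
#define FIELD_OFFSET(st, m) offsetof(st, m)
#endif

/* Hypothetical example structure. */
struct packet {
    unsigned char kind;
    unsigned int  length;
    unsigned char payload[16];
};

/* Returns the byte offset of 'payload' inside struct packet. */
size_t payload_offset(void)
{
    return FIELD_OFFSET(struct packet, payload);
}
```

Because the builtin takes the type and member name directly, no pointer expression (null or otherwise) appears in the source, which sidesteps the undefined-behavior question raised earlier.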

offsetof() is said to be a compile-time macro, but I do not yet understand why. It may become clear later.
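One way to see the compile-time nature: the C standard requires offsetof to expand to an integer constant expression, so it can appear where only constants are allowed. The sketch below uses it in a static assertion, an enumerator, and a file-scope array size. All of these must be known to the compiler before the program runs. The structure header and the other names are hypothetical, and static_assert assumes a C11 or later compiler.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* size_t and the standard offsetof */

/* Hypothetical example structure. */
struct header {
    char           magic[4];
    unsigned short version;
    unsigned int   size;
};

/* Checked by the compiler; compilation fails if the condition is false. */
static_assert(offsetof(struct header, magic) == 0,
              "magic must be the first member");

/* An enumerator's value must be an integer constant expression. */
enum { VERSION_OFFSET = offsetof(struct header, version) };

/* A file-scope array size must also be a constant expression. */
char scratch[offsetof(struct header, size) + 1];

/* Returns the size the compiler chose for 'scratch'. */
size_t scratch_size(void)
{
    return sizeof scratch;
}
```

If offsetof were computed only at run time, none of these three uses would compile.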
