POD Types Revisited

Some members of the standards committee feel that the current definition of POD types is too strict. They’re proposing changes to the definition of POD classes to solve some of the problems that the C++98 definition causes. However, the new proposal also introduces some new problems, as you will shortly see.

Problems with the C++98 Definition of POD types

The current definition of POD types forces users to make unhealthy design choices in certain scenarios, without any real justification. Let’s look at a few common examples.

Assume a program that has two arrays a1 and a2 of type std::pair<int,int> . Most programmers expect that:

memcpy(a2,a1,sizeof(a2));

should be safe. Indeed, I find it hard to imagine an implementation in which a byte-for-byte copy of std::pair<int,int> wouldn’t work properly. However, C++98 doesn’t guarantee that. It says that byte-for-byte copying is safe only when it applies to POD types. Because std::pair<T,U> has a user-declared constructor, it isn’t a POD type, and therefore using memcpy() in this context causes undefined behavior. In practice, programmers often ignore this and use memcpy() for copying objects that aren’t POD types. This situation isn’t ideal, to say the least. The recent N2062 proposal solves this problem. Before I examine the new proposal, let’s look at other problematic cases.

Consider the following struct. The standard guarantees that byte-for-byte copying of instances of S is safe:

struct S
{
int n;
};
S s1[2],s2[2];
s1[0].n = s1[1].n=9;
memcpy(&s2, &s1, sizeof(s1)); //fine, S is POD

However, if you modify S slightly:

struct T //S with a minor change
{
int n;
T(int val) : n(val) { }
};

All bets are off now. T is no longer a POD type because it has a user-declared constructor. Serializing an array of T objects to a file and then deserializing it isn’t guaranteed to work. By contrast, serializing and deserializing an array of S is safe. This distinction between S and T, which are practically identical in their layouts, seems arbitrary and unintuitive. The new proposal solves this issue by making T a POD type.
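The round trip described above can be sketched as follows. This is only an illustration under stated assumptions: it needs a C++11-or-later compiler, whose std::is_trivially_copyable trait postdates the proposal and roughly corresponds to its "byte-copyable" notion; the helper name round_trip is hypothetical; and an in-memory byte buffer stands in for the file.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>
#include <vector>

struct T
{
    int n;
    T(int val) : n(val) { }
};

// Assumption: C++11 or later. The trait reports whether the language
// permits byte-for-byte copies of T; compilation stops if it does not.
static_assert(std::is_trivially_copyable<T>::value,
              "T must be byte-copyable for memcpy to be well defined");

// Hypothetical helper: copies an array of T into a raw byte buffer
// (standing in for a file) and back, then reports whether the values survived.
bool round_trip()
{
    T original[2] = { T(9), T(42) };

    std::vector<unsigned char> bytes(sizeof(original));
    std::memcpy(bytes.data(), original, sizeof(original));   // "serialize"

    T restored[2] = { T(0), T(0) };
    assert(bytes.size() == sizeof(restored));   // buffer holds exactly two T objects
    std::memcpy(restored, bytes.data(), bytes.size());       // "deserialize"

    return restored[0].n == 9 && restored[1].n == 42;
}
```

Placing the static_assert next to the type turns a silent reliance on byte-copyability into a compile-time check.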

Different access types change PODness:

struct A //POD in C++98
{
int n;
double y;
};

struct B //non-POD in C++98
{
private:
int n;
double y;
};

In practice, all implementations use an identical memory layout for A and B. Programmers therefore expect that serializing an array of B objects and deserializing it later, or copying one array of B objects to another using memcpy(), should be safe. However, B is not a POD type in C++98, and therefore neither serialization nor memcpy() is guaranteed to work. The N2062 proposal solves this problem by allowing a POD type to have private data members, as long as all data members have the same access specifier.

Enter the New POD Definition

The new proposal separates the byte-copyability requirement from the stricter POD requirements. As a result, types that C++98 considers non-POD can still be byte-copyable. Furthermore, the C++98 definition of POD types relies on the definition of aggregates. The new proposal decouples the two concepts.

The new proposal defines a byte-copyable class as:

...a class that has a trivial copy constructor, a trivial copy assignment operator, and a trivial destructor. [Note: Among other requirements, that precludes virtual functions, virtual bases, and members or bases with non-trivial copy constructors, copy assignments, or destructors.]
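The cases listed in the quoted note can be checked mechanically on a modern compiler. The sketch below assumes C++11 or later and uses std::is_trivially_copyable as a stand-in for the proposal's "byte-copyable"; the struct names and the byte_copyable helper are hypothetical and exist only for illustration.

```cpp
#include <string>
#include <type_traits>

// Hypothetical example types, one per case.
struct WithCtor                     // user-declared constructor only
{
    int n;
    WithCtor(int v) : n(v) { }
};

struct WithVirtual                  // has a virtual function
{
    int n;
    virtual void f() { }
};

struct WithCopyCtor                 // user-provided (non-trivial) copy constructor
{
    int n;
    WithCopyCtor(const WithCopyCtor& other) : n(other.n) { }
};

struct WithString                   // member with a non-trivial copy constructor
{
    std::string s;
};

// Hypothetical helper wrapping the standard trait (assumes C++11 or later).
template <class X>
constexpr bool byte_copyable()
{
    return std::is_trivially_copyable<X>::value;
}

static_assert(byte_copyable<WithCtor>(),      "a constructor alone is harmless");
static_assert(!byte_copyable<WithVirtual>(),  "virtual functions are precluded");
static_assert(!byte_copyable<WithCopyCtor>(), "non-trivial copy constructor");
static_assert(!byte_copyable<WithString>(),   "member with non-trivial copy");
```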

This relaxed definition allows types to define a constructor while still remaining byte-copyable. Additionally, such a class may have non-public data members as long as all data members have the same access type. However, it’s unclear whether class C is byte-copyable or not:

class C
{
private:
int x;
private:
char y;
};

According to the C++98 standard, "[t]he order of allocation of nonstatic data members separated by an access-specifier is unspecified (9.2/12)." The standard doesn’t say that the access specifiers have to be different, though. Therefore, class C doesn’t necessarily have the same binary layout as that of class D :

class D
{
private:
int x;
char y;
};

I agree that the ice here is thin. The new proposal should address this issue more explicitly, though.
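One way to see what a particular compiler does with C and D is to compare member offsets. The sketch below is an observation aid, not a guarantee: it assumes a C++11-or-later compiler, adds hypothetical static helpers named offset_of_y (static member functions don't affect object layout), and reports only what the implementation at hand chose.

```cpp
#include <cstddef>

class C
{
private:
    int x;
private:
    char y;
public:
    // Hypothetical helper; a static member function does not change the layout.
    static constexpr std::size_t offset_of_y() { return offsetof(C, y); }
};

class D
{
private:
    int x;
    char y;
public:
    static constexpr std::size_t offset_of_y() { return offsetof(D, y); }
};

// True when this implementation gives C and D the same size and places
// y at the same offset in both.
constexpr bool same_layout()
{
    return sizeof(C) == sizeof(D) && C::offset_of_y() == D::offset_of_y();
}
```

Mainstream compilers are expected to yield true here, but that reflects implementation practice rather than the C++98 wording quoted above. The wording eventually adopted in C++11 ties the ordering rule to members with the same access control rather than to the mere presence of an access-specifier.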

The new proposal defines POD types as a subset of byte-copyable types that meet additional requirements. A POD-struct is a byte-copyable class that:

  • Has no nonstatic data members of type non-POD-struct, non-POD-union (see below), arrays of such types, or reference types.
  • Has the same access control for all of its nonstatic data members (but note 9.2/12 above).
  • Has neither non-POD base classes nor base classes with data members.

The following types are all POD-structs:

struct A{}; //POD in C++98 too
struct B: A{}; //non POD in C++98
struct F{
int x;
F() : x(0) {} //non-POD in C++98
};
class G //non-POD in C++98
{
int y; //all data members have the same access type
int z;
public:
G();
};
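For readers with a C++11-or-later compiler, the two properties the proposal cares about (byte-copyability and a predictable layout) can be probed with standard type traits. This is a sketch under that assumption; the helper name copyable_with_fixed_layout is hypothetical, and the traits reflect the standard as eventually published, which does not match the proposal in every detail.

```cpp
#include <type_traits>

struct A { };
struct B : A { };
struct F
{
    int x;
    F() : x(0) { }
};
class G
{
    int y;   // all data members have the same access type
    int z;
public:
    G();     // declared only; never called here, so no definition is needed
};

// Hypothetical helper: true when X is both trivially copyable and
// standard-layout (assumes C++11 or later).
template <class X>
constexpr bool copyable_with_fixed_layout()
{
    return std::is_trivially_copyable<X>::value
        && std::is_standard_layout<X>::value;
}

static_assert(copyable_with_fixed_layout<A>(), "A");
static_assert(copyable_with_fixed_layout<B>(), "B");
static_assert(copyable_with_fixed_layout<F>(), "F");
static_assert(copyable_with_fixed_layout<G>(), "G");

// Difference from the proposal: in C++11 a user-provided default
// constructor keeps a class from being "trivial".
static_assert(!std::is_trivial<F>::value, "F has a user-provided constructor");
static_assert(!std::is_trivial<G>::value, "G has a user-provided constructor");
```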

Similarly, a POD-union is defined as a union that has no nonstatic data members of type non-POD-struct, non-POD-union, arrays of such types or reference types, and has no user-declared copy assignment operator and no user-declared destructor. A POD class is a class that is either a POD-struct or a POD-union.

The following are POD-unions:

union Empty
{};
union U
{
A a[2];
B b;
};

Needless to say, arithmetic types, pointer types, pointers to members, and enum types (all of which are collectively known as scalar types) remain POD types as before. Notice, however, that references are not POD types. This isn’t exactly a new rule; C++98 originally didn’t address the status of references. A recent Technical Corrigendum fixed this by explicitly excluding references from the set of POD types.

Problems with the New Proposal

Apart from the ambiguity of class C above, the proposed changes will cause some existing non-POD types to become POD types (see examples above). This silent change could disable some optimizations that are applicable to non-POD types exclusively. To ensure that a C++98 non-POD type remains a non-POD type under the new rules, the authors recommend adding an empty destructor definition. For example:

struct B
{
private:
int n;
double y;
};

In C++98 B is a non-POD type since it has non-public nonstatic data members. According to the new proposal, B becomes a POD-type. If this silent change is undesirable (for example, if you still want to allow the compiler to optimize memory layout by reordering members), you need to modify the definition of B to:

struct B
{
private:
int n;
double y;
public:
~B(){}
};
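The effect of the empty destructor can be observed with type traits on a C++11-or-later compiler (an assumption; the traits postdate the proposal). In this sketch the second class is renamed BWithDtor, a hypothetical name, only so that both variants can live in one translation unit.

```cpp
#include <type_traits>

struct B
{
private:
    int n;
    double y;
};

// Same members as B, plus the recommended empty destructor.
struct BWithDtor
{
private:
    int n;
    double y;
public:
    ~BWithDtor() { }
};

// Without the destructor, B is byte-copyable.
static_assert(std::is_trivially_copyable<B>::value,
              "B qualifies for byte-for-byte copying");

// A user-provided destructor, even an empty one, is non-trivial,
// so the class no longer qualifies.
static_assert(!std::is_trivially_destructible<BWithDtor>::value,
              "the empty destructor is not trivial");
static_assert(!std::is_trivially_copyable<BWithDtor>::value,
              "BWithDtor is excluded from byte-for-byte copying");
```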

This recommendation isn’t ideal, though. First, users of third-party classes may not have access to the definitions of classes that they are using. Additionally, adding an empty destructor might tempt code reviewers to remove the seemingly useless destructor (personally, that’s what I do when I see such destructors today!). Finally, if the destructor has the incorrect access type because the access type wasn’t explicitly specified, you will get compilation errors in source files that have compiled perfectly up until now. The best solution is to introduce a transitional compiler flag or macro that guarantees the POD-ness of types.
