Byte alignment and why it exists

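A 32-bit system aligns to 4 bytes by default (when every field is smaller than 4 bytes, the class is aligned to its largest field; a field of 4 bytes or more is aligned to 4 bytes). This matches the GCC class dumps below; other 32-bit ABIs may differ: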

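The reason lies in the address bus: the two lowest address lines do not take part in addressing, so memory can only be fetched at addresses that are multiples of 4. That is why the default alignment is 4 bytes.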

Pointer size: 4 bytes (an address on a 32-bit machine is 32 bits wide, which covers a 4 GiB address space).
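The rule above can be written down as a small calculation. The following is a minimal sketch, not the compiler's actual algorithm: `compute_layout`, `round_up` and `Layout` are hypothetical names, and the sketch assumes every field is a scalar whose natural alignment equals its (non-zero) size. The `cap` parameter is the default alignment: 4 for the 32-bit case, 8 for the 64-bit case.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Result of laying out a sequence of fields.
struct Layout {
    std::size_t size;   // total size, including tail padding
    std::size_t align;  // alignment of the whole class
};

// Round `offset` up to the next multiple of `align` (align must be non-zero).
inline std::size_t round_up(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// Hypothetical helper: applies the rule from the text to a list of scalar
// field sizes. Assumes each field's natural alignment equals its size.
inline Layout compute_layout(const std::vector<std::size_t>& field_sizes,
                             std::size_t cap) {
    std::size_t offset = 0;
    std::size_t max_align = 1;
    for (std::size_t size : field_sizes) {
        std::size_t align = std::min(size, cap);  // field alignment, capped
        offset = round_up(offset, align);         // padding before the field
        offset += size;
        max_align = std::max(max_align, align);
    }
    return {round_up(offset, max_align), max_align};  // add tail padding
}
```

Under these assumptions, `compute_layout({8, 4}, 4)` gives size 12 and alignment 4, while `compute_layout({8, 4}, 8)` gives size 16 and alignment 8, which agrees with the `double` plus `int` dumps for the 32-bit and 64-bit cases.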

class DBase{
  //double b;
  char c;
  //int a;
};

Class DBase
   size=1 align=1   //when every field is smaller than 4 bytes, align to the largest field
   base size=1 base align=1
DBase (0x7fd0dcd3fc40) 0
 

class DBase{
  //double b;
  char c;
  short d;
  //int a;
};

Class DBase
   size=4 align=2
   base size=4 base align=2
DBase (0x7fe1be2f0d20) 0
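To see where the extra byte in size=4 comes from, the field offsets can be inspected with `offsetof`. This is a sketch under the assumption that `short` is 2 bytes with 2-byte alignment, as on mainstream ABIs; `CharShort` and `padding_before_d` are illustrative names, and a `struct` is used only so the members are accessible.

```cpp
#include <cassert>
#include <cstddef>

// Same fields as DBase, declared as a struct so that offsetof can name them.
struct CharShort {
    char c;   // offset 0
    // one padding byte is inserted here so that d starts on an even address
    short d;  // offset 2, assuming a 2-byte short with 2-byte alignment
};

// Number of padding bytes between the end of c and the start of d.
inline std::size_t padding_before_d() {
    return offsetof(CharShort, d) - (offsetof(CharShort, c) + sizeof(char));
}
```

The 1 byte of `c`, 1 byte of padding and 2 bytes of `d` add up to the reported size of 4.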

 

class DBase{
  //double b;
  int a;
}; //cout<<sizeof(DBase)<<endl; // 4
Class DBase
   size=4 align=4
   base size=4 base align=4
DBase (0x7f5c45c6fc40) 0

 

class DBase{
  double b; //field larger than 4 bytes
  int a;
};

Class DBase
   size=12 align=4
   base size=12 base align=4
DBase (0x7fbff5093c40) 0
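The size=12 align=4 layout can be approximated on a 64-bit toolchain by capping member alignment at 4 bytes. The sketch below uses `#pragma pack`, a compiler extension accepted by GCC, Clang and MSVC rather than standard C++; it imitates the 32-bit result and is not the 32-bit ABI itself. `Packed4` and `Natural` are illustrative names, and a 4-byte `int` and 8-byte `double` are assumed.

```cpp
#include <cassert>
#include <cstddef>

// Cap member alignment at 4 bytes for the struct declared in this region.
#pragma pack(push, 4)
struct Packed4 {
    double b;  // offset 0, alignment capped at 4
    int a;     // offset 8
};
#pragma pack(pop)

// The same fields with the target's natural alignment, for comparison.
struct Natural {
    double b;
    int a;
};
```

With the cap in place no tail padding is needed, so the two fields occupy exactly 8 + 4 = 12 bytes.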

 

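A 64-bit system aligns to 8 bytes by default (see The Intel 64 and IA-32 Architectures Software Developer's Manual):

(when every field is smaller than 8 bytes, the class is aligned to its largest field; a field of 8 bytes or more is aligned to 8 bytes)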

Pointer size: 8 bytes

For performance, 8-byte data should be 8-byte aligned; this also affects the CPU cache hit rate.

There are multiple hardware components that may be adversely affected by unaligned loads or stores.

  • The interface to memory might be eight bytes wide and only able to access memory at multiples of eight bytes. Loading an unaligned eight-byte double then requires two reads on the bus. Stores are worse, because an aligned eight-byte store can simply write eight bytes to memory, but an unaligned eight-byte store must read two eight-byte pieces, merge the new data with the old data, and write two eight-byte pieces. In short: the memory interface may only access addresses that are multiples of 8, so an unaligned 8-byte value takes two reads, and unaligned stores cost even more.
  • Cache lines are typically 32 or 64 bytes. If eight-byte objects are aligned to multiples of eight bytes, then each object is in just one cache line. If they are unaligned, then some of the objects are partly in one cache line and partly in another. Loading or storing these objects then requires using two cache lines instead of one. This effect occurs at all levels of cache (three levels is not uncommon in modern processors). In short: three cache levels are common and cache lines are mostly 32 or 64 bytes; an unaligned object can straddle two cache lines, which makes loads and stores more complex and slower.
  • Memory system pages are typically 512 bytes or more. Again, each aligned object is in just one page, but some unaligned objects are in multiple pages. Each page that is accessed requires hardware resources: The virtual address must be translated to a physical address, this may require accessing translation tables, and address collisions must be detected. (Processors may have multiple load and store operations in operation simultaneously. Even though your program may appear to be single-threaded, the processor reads instructions in advance and tries to execute those that it can. So a processor may start a load instruction before preceding instructions have completed. However, to be sure this does not cause an error, the processor checks each load instruction to be sure it is not loading from an address that a prior store instruction is changing. If an access crosses a page boundary, the two parts of the loaded data have to be checked separately.) In short: pages are 512 bytes or larger, unaligned data can end up spread across more pages, and every page accessed consumes hardware resources. It all comes down to resources and performance.
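All three points reduce to the same arithmetic: how many aligned blocks does an access touch? The sketch below is only an address calculation, not a model of real hardware; `blocks_touched` and `is_aligned` are hypothetical helpers, and the block sizes used with them (8 for the bus width, 64 for a cache line, 4096 for a page) are illustrative values.

```cpp
#include <cassert>
#include <cstdint>

// Number of aligned blocks of `block` bytes touched by an access of `size`
// bytes starting at `addr`. Requires size > 0 and block > 0.
inline std::uint64_t blocks_touched(std::uint64_t addr, std::uint64_t size,
                                    std::uint64_t block) {
    std::uint64_t first = addr / block;              // block holding the first byte
    std::uint64_t last = (addr + size - 1) / block;  // block holding the last byte
    return last - first + 1;
}

// True when `addr` is a multiple of `align` (align must be non-zero).
inline bool is_aligned(std::uint64_t addr, std::uint64_t align) {
    return addr % align == 0;
}
```

An 8-byte value at address 16 stays within one 8-byte block, while the same value at address 12 touches two; at address 60 it also straddles two 64-byte cache lines.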

class DBase{
  //double b;
  char c;
  //int a;
};

Class DBase
   size=1 align=1
   base size=1 base align=1
DBase (0x7f462135fd20) 0

 

class DBase{
  //double b;
  int a;
}; //cout<<sizeof(DBase)<<endl; // 4

Class DBase
   size=4 align=4 //the largest field is 4 bytes, so the class is 4-byte aligned
   base size=4 base align=4
DBase (0x7f9b7275cd20) 0

class DBase{
  double b;
  int a;
};

Class DBase
   size=16 align=8  //8-byte alignment
   base size=12 base align=8
DBase (0x7fb6489fcd20) 0
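The gap between size=16 and base size=12 appears to be tail padding: the fields end at byte 12, and the class is padded to a multiple of its alignment so that every element of an array stays 8-byte aligned. The sketch below illustrates this; `DoubleInt`, `tail_padding` and `all_elements_aligned` are illustrative names, a 4-byte `int` is assumed, and `alignas(8)` pins the layout so that it does not depend on the target (it is redundant on x86-64).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same fields as DBase, declared as a struct so that offsetof can name them.
struct DoubleInt {
    alignas(8) double b;  // offset 0
    int a;                // offset 8
    // 4 bytes of tail padding follow, giving sizeof == 16
};

// Bytes of tail padding after the last field.
inline std::size_t tail_padding() {
    return sizeof(DoubleInt) - (offsetof(DoubleInt, a) + sizeof(int));
}

// True if every element of an array of n DoubleInt starts on an 8-byte boundary.
inline bool all_elements_aligned(const DoubleInt* arr, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        if (reinterpret_cast<std::uintptr_t>(&arr[i]) % 8 != 0) return false;
    }
    return true;
}
```

Without the 4 padding bytes, the second element of an array would start at offset 12 and its `double` would no longer sit on an 8-byte boundary.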
