Why are member function pointers 16 bytes in C++?

hazir

2018-09-06 01:58

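When we talk about pointers, we usually assume they can be represented by a void * pointer, which is 8 bytes wide on x86_64. For example, here is an excerpt from the Wikipedia article on x86_64: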

Pushes and pops on the stack are always in 8-byte strides, and pointers are 8 bytes wide.

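From the CPU's point of view, a pointer is nothing more than a memory address, and every memory address on x86_64 is represented with 64 bits, so assuming 8 bytes is correct. This is easy to verify by printing the sizes of a few different pointer types.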

#include <iostream>

int main() {
    std::cout <<
        "sizeof(int*)      == " << sizeof(int*) << "\n"
        "sizeof(double*)   == " << sizeof(double*) << "\n"
        "sizeof(void(*)()) == " << sizeof(void(*)()) << std::endl;
}

Compile and run the program above, and the output shows that every one of these pointers is 8 bytes wide:

$ uname -i
x86_64
$ g++ -Wall ./example.cc
$ ./a.out
sizeof(int*)      == 8
sizeof(double*)   == 8
sizeof(void(*)()) == 8

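However, C++ has one special case: pointers to member functions. Interestingly, a pointer to member function is twice the size of any other pointer. The simple program below confirms this; built the same way as above, it prints "16":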

#include <iostream>

struct Foo {
    void bar() const { }
};

int main() {
    std::cout << sizeof(&Foo::bar) << std::endl;
}

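Does this mean Wikipedia is wrong? Of course not! From the hardware's point of view, every pointer is still 8 bytes. So what is a pointer to member function? It is a C++ language feature: a pointer to member function does not map directly onto the hardware. It is implemented by the runtime (the compiler), which adds some overhead and usually costs some performance. The C++ language specification says nothing about implementation details and does not explain how this kind of pointer is represented. Fortunately, the Itanium C++ ABI specification documents the implementation details of the C++ runtime. For example, it explains how virtual tables, RTTI, and exceptions are implemented, and §2.3 covers member pointers: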

A pointer to member function is a pair as follows:

ptr:

For a non-virtual function, this field is a simple function pointer. For a virtual function, it is 1 plus the virtual table offset (in bytes) of the function, represented as a ptrdiff_t. The value zero represents a NULL pointer, independent of the adjustment field value below.

adj:

The required adjustment to this, represented as a ptrdiff_t.

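So a pointer to member function is 16 bytes rather than 8 because, after the plain function pointer, it also has to store how to adjust the "this" pointer (which is always passed implicitly to non-static member functions). The ABI specification does not say why or when the this pointer needs to be adjusted. It may not be obvious at first, so let's start with the following class inheritance example: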

struct A {
    void foo() const { }
    char pad0[32];
};

struct B {
    void bar() const { }
    char pad2[64];
};

struct C : A, B
{ };

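A and B each have one non-static member function and one data member. Both methods can reach the data members of their class through the "this" pointer that is passed to them implicitly. To access any data member, an offset is added to the "this" pointer; the offset is the distance from the base address of the class object to the data member and can be represented as a ptrdiff_t. Things get more complicated with multiple inheritance, however. What happens when a class C inherits from both A and B? The compiler lays out A and B together in memory, with B placed after A, so methods of A and methods of B see different values of the this pointer. This is easy to verify in practice: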

#include <iostream>

struct A {
    void foo() const {
        std::cout << "A's this: " << this << std::endl;
    }
    char pad0[32];
};

struct B {
    void bar() const {
        std::cout << "B's this: " << this << std::endl;
    }
    char pad2[64];
};

struct C : A, B
{ };

int main()
{
    C obj;
    obj.foo();
    obj.bar();
}

$ g++ -Wall -o test ./test.cc && ./test
A's this: 0x7fff57ddfb48
B's this: 0x7fff57ddfb68

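As you can see, the "this" pointer passed to B's method is 32 bytes larger than the one passed to A's method, which is exactly the size of an A object. But what happens when we call a method of class C through a pointer, using a function like the one below?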

void call_by_ptr(const C &obj, void (C::*mem_func)() const) {
    (obj.*mem_func)();
}

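Depending on which function is called, a different "this" pointer value has to be passed to it. But call_by_ptr does not know whether it was given a pointer to foo() or to bar(); the only point where that information is available is where the address of the method is taken, so it has to travel with the pointer itself. This is why a pointer to member function needs to know how to adjust the this pointer before the call. Now let's put everything together in one simple program that shows how this works internally: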

#include <cstring>   // std::memcpy
#include <iostream>

struct A {
    void foo() const {
        std::cout << "A's this:\t" << this << std::endl;
    }
    char pad0[32];
};

struct B {
    void bar() const {
        std::cout << "B's this:\t" << this << std::endl;
    }
    char pad2[64];
};

struct C : A, B
{ };

void call_by_ptr(const C &obj, void (C::*mem_func)() const)
{
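    // Copy the raw bytes of the member function pointer so that both words
    // (the function pointer and the this adjustment) can be printed.
    void *data[2];
    static_assert(sizeof(data) == sizeof(mem_func),
                  "expects a two-word member function pointer");
    std::memcpy(data, &mem_func, sizeof(mem_func));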
    std::cout << "------------------------------\n"
        "Object ptr:\t" << &obj <<
        "\nFunction ptr:\t" << data[0] <<
        "\nPointer adj:\t" << data[1] << std::endl;
    (obj.*mem_func)();
}

int main()
{
    C obj;
    call_by_ptr(obj, &C::foo);
    call_by_ptr(obj, &C::bar);
}

The program above prints the following:

------------------------------
Object ptr:    0x7fff535dfb28
Function ptr:  0x10c620cac
Pointer adj:   0
A's this:    0x7fff535dfb28
------------------------------
Object ptr:    0x7fff535dfb28
Function ptr:  0x10c620cfe
Pointer adj:   0x20
B's this:    0x7fff535dfb48

Hopefully this article makes the matter a little clearer.

Translated from: http://741mhz.com/wide-pointers/
