ATL Under the Hood, Part 1

Preface: this series of blog posts is translated from an article by Zeeshan Amjad on The Code Project.

Title: ATL Under the Hood - Part 1

Original article: http://www.codeproject.com/Articles/1769/ATL-Under-the-Hood-Part-1

 

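Introduction

In this series we discuss the internal mechanisms and technical details of ATL.

Let's start with the memory layout of a program. Below is a simple example in which the class has no data members; let's look at its layout in memory:

Program 1.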

#include <iostream>
using namespace std;
class Class {
};
int main() {
        Class objClass;
        cout << "Size of object is = " << sizeof(objClass) << endl;
        cout << "Address of object is = " << &objClass << endl;
        return 0;
}

The output of the program is:

Size of object is = 1
Address of object is = 0012FF7C
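If we add data members to the class, the size of the class is the sum of the sizes of its data members (plus any alignment padding the compiler inserts). The same rule applies to class templates. The class template below defines a Point class.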

Program 2.

#include <iostream>
using namespace std;

template <typename T>
class CPoint {
public:
        T m_x;
        T m_y;
};

int main() {
        CPoint<int> objPoint;
        cout << "Size of object is = " << sizeof(objPoint) << endl;
        cout << "Address of object is = " << &objPoint << endl;
        return 0;
}

The output is now:

Size of object is = 8
Address of object is = 0012FF78
Next we add inheritance. We derive a Point3D class from the Point class and look at its memory layout:

Program 3.

#include <iostream>
using namespace std;

template <typename T>
class CPoint {
public:
        T m_x;
        T m_y;
};

template <typename T>
class CPoint3D : public CPoint<T> {
public:
        T m_z;
};

int main() {
        CPoint<int> objPoint;
        cout << "Size of object Point is = " << sizeof(objPoint) << endl;
        cout << "Address of object Point is = " << &objPoint << endl;

        CPoint3D<int> objPoint3D;
        cout << "Size of object Point3D is = " << sizeof(objPoint3D) << endl;
        cout << "Address of object Point3D is = " << &objPoint3D << endl;

        return 0;
}

The output of the program is now:

Size of object Point is = 8
Address of object Point is = 0012FF78
Size of object Point3D is = 12
Address of object Point3D is = 0012FF6C
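This program shows the memory layout of a derived class: the memory occupied by the derived class is the memory of its own data members plus the memory of its base class. Things change, however, once a class has a virtual function. Look at the following example:

Program 4.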

#include <iostream>
using namespace std;

class Class {
public:
        virtual void fun() { cout << "Class::fun" << endl; }
};

int main() {
        Class objClass;
        cout << "Size of Class = " << sizeof(objClass) << endl;
        cout << "Address of Class = " << &objClass << endl;
        return 0;
}

The output is:

Size of Class = 4
Address of Class = 0012FF7C

Let's see what happens when the class contains more than one virtual function:

Program 5.

#include <iostream>
using namespace std;

class Class {
public:
        virtual void fun1() { cout << "Class::fun1" << endl; }
        virtual void fun2() { cout << "Class::fun2" << endl; }
        virtual void fun3() { cout << "Class::fun3" << endl; }
};

int main() {
        Class objClass;
        cout << "Size of Class = " << sizeof(objClass) << endl;
        cout << "Address of Class = " << &objClass << endl;
        return 0;
}
The output of this program is the same as that of Program 4 above. Let's experiment further. Look at the following code.

Program 6.

#include <iostream>
using namespace std;

class CPoint {
public:
        int m_ix;
        int m_iy;
        virtual ~CPoint() { };
};

int main() {
        CPoint objPoint;
        cout << "Size of Class = " << sizeof(objPoint) << endl;
        cout << "Address of Class = " << &objPoint << endl;
        return 0;
}

The output of this program is:

Size of Class = 12
Address of Class = 0012FF68
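These programs show that when we add a virtual function to a class, its size grows by the size of one pointer (4 bytes in the 32-bit Visual C++ build used here, the same as the size of an int). This means the class has three slots: one for x, one for y, and one for something that handles virtual functions, which we call the virtual pointer. First let's find out where the virtual pointer sits: at the beginning or at the end of the object. We do this by accessing the object's memory directly. So we first store the address of the object in an int pointer, and then use pointer arithmetic to examine the virtual pointer.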

Program 7.

#include <iostream>
using namespace std;

class CPoint {
public:
        int m_ix;
        int m_iy;
        CPoint(const int p_ix = 0, const int p_iy = 0) : 
                m_ix(p_ix), m_iy(p_iy) { 
        }
        int getX() const {
                return m_ix;
        }
        int getY() const {
                return m_iy;
        }
        virtual ~CPoint() { };
};

int main() {
        CPoint objPoint(5, 10);

        int* pInt = (int*)&objPoint;
        *(pInt+0) = 100;        // try to change the value of x
        *(pInt+1) = 200;        // try to change the value of y

        cout << "X = " << objPoint.getX() << endl;
        cout << "Y = " << objPoint.getY() << endl;

        return 0;
}

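The key part of this program is:

        int* pInt = (int*)&objPoint;
        *(pInt+0) = 100;        // try to change the value of x
        *(pInt+1) = 200;        // try to change the value of y
We store the address of the object in an int pointer and then manipulate the object through that pointer. The output of the program is: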
X = 200
Y = 10
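This is not the result we wanted. It shows that 200 was stored in the memory of m_ix. This means that m_ix, the first data member of the object, sits in the second slot from the start of the object, not the first. In fact, the first slot holds the virtual pointer, and the data members follow it. If we change the two lines to

        *(pInt+1) = 100;        // try to change the value of x
        *(pInt+2) = 200;        // try to change the value of y
we get the result we want. The complete program is:

Program 8.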

#include <iostream>
using namespace std;

class CPoint {
public:
        int m_ix;
        int m_iy;
        CPoint(const int p_ix = 0, const int p_iy = 0) : 
                m_ix(p_ix), m_iy(p_iy) { 
        }
        int getX() const {
                return m_ix;
        }
        int getY() const {
                return m_iy;
        }
        virtual ~CPoint() { };
};

int main() {
        CPoint objPoint(5, 10);

        int* pInt = (int*)&objPoint;
        *(pInt+1) = 100;        // try to change the value of x
        *(pInt+2) = 200;        // try to change the value of y

        cout << "X = " << objPoint.getX() << endl;
        cout << "Y = " << objPoint.getY() << endl;

        return 0;
}

The output of the program is:

X = 100
Y = 200
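The code above shows that when we add a virtual function to a class, the virtual pointer is placed at the very beginning of the object's memory.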

Now the question is: what exactly is stored in the virtual pointer? To answer it, let's look at the following code:

Program 9.

#include <iostream>

using namespace std;

class Class {
        virtual void fun() { cout << "Class::fun" << endl; }
};

int main() {
        Class objClass;

        cout << "Address of virtual pointer " << (int*)(&objClass+0) << endl;
        cout << "Value at virtual pointer " << (int*)*(int*)(&objClass+0) << endl;
        return 0;
}

The output of the program is:

Address of virtual pointer 0012FF7C
Value at virtual pointer 0046C060
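The virtual pointer stores the address of the virtual table, and the virtual table stores the addresses of all the virtual functions of the class. In other words, the virtual table is an array of virtual function addresses. Let's look at the following program:

Program 10.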

#include <iostream>
using namespace std;

class Class {
        virtual void fun() { cout << "Class::fun" << endl; }
};

typedef void (*Fun)(void);

int main() {
        Class objClass;

        cout << "Address of virtual pointer " << (int*)(&objClass+0) << endl;
        cout << "Value at virtual pointer i.e. Address of virtual table " 
                 << (int*)*(int*)(&objClass+0) << endl;
        cout << "Value at first entry of virtual table " 
                 << (int*)*(int*)*(int*)(&objClass+0) << endl;

        cout << endl << "Executing virtual function" << endl << endl;
        Fun pFun = (Fun)*(int*)*(int*)(&objClass+0);
        pFun();
        return 0;
}
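This program contains some unusual casts and dereferences. The most important line is:
        Fun pFun = (Fun)*(int*)*(int*)(&objClass+0);
Here Fun is a function pointer type defined with typedef:
        typedef void (*Fun)(void);
Let's walk through this long chain of dereferences. (int*)(&objClass+0) is the address of the virtual pointer, which sits at the start of the object, cast to int*. Applying the dereference operator * fetches the virtual pointer itself, and casting that result to int* gives the address of the virtual table. To get the value stored there, that is, the address of the class's first virtual function, we dereference once more and cast the result to the appropriate function pointer type (our Fun). So
        Fun pFun = (Fun)*(int*)*(int*)(&objClass+0);
means: fetch the first entry of the virtual table, cast it to type Fun, and store it in the variable pFun.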


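What happens if we add one more virtual function to the class? Now we want to read the second entry of the virtual table. Let's look at the following program:

Program 11.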

#include <iostream>
using namespace std;

class Class {
        virtual void f() { cout << "Class::f" << endl; }
        virtual void g() { cout << "Class::g" << endl; }
};

int main() {
        Class objClass;

        cout << "Address of virtual pointer " << (int*)(&objClass+0) << endl;
        cout << "Value at virtual pointer i.e. Address of virtual table " 
                << (int*)*(int*)(&objClass+0) << endl;

        cout << endl << "Information about VTable" << endl << endl;
        cout << "Value at 1st entry of VTable " 
                << (int*)*((int*)*(int*)(&objClass+0)+0) << endl;
        cout << "Value at 2nd entry of VTable " 
                << (int*)*((int*)*(int*)(&objClass+0)+1) << endl;
        
        return 0;
}

The output of the program is:

Address of virtual pointer 0012FF7C
Value at virtual pointer i.e. Address of virtual table 0046C0EC

Information about VTable

Value at 1st entry of VTable 0040100A
Value at 2nd entry of VTable 0040129E


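A question arises naturally: how does the compiler know the length of a virtual table? The answer given here is that the last entry of the virtual table is NULL. (This is what this Visual C++ build shows; the language does not guarantee it, and the compiler already knows the number of virtual functions at compile time.) Let's modify the program slightly:

Program 12.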

#include <iostream>
using namespace std;

class Class {
        virtual void f() { cout << "Class::f" << endl; }
        virtual void g() { cout << "Class::g" << endl; }
};

int main() {
        Class objClass;

        cout << "Address of virtual pointer " << (int*)(&objClass+0) << endl;
        cout << "Value at virtual pointer i.e. Address of virtual table " 
                 << (int*)*(int*)(&objClass+0) << endl;

        cout << endl << "Information about VTable" << endl << endl;
        cout << "Value at 1st entry of VTable " 
                 << (int*)*((int*)*(int*)(&objClass+0)+0) << endl;
        cout << "Value at 2nd entry of VTable " 
                 << (int*)*((int*)*(int*)(&objClass+0)+1) << endl;
        cout << "Value at 3rd entry of VTable " 
                 << (int*)*((int*)*(int*)(&objClass+0)+2) << endl;
        cout << "Value at 4th entry of VTable " 
                 << (int*)*((int*)*(int*)(&objClass+0)+3) << endl;

        return 0;
}

The output is:

Address of virtual pointer 0012FF7C
Value at virtual pointer i.e. Address of virtual table 0046C134

Information about VTable

Value at 1st entry of VTable 0040100A
Value at 2nd entry of VTable 0040129E
Value at 3rd entry of VTable 00000000
Value at 4th entry of VTable 73616C43
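The output of this program shows that the entry after the last virtual function in the virtual table is NULL. Next we call the virtual functions in the same way as before:

Program 13.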

#include <iostream>
using namespace std;

class Class {
        virtual void f() { cout << "Class::f" << endl; }
        virtual void g() { cout << "Class::g" << endl; }
};

typedef void(*Fun)(void);

int main() {
        Class objClass;

        Fun pFun = NULL;

        // calling 1st virtual function
        pFun = (Fun)*((int*)*(int*)(&objClass+0)+0);
        pFun();
        
        // calling 2nd virtual function
        pFun = (Fun)*((int*)*(int*)(&objClass+0)+1);
        pFun();

        return 0;
}
The output of the program is:
Class::f
Class::g
Now let's look at multiple inheritance. Start with the following simple example:

Program 14.

 

#include <iostream>
using namespace std;

class Base1 {
public:
        virtual void f() { }
};

class Base2 {
public:
        virtual void f() { }
};

class Base3 {
public:
        virtual void f() { }
};

class Drive : public Base1, public Base2, public Base3 {
};

int main() {
        Drive objDrive;
        cout << "Size is = " << sizeof(objDrive) << endl;
        return 0;
}

The output is:

Size is = 12
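This program shows that when a derived class inherits from several base classes, it contains the virtual pointers of all of them.

What happens if the derived class also has virtual functions of its own? To understand multiple inheritance better, look at the following program:

Program 15.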

#include <iostream>
using namespace std;

class Base1 {
        virtual void f() { cout << "Base1::f" << endl; }
        virtual void g() { cout << "Base1::g" << endl; }
};

class Base2 {
        virtual void f() { cout << "Base2::f" << endl; }
        virtual void g() { cout << "Base2::g" << endl; }
};

class Base3 {
        virtual void f() { cout << "Base3::f" << endl; }
        virtual void g() { cout << "Base3::g" << endl; }
};

class Drive : public Base1, public Base2, public Base3 {
public:
        virtual void fd() { cout << "Drive::fd" << endl; }
        virtual void gd() { cout << "Drive::gd" << endl; }
};

typedef void(*Fun)(void);

int main() {
        Drive objDrive;

        Fun pFun = NULL;

        // call the 1st virtual function of Base1
        pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+0);
        pFun();

        // call the 2nd virtual function of Base1
        pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+1);
        pFun();

        // call the 1st virtual function of Base2
        pFun = (Fun)*((int*)*(int*)((int*)&objDrive+1)+0);
        pFun();

        // call the 2nd virtual function of Base2
        pFun = (Fun)*((int*)*(int*)((int*)&objDrive+1)+1);
        pFun();

        // call the 1st virtual function of Base3
        pFun = (Fun)*((int*)*(int*)((int*)&objDrive+2)+0);
        pFun();

        // call the 2nd virtual function of Base3
        pFun = (Fun)*((int*)*(int*)((int*)&objDrive+2)+1);
        pFun();

        // call the 1st virtual function of the derived class
        pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+2);
        pFun();

        // call the 2nd virtual function of the derived class
        pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+3);
        pFun();

        return 0;
}

The output of the program is:

Base1::f
Base1::g
Base2::f
Base2::g
Base3::f
Base3::g
Drive::fd
Drive::gd



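We can use static_cast to obtain the offset of a base class's virtual pointer within the derived class. Let's look at the following example:

Program 16.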

#include <windows.h>    // for DWORD
#include <iostream>
using namespace std;

class Base1 {
public:
        virtual void f() { }
};

class Base2 {
public:
        virtual void f() { }
};

class Base3 {
public:
        virtual void f() { }
};

class Drive : public Base1, public Base2, public Base3 {
};

// Any non-zero value: static_cast leaves a null pointer unchanged, so 0 would always give offset 0
#define SOME_VALUE      1

int main() {
        cout << (DWORD)static_cast<Base1*>((Drive*)SOME_VALUE)-SOME_VALUE << endl;
        cout << (DWORD)static_cast<Base2*>((Drive*)SOME_VALUE)-SOME_VALUE << endl;
        cout << (DWORD)static_cast<Base3*>((Drive*)SOME_VALUE)-SOME_VALUE << endl;
        return 0;
}
ATL's ATLDEF.H defines the macro offsetofclass, which returns the offset of a base class's virtual pointer within the derived class's object layout.
#define offsetofclass(base, derived) \
       ((DWORD)(static_cast<base*>((derived*)_ATL_PACKING))-_ATL_PACKING)
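// Start of translator's note
Before using this macro, let's explain the program above. From Program 15 we can infer that the base classes' virtual pointers are stored at the very beginning of the derived object and that each takes 4 bytes: the virtual pointer for Base1 is at offset 0, the one for Base2 at offset 4, and the one for Base3 at offset 8. Now look at the key expression:
(DWORD)static_cast<Base1*>((Drive*)SOME_VALUE)-SOME_VALUE
First, (Drive*)SOME_VALUE casts 1 to an address, 0x00000001 (the number of digits depends on the machine's address space).
(DWORD)static_cast<Base1*>((Drive*)SOME_VALUE) then applies static_cast to the address 0x00000001. Because Base1 sits at the start of Drive, the result is unchanged. Subtracting 0x00000001 at the end gives Base1's offset, 0. In the same way, the static_cast to Base2 yields the original address plus 4, and subtracting the original address gives the offset; Base3 is analogous.
// End of translator's note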
Next, let's look at another example.

Program 17.

#include <windows.h>
#include <iostream>
using namespace std;

class Base1 {
public:
        virtual void f() { }
};

class Base2 {
public:
        virtual void f() { }
};

class Base3 {
public:
        virtual void f() { }
};

class Drive : public Base1, public Base2, public Base3 {
};

#define _ATL_PACKING 8

#define offsetofclass(base, derived) \
        ((DWORD)(static_cast<base*>((derived*)_ATL_PACKING))-_ATL_PACKING)

int main() {
        cout << offsetofclass(Base1, Drive) << endl;
        cout << offsetofclass(Base2, Drive) << endl;
        cout << offsetofclass(Base3, Drive) << endl;
        return 0;
}


The output of this program is:

0
4
8
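The output shows that the macro returns the offset of a given base class's virtual pointer within the derived class. Don Box's book Essential COM uses a similar macro for the same purpose. Let's replace the ATL macro with Box's macro and make small changes to the program:

Program 18.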

#include <windows.h>
#include <iostream>
using namespace std;

class Base1 {
public:
        virtual void f() { }
};

class Base2 {
public:
        virtual void f() { }
};

class Base3 {
public:
        virtual void f() { }
};

class Drive : public Base1, public Base2, public Base3 {
};

#define BASE_OFFSET(ClassName, BaseName) \
        (DWORD(static_cast<BaseName*>(reinterpret_cast<ClassName*>\
        (0x10000000))) - 0x10000000)

int main() {
        cout << BASE_OFFSET(Drive, Base1) << endl;
        cout << BASE_OFFSET(Drive, Base2) << endl;
        cout << BASE_OFFSET(Drive, Base3) << endl;
        return 0;
}
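The purpose and the result of this program are exactly the same as those of the previous one.
Now let's use the macro in our own program. In fact, by looking at the memory layout of the derived class we can obtain the offset of each base class's virtual pointer within the derived class, and then use that offset to call the virtual function of a specific base class.

Program 19.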

#include <windows.h>
#include <iostream>
using namespace std;

class Base1 {
public:
        virtual void f() { cout << "Base1::f()" << endl; }
};

class Base2 {
public:
        virtual void f() { cout << "Base2::f()" << endl; }
};

class Base3 {
public:
        virtual void f() { cout << "Base3::f()" << endl; }
};

class Drive : public Base1, public Base2, public Base3 {
};

#define _ATL_PACKING 8

#define offsetofclass(base, derived) \
        ((DWORD)(static_cast<base*>((derived*)_ATL_PACKING))-_ATL_PACKING)

int main() {
        Drive d;

        void* pVoid = NULL;

        // call function of Base1
        pVoid = (char*)&d + offsetofclass(Base1, Drive);
        ((Base1*)(pVoid))->f();

        // call function of Base2
        pVoid = (char*)&d + offsetofclass(Base2, Drive);
        ((Base2*)(pVoid))->f();

        // call function of Base3
        pVoid = (char*)&d + offsetofclass(Base3, Drive);
        ((Base3*)(pVoid))->f();

        return 0;
}

The output of this program is:

Base1::f()
Base2::f()
Base3::f()
 
End of Part 1.