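When FlatBuffers was released, its authors also published performance numbers; see Benchmark for the details.
Its test case consists of the following data: "a set of about 10 objects containing an array, 4 strings, and a large variety of int/float scalar values of all sizes, meant to be representative of game data, e.g. a scene format."
I found that test too trivial to be convincing, so I designed my own test case. It focuses on two metrics, CPU time and memory footprint, and uses protobuf as the reference.
The test case: serialize an address book, personal_info_list (a table), which can be thought of as a collection of per-person records (personal_info). Each personal_info (a table) holds a personal id (uint), a name (string), an age (byte), a gender (enum, byte) and a phone number (ulong). I originally wanted to describe personal_info as a struct, but a struct cannot have array or string members, so I had to describe it as a table. The corresponding IDL file is as follows: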

namespace as.tellist;

enum GENDER_TYPE : byte
{
    MALE = 0,
    FEMALE = 1,
    OTHER = 2
}

table personal_info
{
    id : uint;
    name : string;
    age : byte;
    gender : GENDER_TYPE;
    phone_num : ulong;
}

table personal_info_list
{
    info : [personal_info];
}

root_type personal_info_list;

Since protobuf is the performance reference, here is the equivalent protobuf IDL file:

package as.tellist;

enum gender_type
{
    MALE = 0;
    FEMALE = 1;
    OTHER = 2;
}

message personal_info
{
    optional uint32 id = 1;
    optional string name = 2;
    optional uint32 age = 3;
    optional gender_type gender = 4;
    optional uint64 phone_num = 5;
}

message personal_info_list
{
    repeated personal_info info = 1;
}

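Described as a C struct, the corresponding header file is as follows (the program built on it is referred to below as the "binary" program):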

#ifndef __TELLIST_H__
#define __TELLIST_H__

/* Gender values; they mirror GENDER_TYPE in the IDL files. */
enum
{
    GENDER_TYPE_MALE = 0,
    GENDER_TYPE_FEMALE = 1,
    GENDER_TYPE_OTHER = 2,
};

/* Returns the table of enum names, indexed by enum value.
   static inline, so the static table is legal in a C header. */
static inline const char **EnumNamesGENDER_TYPE()
{
    static const char *names[] = { "MALE", "FEMALE", "OTHER" };
    return names;
}

/* Returns the name for e; e must be a GENDER_TYPE_* value (no bounds check). */
static inline const char *EnumNameGENDER_TYPE(int e)
{
    return EnumNamesGENDER_TYPE()[e];
}

typedef struct personal_info_tag
{
    unsigned id;
    unsigned char age;
    char gender;
    unsigned long long phone_num;
    char name[32];            /* fixed-size name buffer */
} personal_info;

typedef struct personal_info_list_tag
{
    int size;                 /* number of entries in info */
    personal_info info[0];    /* zero-length trailing array (GNU extension) */
} personal_info_list;

#endif

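The post does not show the "binary" test program itself. As a rough illustration of what encoding a fixed-layout struct can look like, here is a minimal C++ sketch. It assumes the buffer is simply a 32-bit record count followed by the raw records; PersonRecord, encode_raw and decode_raw are hypothetical names, not taken from the original program, and the resulting size depends on compiler padding.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical fixed-size record mirroring the fields of personal_info.
struct PersonRecord {
    std::uint32_t id;
    std::uint8_t age;
    std::int8_t gender;
    std::uint64_t phone_num;
    char name[32];
};

// Assumed layout: a 32-bit record count, then the records copied byte for byte.
inline std::vector<unsigned char> encode_raw(const std::vector<PersonRecord>& people) {
    const std::uint32_t count = static_cast<std::uint32_t>(people.size());
    std::vector<unsigned char> buf(sizeof(count) + people.size() * sizeof(PersonRecord));
    std::memcpy(buf.data(), &count, sizeof(count));
    if (!people.empty()) {
        std::memcpy(buf.data() + sizeof(count), people.data(),
                    people.size() * sizeof(PersonRecord));
    }
    return buf;
}

// Reverses encode_raw: reads the count, then copies the records back out.
inline std::vector<PersonRecord> decode_raw(const std::vector<unsigned char>& buf) {
    std::uint32_t count = 0;
    assert(buf.size() >= sizeof(count));
    std::memcpy(&count, buf.data(), sizeof(count));
    assert(buf.size() == sizeof(count) + count * sizeof(PersonRecord));
    std::vector<PersonRecord> people(count);
    if (count > 0) {
        std::memcpy(people.data(), buf.data() + sizeof(count),
                    count * sizeof(PersonRecord));
    }
    return people;
}
```

A buffer like this is only valid between programs that share the same struct layout, padding and byte order, and it has no room for schema changes. That is the price paid for the speed of a plain memory copy.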
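For the test, 37 personal_info objects are built in memory and serialized, and this is repeated 1,000,000 times. The result is then deserialized, again 1,000,000 times.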
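The measurement loop described here can be sketched as follows. This is only an assumption about how such a harness might be written in C++; time_loop is a hypothetical helper and the original test program may measure time differently.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>

// Runs `body` `loops` times and returns the elapsed wall-clock time in milliseconds.
// steady_clock is used because it never jumps backwards during a measurement.
inline std::int64_t time_loop(std::int64_t loops, const std::function<void()>& body) {
    assert(loops > 0);
    const auto start = std::chrono::steady_clock::now();
    for (std::int64_t i = 0; i < loops; ++i) {
        body();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

An encode measurement would then wrap one serialization of the 37 objects in the callback and pass 1000000 as the loop count; the decode measurement does the same with one deserialization.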
The results are as follows (note: tellist_pb is the protobuf test program, tellist_fb is the FlatBuffers test program, and tellist is the binary test program):

Test environment: 12-core Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz

free
             total       used       free     shared    buffers     cached
Mem:      66081944   62222028    3859916          0     196448   43690828
-/+ buffers/cache:   18334752   47747192
Swap:       975864     855380     120484

protobuf, three runs:

bin/tellist_pb
encode: loop = 1000000, time diff = 14210ms
decode: loop = 1000000, time diff = 11185ms
buf size:841

bin/tellist_pb
encode: loop = 1000000, time diff = 14100ms
decode: loop = 1000000, time diff = 11234ms
buf size:841

bin/tellist_pb
encode: loop = 1000000, time diff = 14145ms
decode: loop = 1000000, time diff = 11237ms
buf size:841

Serialized size: 841 bytes. Average encode time: 42455ms / 3 = 14151.7ms. Average decode time: 33656ms / 3 = 11218.7ms.

FlatBuffers, three runs:

bin/tellist_fb
encode: loop = 1000000, time diff = 11666ms
decode: loop = 1000000, time diff = 1141ms
buf size:1712

bin/tellist_fb
encode: loop = 1000000, time diff = 11539ms
decode: loop = 1000000, time diff = 1200ms
buf size:1712

bin/tellist_fb
encode: loop = 1000000, time diff = 11737ms
decode: loop = 1000000, time diff = 1141ms
buf size:1712

Serialized size: 1712 bytes. Average encode time: 34942ms / 3 = 11647.3ms. Average decode time: 3482ms / 3 = 1160.7ms.

Binary, three runs:

bin/tellist
encode: loop = 1000000, time diff = 4967ms
decode: loop = 1000000, time diff = 688ms
buf size:304

bin/tellist
encode: loop = 1000000, time diff = 4971ms
decode: loop = 1000000, time diff = 687ms
buf size:304

bin/tellist
encode: loop = 1000000, time diff = 4966ms
decode: loop = 1000000, time diff = 686ms
buf size:304

Serialized size: 304 bytes. Average encode time: 14904ms / 3 = 4968ms. Average decode time: 2061ms / 3 = 687ms.

Test environment: 1-core Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz

free
             total       used       free     shared    buffers     cached
Mem:        753932     356036     397896          0      50484     224848
-/+ buffers/cache:      80704     673228
Swap:      1324028        344    1323684

protobuf, three runs:

./bin/tellist_pb
encode: loop = 1000000, time diff = 12451ms
decode: loop = 1000000, time diff = 9662ms
buf size:841

./bin/tellist_pb
encode: loop = 1000000, time diff = 12545ms
decode: loop = 1000000, time diff = 9840ms
buf size:841

./bin/tellist_pb
encode: loop = 1000000, time diff = 12554ms
decode: loop = 1000000, time diff = 10460ms
buf size:841

Serialized size: 841 bytes. Average encode time: 37550ms / 3 = 12516.7ms. Average decode time: 29962ms / 3 = 9987.3ms.

FlatBuffers, three runs:

bin/tellist_fb
encode: loop = 1000000, time diff = 9640ms
decode: loop = 1000000, time diff = 1164ms
buf size:1712

bin/tellist_fb
encode: loop = 1000000, time diff = 9595ms
decode: loop = 1000000, time diff = 1170ms
buf size:1712

bin/tellist_fb
encode: loop = 1000000, time diff = 9570ms
decode: loop = 1000000, time diff = 1172ms
buf size:1712

Serialized size: 1712 bytes. Average encode time: 28805ms / 3 = 9601.7ms. Average decode time: 3506ms / 3 = 1168.7ms.

Binary, three runs:

bin/tellist
encode: loop = 1000000, time diff = 4194ms
decode: loop = 1000000, time diff = 538ms
buf size:304

bin/tellist
encode: loop = 1000000, time diff = 4387ms
decode: loop = 1000000, time diff = 544ms
buf size:304

bin/tellist
encode: loop = 1000000, time diff = 4181ms
decode: loop = 1000000, time diff = 533ms
buf size:304

Serialized size: 304 bytes. Average encode time: 12762ms / 3 = 4254ms. Average decode time: 1615ms / 3 = 538.3ms.
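
Each average quoted in the results is the arithmetic mean of three runs, so it can be rechecked mechanically from the raw timings. A small C++ sketch follows; mean_ms is a hypothetical helper introduced only for this check.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a list of timings, in milliseconds.
inline double mean_ms(const std::vector<int>& runs) {
    assert(!runs.empty());
    const double total = std::accumulate(runs.begin(), runs.end(), 0.0);
    return total / static_cast<double>(runs.size());
}
```

For the binary encode timings on the second machine, mean_ms({4194, 4387, 4181}) is 12762 / 3 = 4254. For the FlatBuffers encode timings on the same machine, mean_ms({9640, 9595, 9570}) is 28805 / 3, roughly 9601.7.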
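The binary program is the best on both metrics, memory footprint and CPU time. But this article only discusses FlatBuffers and protobuf, so its results are left out of the comparison.
The data show that the FlatBuffers buffer (1712 bytes) is about twice the size of the protobuf one (841 bytes). In serialization, FB is roughly 2500ms to 2900ms faster than PB per million iterations. In deserialization, FB is roughly 8800ms to 10000ms faster, which makes it about 9 times as fast. FB wins on CPU time, while PB wins on memory footprint (and, compared with FB, that compact encoding is also why its computation is slower).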
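One likely contributor to this trade-off is how integers are stored. protobuf writes uint32 and uint64 fields as base-128 varints, which take fewer bytes for small values but must be assembled byte by byte, whereas FlatBuffers stores scalars at their fixed width. The following C++ sketch shows the varint scheme; encode_varint and decode_varint are illustrative names, not protobuf's own API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Base-128 varint: 7 payload bits per byte, least significant group first.
// The high bit is set on every byte except the last one.
inline std::vector<std::uint8_t> encode_varint(std::uint64_t value) {
    std::vector<std::uint8_t> out;
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
    return out;
}

// Reads one varint from the start of `bytes` and returns its value.
inline std::uint64_t decode_varint(const std::vector<std::uint8_t>& bytes) {
    std::uint64_t value = 0;
    int shift = 0;
    for (std::uint8_t b : bytes) {
        assert(shift < 64);
        value |= static_cast<std::uint64_t>(b & 0x7F) << shift;
        if ((b & 0x80) == 0) {
            break;  // last byte of this varint
        }
        shift += 7;
    }
    return value;
}
```

With this scheme an age below 128 fits in one byte, and an 11-digit phone number such as 13800000000 fits in five bytes instead of the eight a fixed-width ulong needs. The per-byte loop and branch are the cost of that saving.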
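The tests above were run on my company's Linux server and on my own Mac Pro, respectively. Mobile developers are encouraged to run the test on a phone as well; they should see similar results. Google's claim that FB suits game development is reasonable, and if computation time matters I think it is also suitable for backend development.
In addition, FB makes heavy use of C++11 syntax, and the code interface it generates from the IDL is not as friendly as protobuf's. On the other hand, compared with the pile of header files and the 18MB lib that protobuf requires, FlatBuffers needs only a single header, "flatbuffers/flatbuffers.h".
The test program has been uploaded to Baidu Netdisk; click this link to download it. Comments and criticism are welcome.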