gloox Code Analysis 2 - The XML Parser Module

Original

2020-02-24 02:06

gloox implements its own XML parsing module and does not use a third-party library (TinyXML, expat).
The main files involved:
tag.h (tag.cpp)
taghandler.h
parser.h (parser.cpp)

1. Tag: a Tag is one XML element
For example:
a.
<book kind='computer'>
<store id='23'/>
<author>
    qiang
</author>
</book>
b. <book id='32'/>
c. <book>name1</book>

First, one concept: the escape-string. What is an escape-string?
In an escape-string:
'&' is converted to '&amp;', '<' to '&lt;', and '>' to '&gt;'.
The encoding table:
--------------------------------------------------------------------
| Character       | Decimal | Hexadecimal | HTML entity | Unicode |
--------------------------------------------------------------------
| " double quote  | &#34;   | &#x22;      | &quot;      | \u0022  |
--------------------------------------------------------------------
| ' single quote  | &#39;   | &#x27;      | &apos;      | \u0027  |
--------------------------------------------------------------------
| & ampersand     | &#38;   | &#x26;      | &amp;       | \u0026  |
--------------------------------------------------------------------
| < less than     | &#60;   | &#x3C;      | &lt;        | \u003c  |
--------------------------------------------------------------------
| > greater than  | &#62;   | &#x3E;      | &gt;        | \u003e  |
--------------------------------------------------------------------
gloox APIs:
Tag::escape() - converts string -> escape-string
Tag::relax() - converts escape-string -> string
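
As a rough illustration of what an escape/relax pair does, here is a minimal standalone sketch. The names escapeXml and unescapeXml are hypothetical helpers invented for this example; they are not gloox functions and are not meant to mirror gloox's implementation. The sketch covers only the five named entities from the table and leaves numeric references such as &#38; untouched.

```cpp
#include <string>

// Hypothetical helper (not part of gloox): replace the five XML special
// characters with their predefined named entities.
std::string escapeXml(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += c;
        }
    }
    return out;
}

// Hypothetical inverse (not part of gloox): turn the five named entities
// back into characters. Anything else, including numeric references,
// is copied through unchanged.
std::string unescapeXml(const std::string& in) {
    static const struct { const char* entity; char ch; } table[] = {
        {"&amp;", '&'}, {"&lt;", '<'}, {"&gt;", '>'},
        {"&quot;", '"'}, {"&apos;", '\''}
    };
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        bool matched = false;
        if (in[i] == '&') {
            for (const auto& e : table) {
                const std::string ent(e.entity);
                if (in.compare(i, ent.size(), ent) == 0) {
                    out += e.ch;       // emit the decoded character
                    i += ent.size();   // skip past the entity
                    matched = true;
                    break;
                }
            }
        }
        if (!matched) out += in[i++];
    }
    return out;
}
```

With these helpers, escapeXml("a<b") gives "a&lt;b", and unescapeXml applied to that result gives back "a<b". Each input position is decoded once, so "&amp;lt;" becomes "&lt;" rather than "<".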

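Main member variables:
attributes - list of all attributes
name - the node name
cdata - the node's character data, e.g. the cdata in <name>cdata</name>
children - all child nodes
parent - pointer to the parent node, null if there is none
bool incoming - indicates whether the string passed in when the XML node is constructed is an escape-string; if it is, the constructor must call relax to convert the escape-string back to a plain string.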

Main methods:
Mostly methods that add, remove, or modify name/children/attributes/cdata.
xml() returns the complete XML text of the node.
findTag and findTagList provide XPath support.
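
To make the member list above concrete, here is a minimal sketch of a tree node with the same kinds of fields and an xml() serializer. MiniTag is a hypothetical stand-in written for this article, not gloox::Tag; its ownership model (std::unique_ptr children) and the omission of escaping are simplifying assumptions.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for gloox::Tag, for illustration only.
struct MiniTag {
    std::string name;                                             // node name
    std::string cdata;                                            // character data
    std::vector<std::pair<std::string, std::string>> attributes;  // in insertion order
    std::vector<std::unique_ptr<MiniTag>> children;               // owned child nodes
    MiniTag* parent = nullptr;                                    // null for the root

    explicit MiniTag(std::string n, std::string text = "")
        : name(std::move(n)), cdata(std::move(text)) {}

    void addAttribute(const std::string& key, const std::string& value) {
        attributes.emplace_back(key, value);
    }

    // Takes ownership of the child and records this node as its parent.
    MiniTag* addChild(std::unique_ptr<MiniTag> child) {
        child->parent = this;
        children.push_back(std::move(child));
        return children.back().get();
    }

    // Serialize this node and its subtree. Escaping is omitted for brevity.
    std::string xml() const {
        std::string out = "<" + name;
        for (const auto& a : attributes)
            out += " " + a.first + "='" + a.second + "'";
        if (cdata.empty() && children.empty())
            return out + "/>";          // empty element, e.g. <store id='23'/>
        out += ">";
        for (const auto& c : children)
            out += c->xml();
        out += cdata;
        return out + "</" + name + ">";
    }
};
```

Building a "book" node with a kind attribute, a "store" child with id 23, and an "author" child holding "qiang" makes xml() return the same single-line document used throughout this article.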

For example, the program below prints:
<book kind='computer'><store id='23'/><author>qiang</author></book>

#include <iostream>

#include "tag.h"

#pragma comment( lib, "gloox.lib" )  // MSVC only: link against gloox

using namespace gloox;

// Builds the following document:
// <book kind='computer'>
//   <store id='23'/>
//   <author>
//     qiang
//   </author>
// </book>

int main( int argc, char* argv[] ) {
    // Root element with one attribute
    Tag* tag_book = new Tag( "book" );
    tag_book->addAttribute( "kind", "computer" );

    // Empty child element with one attribute
    Tag* tag_store = new Tag( "store" );
    tag_store->addAttribute( "id", "23" );

    // Child element with character data
    Tag* tag_author = new Tag( "author", "qiang" );

    tag_book->addChild( tag_store );
    tag_book->addChild( tag_author );

    std::cout << tag_book->xml() << std::endl;

    // Deleting the root also deletes the children added to it
    delete tag_book;
    return 0;
}

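2. TagHandler: an interface that receives tags once the parser has finished parsing them. Inherit from this class to receive the parser's tag events.
It has a single method:
virtual void handleTag( Tag *tag ) = 0 - receives a parsed tag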

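3. Parser: an XML parser
Its interface is very small: it is constructed with a TagHandler, which receives and processes the parsed tags, and apart from that it has only a feed method for supplying data.
Note that the data passed to feed must be well-formed XML, otherwise it cannot be parsed. In other words, the parser does not check the XML format for you.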

For example:
In the example below, feeding the data in fragments has the same effect as feeding it all at once, so scenario1 and scenario2 behave identically. This also matches how the upper-layer application handles a TCP stream: whatever XML arrives from the server, complete or not, can simply be passed to feed. The handleTag method receives two tag-parsed events, one from scenario1 and one from scenario2, and the screen shows:
<book kind='computer'><store id='23'/><author>qiang</author></book>
<book kind='computer'><store id='23'/><author>qiang</author></book>

#include <iostream>
#include <string>

#include "tag.h"
#include "taghandler.h"
#include "parser.h"

#pragma comment( lib, "gloox.lib" )  // MSVC only: link against gloox

using namespace gloox;

// Parses the following document twice:
// <book kind='computer'>
//   <store id='23'/>
//   <author>
//     qiang
//   </author>
// </book>

class TagHandlerImpl : public TagHandler {
public:
    ~TagHandlerImpl() {}

    void run() {
        Parser* parser = new Parser( this );

        // scenario1: feed the whole document in one call
        std::string data = "<book kind='computer'><store id='23'/><author>qiang</author></book>";
        parser->feed( data );

        // scenario2: feed the same document in three fragments
        std::string data1 = "<book kind='computer";
        std::string data2 = "'><store id='23'/><auth";
        std::string data3 = "or>qiang</author></book>";
        parser->feed( data1 );
        parser->feed( data2 );
        parser->feed( data3 );

        delete parser;
    }

    // Called by the parser each time a complete tag has been parsed
    void handleTag( Tag *tag ) {
        std::cout << tag->xml() << std::endl;
    }
};

int main( int argc, char* argv[] ) {
    TagHandlerImpl* taghandlerImpl = new TagHandlerImpl();
    taghandlerImpl->run();
    delete taghandlerImpl;
    return 0;
}
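
To see why fragmented and one-shot feeding can give the same result, consider a minimal sketch of a stateful stream splitter. ElementSplitter is a hypothetical class written for this article and is not how gloox's Parser is implemented: it only tracks nesting depth and quoting so that it can report each complete top-level element, it builds no Tag tree, and it assumes well-formed input without comments, CDATA sections, or processing instructions. The point is that all parsing state lives in member variables, so a fragment boundary can fall anywhere.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical illustration, not gloox's Parser: splits a character stream
// into complete top-level XML elements, however the stream is chunked.
class ElementSplitter {
public:
    using Callback = std::function<void(const std::string&)>;

    explicit ElementSplitter(Callback onElement)
        : m_onElement(std::move(onElement)) {}

    // Consume one fragment; state is kept in members between calls.
    void feed(const std::string& chunk) {
        for (char c : chunk) step(c);
    }

private:
    void step(char c) {
        if (!m_inTag) {
            if (c == '<') {            // a new tag starts
                m_inTag = true;
                m_tag.clear();
                m_buf += c;
            } else if (m_depth > 0) {  // character data inside an element
                m_buf += c;
            }                          // text outside any element is dropped
            return;
        }
        m_buf += c;
        if (m_quote != 0) {            // inside a quoted attribute value
            if (c == m_quote) m_quote = 0;
            m_tag += c;
        } else if (c == '\'' || c == '"') {
            m_quote = c;
            m_tag += c;
        } else if (c == '>') {         // the tag is complete
            m_inTag = false;
            const bool closing = !m_tag.empty() && m_tag[0] == '/';
            const bool selfClosing = !m_tag.empty() && m_tag.back() == '/';
            if (closing) --m_depth;
            else if (!selfClosing) ++m_depth;
            if (m_depth == 0) {        // back at top level: element finished
                m_onElement(m_buf);
                m_buf.clear();
            }
        } else {
            m_tag += c;
        }
    }

    Callback m_onElement;
    std::string m_buf;     // text of the element being assembled
    std::string m_tag;     // contents of the tag currently being read
    int m_depth = 0;       // nesting depth of open elements
    bool m_inTag = false;  // true between '<' and the matching '>'
    char m_quote = 0;      // active quote character, or 0 if none
};
```

Feeding the book document in one call or in the three fragments from scenario2 invokes the callback once per document with identical text, because a split inside an attribute value or a tag name only pauses the state machine until the next feed call.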
