Atlas Source Code Walkthrough (1): The JanusGraph Graph Database

Original

king_eagle2015

2019-09-04 17:19

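Apache Atlas is a data governance and metadata framework for Hadoop. It provides efficient data search and a classification system, and it supports data auditing and data lineage. In my view, as big data grows, data structures and types will become more and more complex, so metadata governance and the ability to establish relationships between data will be an important prerequisite for data mining and for artificial intelligence.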

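Atlas collects metadata from big data components and sends it through its integration component, Kafka, to the underlying data-processing component, JanusGraph. As the core of data processing, JanusGraph supports data storage, relationship building, data lineage, and data queries.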

【JanusGraph Basics】

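Graph databases suit workloads that analyze complex relationships in large data sets. A relational database processes relationships through joins, which lowers processing efficiency; a graph database stores data as a directed graph and retrieves it through indexes, which raises processing efficiency. JanusGraph is one of the few graph databases that support distributed deployment, which fits the direction big data is heading, and that may be exactly why Atlas chose JanusGraph for data governance.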
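To make the directed-graph idea concrete, here is a minimal plain-Java sketch. It is not JanusGraph or Atlas code: the class `LineageGraphSketch`, its methods, and the table names are all hypothetical. It stores outgoing edges per vertex and answers a lineage-style question ("what is downstream of this data set?") by walking edges, rather than by joining a table to itself once per hop.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: a tiny in-memory directed graph.
class LineageGraphSketch {
    // vertex name -> vertices reachable over one outgoing edge
    private final Map<String, List<String>> outEdges = new LinkedHashMap<>();

    void addEdge(String from, String to) {
        outEdges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        outEdges.computeIfAbsent(to, k -> new ArrayList<>());
    }

    // Breadth-first walk: every vertex downstream of 'start', in visit order.
    Set<String> downstream(String start) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String next : outEdges.getOrDefault(current, Collections.emptyList())) {
                // Skip the start vertex itself and anything already visited (handles cycles).
                if (!next.equals(start) && seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        LineageGraphSketch g = new LineageGraphSketch();
        g.addEdge("raw_orders", "clean_orders");
        g.addEdge("clean_orders", "daily_report");
        g.addEdge("clean_orders", "ml_features");
        System.out.println(g.downstream("raw_orders"));
    }
}
```

A real graph database adds persistence, indexes, and a query language on top, but the traversal has the same shape: follow edges from a starting vertex instead of recomputing a join for each level.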

For the basics of using graph databases, see 《圖數據庫 JanusGraph 實戰》 by 胡佳輝.

Visualization tool used with JanusGraph: gephi

Setup: https://www.codercto.com/a/50031.html

Usage: https://www.jianshu.com/p/dbb63cdced90

Official site: https://gephi.org/tutorials/gephi-tutorial-quick_start.pdf

【How Atlas Wraps JanusGraph】

All graph-database operations live in the atlas-graphdb subproject. Let's look at everything in that subproject:

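api defines the interfaces for the basic elements that make up a graph database: property key, edge, edge direction, edge label definition, element property definition, graph, index, graph management (including transaction handling, property-key management, and index definition), graph query, traversal, index query, vertex, and vertex query. common defines the query semantics, for example and, has, in, or, and orderby. janus is the concrete implementation of the element definitions, graph queries, and data migration.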

A graph database is set up as follows:

1. In the configuration file, set where the graph data and the index are stored:

atlas.graphdb.backend=org.apache.atlas.repository.graphdb.janus.AtlasJanusGraphDatabase
# Graph Storage
atlas.graph.storage.backend=hbase
atlas.graph.storage.port=2181
atlas.graph.storage.hbase.table=atlas-test
atlas.graph.storage.hostname=docker2,docker3,docker4

# Graph Search Index Backend
atlas.graph.index.search.backend=elasticsearch
atlas.graph.index.search.hostname=127.0.0.1
atlas.graph.index.search.index-name=atlas_test
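The sketch below shows one way such a file can be reduced to the settings meant for the graph layer. It assumes that the `atlas.graph.` prefix is stripped before the settings are handed to JanusGraph, so that `atlas.graph.storage.backend` becomes `storage.backend`. It is plain Java, not Atlas code; `GraphConfigSketch` and `subset` are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not Atlas code: prefix-stripping over a properties file.
class GraphConfigSketch {
    // Returns the entries whose key starts with "<prefix>.", with that prefix removed.
    static Map<String, String> subset(Properties all, String prefix) {
        String dotted = prefix + ".";
        Map<String, String> result = new TreeMap<>();
        for (String key : all.stringPropertyNames()) {
            if (key.startsWith(dotted)) {
                result.put(key.substring(dotted.length()), all.getProperty(key));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        String text = String.join("\n",
                "atlas.graphdb.backend=org.apache.atlas.repository.graphdb.janus.AtlasJanusGraphDatabase",
                "atlas.graph.storage.backend=hbase",
                "atlas.graph.storage.hbase.table=atlas-test",
                "atlas.graph.index.search.backend=elasticsearch");
        Properties all = new Properties();
        all.load(new StringReader(text));
        // atlas.graphdb.backend is not selected: it does not start with "atlas.graph."
        subset(all, "atlas.graph").forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Note that `atlas.graphdb.backend` is a different kind of setting: it names the Atlas class that implements the graph database, while the `atlas.graph.*` keys describe storage and indexing.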

2. Initialize the graph database and the database manager:

    public static JanusGraph getGraphInstance() {
        if (graphInstance == null) {
            synchronized (AtlasJanusGraphDatabase.class) {
                if (graphInstance == null) {
                    Configuration config;
                    try {
                        config = getConfiguration();
                    } catch (AtlasException e) {
                        throw new RuntimeException(e);
                    }

                    try {
                        graphInstance = JanusGraphFactory.open(config);
                    } catch (JanusGraphException e) {
                        LOG.warn("JanusGraphException: {}", e.getMessage());
                        if (e.getMessage().startsWith(OLDER_STORAGE_EXCEPTION)) {
                            LOG.info("Newer client is being used with older janus storage version. Setting allow-upgrade=true and reattempting connection");
                            config.addProperty("graph.allow-upgrade", true);
                            graphInstance = JanusGraphFactory.open(config);
                        }
                        else {
                            throw new RuntimeException(e);
                        }
                    }
                    atlasGraphInstance = new AtlasJanusGraph();
                    validateIndexBackend(config);
                }
            }
        }
        return graphInstance;
    }
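The method above combines two ideas: lazy initialization with double-checked locking, and a single retry with an extra flag when the first open attempt fails with a recognizable message. The following plain-Java sketch isolates that pattern. It is not Atlas code: `LazyGraphHolder`, the `opener` function, and the message prefix are hypothetical stand-ins, and the `volatile` field is an assumption about how the shared instance should be declared for double-checked locking to be safe.

```java
import java.util.function.Function;

// Hypothetical sketch: lazy holder with double-checked locking and one retry.
class LazyGraphHolder<T> {
    static final String OLDER_STORAGE_PREFIX = "older storage"; // made-up message prefix

    private final Function<Boolean, T> opener; // argument: allow-upgrade flag
    private volatile T instance;               // volatile makes the unlocked read safe
    int openAttempts = 0;

    LazyGraphHolder(Function<Boolean, T> opener) {
        this.opener = opener;
    }

    T get() {
        T local = instance;
        if (local == null) {                   // first check, without the lock
            synchronized (this) {
                local = instance;
                if (local == null) {           // second check, under the lock
                    local = open();
                    instance = local;
                }
            }
        }
        return local;
    }

    private T open() {
        try {
            openAttempts++;
            return opener.apply(false);
        } catch (RuntimeException e) {
            String msg = e.getMessage();
            // Retry only for the recognized failure; the null check avoids an NPE.
            if (msg != null && msg.startsWith(OLDER_STORAGE_PREFIX)) {
                openAttempts++;
                return opener.apply(true);
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        LazyGraphHolder<String> holder = new LazyGraphHolder<>(allowUpgrade -> {
            if (!allowUpgrade) {
                throw new IllegalStateException(OLDER_STORAGE_PREFIX + " version detected");
            }
            return "graph(allow-upgrade=true)";
        });
        System.out.println(holder.get());
        System.out.println(holder.openAttempts);
    }
}
```

One difference from the original method is worth noting: the sketch guards against an exception whose message is null before calling `startsWith`.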

3. Define the schema according to business needs: initialize vertex labels, edge labels, property keys, indexes, and so on:

        AtlasJanusGraphDatabase db = new AtlasJanusGraphDatabase();
        AtlasGraphManagement mgmt = db.getGraph().getManagementSystem();

        if (mgmt.getGraphIndex(BACKING_INDEX_NAME) == null) {
            // Define the vertex mixed index up front
            mgmt.createVertexMixedIndex(BACKING_INDEX_NAME, Constants.BACKING_INDEX, Collections.emptyList());
        }
        // Define the property key age13
        mgmt.makePropertyKey("age13", Integer.class, AtlasCardinality.SINGLE);
        // Define property keys together with their indexes
        createIndices(mgmt, "name", String.class, false, AtlasCardinality.SINGLE);
        createIndices(mgmt, WEIGHT_PROPERTY, Integer.class, false, AtlasCardinality.SINGLE);
        createIndices(mgmt, "size15", String.class, false, AtlasCardinality.SINGLE);
        createIndices(mgmt, "typeName", String.class, false, AtlasCardinality.SINGLE);
        createIndices(mgmt, "__type", String.class, false, AtlasCardinality.SINGLE);
        createIndices(mgmt, Constants.GUID_PROPERTY_KEY, String.class, true, AtlasCardinality.SINGLE);
        createIndices(mgmt, Constants.TRAIT_NAMES_PROPERTY_KEY, String.class, false, AtlasCardinality.SET);
        createIndices(mgmt, Constants.SUPER_TYPES_PROPERTY_KEY, String.class, false, AtlasCardinality.SET);
        mgmt.commit();
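What a property key's declared type and cardinality mean in practice can be sketched without a database. The toy store below is plain Java, not the Atlas or JanusGraph API, and `PropertySchemaSketch` with its methods is a hypothetical name. It assumes the usual semantics: SINGLE keeps at most one value per key, SET keeps distinct values, LIST keeps duplicates, and a value of the wrong type is rejected.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;

// Hypothetical toy model of typed property keys with a cardinality.
class PropertySchemaSketch {
    enum Cardinality { SINGLE, LIST, SET }

    private static final class KeyDef {
        final Class<?> type;
        final Cardinality cardinality;

        KeyDef(Class<?> type, Cardinality cardinality) {
            this.type = type;
            this.cardinality = cardinality;
        }
    }

    private final Map<String, KeyDef> keys = new HashMap<>();
    private final Map<String, Collection<Object>> values = new HashMap<>();

    void makePropertyKey(String name, Class<?> type, Cardinality cardinality) {
        if (keys.putIfAbsent(name, new KeyDef(type, cardinality)) != null) {
            throw new IllegalArgumentException("property key already defined: " + name);
        }
    }

    void addProperty(String name, Object value) {
        KeyDef def = keys.get(name);
        if (def == null) {
            throw new IllegalArgumentException("undefined property key: " + name);
        }
        if (!def.type.isInstance(value)) {
            throw new IllegalArgumentException(name + " expects " + def.type.getSimpleName());
        }
        Collection<Object> current = values.computeIfAbsent(name, k -> newContainer(def.cardinality));
        if (def.cardinality == Cardinality.SINGLE) {
            current.clear(); // SINGLE: the new value replaces the old one
        }
        current.add(value);  // SET: duplicates are dropped by the LinkedHashSet
    }

    Collection<Object> get(String name) {
        return values.getOrDefault(name, Collections.emptyList());
    }

    private static Collection<Object> newContainer(Cardinality c) {
        if (c == Cardinality.LIST) {
            return new ArrayList<>();
        }
        return new LinkedHashSet<>();
    }

    public static void main(String[] args) {
        PropertySchemaSketch schema = new PropertySchemaSketch();
        schema.makePropertyKey("name", String.class, Cardinality.SINGLE);
        schema.makePropertyKey("traitNames", String.class, Cardinality.SET);
        schema.addProperty("name", "Fred");
        schema.addProperty("name", "George");
        schema.addProperty("traitNames", "PII");
        schema.addProperty("traitNames", "PII");
        System.out.println(schema.get("name") + " " + schema.get("traitNames"));
    }
}
```

This matches the choices in the schema above, where single-valued keys such as the GUID use SINGLE while the trait-names and super-types keys use SET.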

4. Create vertices, properties, and so on:

        AtlasGraph<V, E> graph = getGraph();
        AtlasVertex<V, E> v1 = createVertex(graph);

        v1.setProperty("name", "Fred");
        v1.setProperty("size15", "15");

        // The createVertex method:
        // Create the vertex through JanusGraph
        Vertex result = getGraph().addVertex();
        // Wrap the vertex together with its AtlasGraph
        return GraphDbObjectFactory.createVertex(this, result);
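The wrapping step in createVertex is an instance of a common pattern: callers program against a backend-neutral interface, while the implementation holds the backend's native object and delegates to it. Here is a minimal plain-Java sketch of that pattern. Every type name in it (`SketchVertex`, `NativeVertex`, `SketchGraph`, `WrappedVertex`) is hypothetical and only stands in for the roles of the api and janus modules.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the wrapper pattern; none of these are Atlas types.
class VertexWrapperSketch {
    // Backend-neutral interface, playing the role of the api module.
    interface SketchVertex {
        void setProperty(String key, Object value);
        Object getProperty(String key);
    }

    // Stand-in for the backend's own vertex type.
    static class NativeVertex {
        final Map<String, Object> props = new HashMap<>();
    }

    // Backend-specific graph: creates native vertices and returns wrapped ones.
    static class SketchGraph {
        int vertexCount = 0;

        SketchVertex addVertex() {
            NativeVertex nativeVertex = new NativeVertex(); // the backend call
            vertexCount++;
            return new WrappedVertex(this, nativeVertex);   // wrap vertex and owning graph
        }
    }

    // Wrapper: implements the neutral interface by delegating to the native vertex.
    static class WrappedVertex implements SketchVertex {
        private final SketchGraph graph;
        private final NativeVertex delegate;

        WrappedVertex(SketchGraph graph, NativeVertex delegate) {
            this.graph = graph;
            this.delegate = delegate;
        }

        SketchGraph getGraph() {
            return graph;
        }

        @Override
        public void setProperty(String key, Object value) {
            delegate.props.put(key, value);
        }

        @Override
        public Object getProperty(String key) {
            return delegate.props.get(key);
        }
    }

    public static void main(String[] args) {
        SketchGraph graph = new SketchGraph();
        SketchVertex v1 = graph.addVertex();
        v1.setProperty("name", "Fred");
        v1.setProperty("size15", "15");
        System.out.println(v1.getProperty("name"));
    }
}
```

The benefit of this design is that code written against the neutral interface does not need to change if the storage backend is swapped for another implementation.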

【Summary】

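Graph-database programming starts from vertices, edges, and properties. First define the schema, which is similar to the constraints a relational database places on tables and columns: an edge label's multiplicity constrains how many edges may exist between a pair of vertices, a property key's type constrains the format of the stored data, and cardinality constrains how property values are stored. Data is stored in a distributed database such as HBase or Cassandra, with indexes built for the graph database. Queries use Gremlin to traverse relationships.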
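The edge-label constraint mentioned above can be illustrated with a toy model. The sketch is plain Java with hypothetical names (`EdgeMultiplicitySketch`, `makeEdgeLabel`, `addEdge`) and covers only two multiplicities: MULTI, which allows any number of edges with the same label between a pair of vertices, and SIMPLE, which allows at most one. For simplicity it returns false on a violation, which is a choice made for this sketch; a real database would more likely raise an error.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of edge-label multiplicity.
class EdgeMultiplicitySketch {
    enum Multiplicity { MULTI, SIMPLE }

    private final Map<String, Multiplicity> labels = new HashMap<>();
    private final List<String[]> edges = new ArrayList<>(); // {from, label, to}

    void makeEdgeLabel(String label, Multiplicity multiplicity) {
        labels.put(label, multiplicity);
    }

    // Returns true if the edge was added, false if the SIMPLE constraint rejected it.
    boolean addEdge(String from, String label, String to) {
        Multiplicity multiplicity = labels.get(label);
        if (multiplicity == null) {
            throw new IllegalArgumentException("undefined edge label: " + label);
        }
        if (multiplicity == Multiplicity.SIMPLE) {
            for (String[] e : edges) {
                if (e[0].equals(from) && e[1].equals(label) && e[2].equals(to)) {
                    return false; // same label already links this ordered pair
                }
            }
        }
        edges.add(new String[] {from, label, to});
        return true;
    }

    int edgeCount() {
        return edges.size();
    }

    public static void main(String[] args) {
        EdgeMultiplicitySketch g = new EdgeMultiplicitySketch();
        g.makeEdgeLabel("derivedFrom", Multiplicity.SIMPLE);
        g.makeEdgeLabel("accessedBy", Multiplicity.MULTI);
        System.out.println(g.addEdge("report", "derivedFrom", "orders"));
        System.out.println(g.addEdge("report", "derivedFrom", "orders"));
        System.out.println(g.addEdge("orders", "accessedBy", "job1"));
        System.out.println(g.addEdge("orders", "accessedBy", "job1"));
    }
}
```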
