A First Look at HPCC

HPCC (High Performance Computing Cluster) is an open-source, massively parallel processing computing platform, used mainly to solve Big Data problems.

The HPCC Systems architecture incorporates the Thor and Roxie clusters together with common middleware components, an external communications layer, client interfaces that provide both end-user services and system management tools, and auxiliary components that support monitoring and the loading and storing of filesystem data from external sources. An HPCC environment can include only Thor clusters, or both Thor and Roxie clusters. Each cluster type is described in more detail below the architecture diagram.

High-Level HPCC Architecture



The diagram above gives a high-level overview of the platform architecture and of how the components work together to manage Big Data. Each component is described briefly below.
Thor (the Data Refinery Cluster) is responsible for consuming vast amounts of data and for transforming, linking and indexing that data. It functions as a distributed file system with parallel processing power spread across the nodes. A cluster can scale from a single node to thousands of nodes.


Single-threaded
Distributed parallel processing
Distributed file system
Powerful parallel processing programming language (ECL)
Optimized for Extraction, Transformation, Loading, Sorting, Indexing and Linking
Scales from 1-1000s of nodes
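
The idea behind the list above (split the data across nodes, let each node transform and sort its own part, then combine the results) can be sketched in a few lines of Python. This is only a conceptual illustration, not how Thor is implemented and not ECL: the function names, the sample records, and the use of threads to stand in for cluster nodes are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from heapq import merge


def partition(records, node_count):
    """Deal records round-robin into one part per (hypothetical) node."""
    parts = [[] for _ in range(node_count)]
    for i, record in enumerate(records):
        parts[i % node_count].append(record)
    return parts


def refine_part(part):
    """Per-node work: clean up the names, then sort the local part."""
    cleaned = [(name.strip().title(), age) for name, age in part]
    return sorted(cleaned)


def refine(records, node_count=3):
    """Run the per-node work in parallel and merge the sorted parts."""
    parts = partition(records, node_count)
    # Threads stand in for cluster nodes in this sketch.
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        sorted_parts = list(pool.map(refine_part, parts))
    # Each part is already sorted, so a k-way merge gives a global order.
    return list(merge(*sorted_parts))


raw = [(" carol ", 41), ("alice", 30), ("BOB ", 25), ("dave", 52)]
result = refine(raw)
print(result)
```

On a real cluster the parts would live on different machines and the platform, not the programmer, would decide how the data is distributed.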
Roxie (the Query Cluster) provides separate high-performance online query processing and data warehouse capabilities.


Multi-threaded
Distributed parallel processing
Distributed file system
Powerful parallel processing programming language (ECL)
Optimized for concurrent query processing
Scales from 1-1000s of nodes
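
To make "optimized for concurrent query processing" more concrete, here is a small Python sketch of the general pattern: build an index once, then answer many read-only lookups at the same time. It is a toy model under stated assumptions, not Roxie's actual design; the index layout, the function names, and the sample data are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def build_index(records):
    """Build a city -> names lookup once, ahead of query time."""
    index = {}
    for name, city in records:
        index.setdefault(city, []).append(name)
    return index


def run_query(index, city):
    """A read-only lookup, safe to run from many threads at once."""
    return sorted(index.get(city, []))


records = [("Alice", "Paris"), ("Bob", "Berlin"), ("Carol", "Paris")]
index = build_index(records)

# Several queries arrive together and are served concurrently.
requests = ["Paris", "Berlin", "Rome"]
with ThreadPoolExecutor(max_workers=4) as pool:
    answers = list(pool.map(lambda city: run_query(index, city), requests))

print(answers)
```

Because the index is never modified after it is built, the lookups need no locking, which is what lets many of them run side by side.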
ECL IDE is a modern integrated development environment used to code, debug and monitor ECL programs.


Access to shared source code repositories
Complete development, debugging and testing environment for developing ECL dataflow programs
Access to the ECLWatch tool is built-in, allowing developers to watch job graphs as they are executing
Access to current and historical job workunits
ESP (Enterprise Services Platform) provides an easy-to-use interface for accessing ECL queries using XML, HTTP, SOAP and REST.


Standards-based interface to access ECL functions
Supports SOAP, XML, HTTP and REST
Supports SAML and various security standards
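
As a rough illustration of what calling a published ECL query over HTTP might look like, the Python sketch below only builds a request URL and makes no network call. The host name, the query name `find_person`, its `lastname` parameter, the port, and the path layout are all assumptions made for this example; check them against the ESP documentation for your own installation before relying on them.

```python
from urllib.parse import urlencode


def build_query_url(host, target, query_name, params, port=8002):
    """Build a REST-style URL for a published query.

    The path layout and default port are assumptions for illustration,
    not a statement of the ESP API.
    """
    query_string = urlencode(params)
    return (
        f"http://{host}:{port}/WsEcl/submit/query/"
        f"{target}/{query_name}/json?{query_string}"
    )


# Hypothetical host, query name and parameter.
url = build_query_url(
    "esp.example.com", "roxie", "find_person", {"lastname": "Smith"}
)
print(url)
```

The resulting URL could then be passed to any HTTP client, and the same query could equally be reached through SOAP, since ESP exposes both styles.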

