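Kylin Cube Build: Interface Overview
- Every Cube must be configured with a data source, a compute engine, and a storage engine. Factory classes create the data source, compute engine, and storage engine objects.
- The three are connected to one another through adapters.
Data source interface (ISource)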
public interface ISource extends Closeable {
    // Returns an explorer that syncs table metadata from the data source
    ISourceMetadataExplorer getSourceMetadataExplorer();
    // Adapts this source to the specified build engine interface
    <I> I adaptToBuildEngine(Class<I> engineInterface);
    // Returns a table that can be read sequentially
    IReadableTable createReadableTable(TableDesc tableDesc);
    // Enriches the source partition before a build starts
    SourcePartition enrichSourcePartitionBeforeBuild(IBuildable buildable, SourcePartition srcPartition);
}
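Storage engine interface (IStorage)
public interface IStorage {
    // Creates an object that queries the specified Cube
    public IStorageQuery createQuery(IRealization realization);
    // Adapts this storage to the specified build engine interface
    public <I> I adaptToBuildEngine(Class<I> engineInterface);
}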
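Compute engine interface (IBatchCubingEngine)
public interface IBatchCubingEngine {
    // Returns the joined flat table description for the new segment
    public IJoinedFlatTableDesc getJoinedFlatTableDesc(CubeSegment newSegment);
    // Returns a workflow plan that builds the specified CubeSegment
    public DefaultChainedExecutable createBatchCubingJob(CubeSegment newSegment, String submitter);
    // Returns a workflow plan that merges the specified CubeSegment
    public DefaultChainedExecutable createBatchMergeJob(CubeSegment mergeSegment, String submitter);
    // Returns a workflow plan that optimizes the specified CubeSegment
    public DefaultChainedExecutable createBatchOptimizeJob(CubeSegment optimizeSegment, String submitter);
    // The interface this engine expects the data source to adapt to
    public Class<?> getSourceInterface();
    // The interface this engine expects the storage to adapt to
    public Class<?> getStorageInterface();
}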
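How the three interfaces fit together can be sketched in a few lines. The following is a minimal, self-contained illustration and not Kylin's actual code: SketchSource, SketchStorage, SketchEngine, SketchInput, and SketchOutput are hypothetical stand-ins for ISource, IStorage, IBatchCubingEngine, and the engine-specific input and output interfaces, and the description strings are invented for the demo.

```java
// Hypothetical stand-ins for the engine-specific input/output interfaces.
interface SketchInput { String describe(); }
interface SketchOutput { String describe(); }

// Hypothetical stand-in for ISource: adapts itself to the interface the engine asks for.
interface SketchSource {
    <I> I adaptToBuildEngine(Class<I> engineInterface);
}

// Hypothetical stand-in for IStorage.
interface SketchStorage {
    <I> I adaptToBuildEngine(Class<I> engineInterface);
}

// Hypothetical stand-in for IBatchCubingEngine: declares which adapter interfaces it needs.
interface SketchEngine {
    Class<?> getSourceInterface();
    Class<?> getStorageInterface();
}

class HiveLikeSource implements SketchSource {
    @Override
    public <I> I adaptToBuildEngine(Class<I> engineInterface) {
        if (engineInterface == SketchInput.class) {
            SketchInput input = () -> "hive-like input";
            // Class.cast keeps the adaptation type-safe without an unchecked cast.
            return engineInterface.cast(input);
        }
        throw new RuntimeException("Cannot adapt to " + engineInterface);
    }
}

class HBaseLikeStorage implements SketchStorage {
    @Override
    public <I> I adaptToBuildEngine(Class<I> engineInterface) {
        if (engineInterface == SketchOutput.class) {
            SketchOutput output = () -> "hbase-like output";
            return engineInterface.cast(output);
        }
        throw new RuntimeException("Cannot adapt to " + engineInterface);
    }
}

class MRLikeEngine implements SketchEngine {
    @Override public Class<?> getSourceInterface() { return SketchInput.class; }
    @Override public Class<?> getStorageInterface() { return SketchOutput.class; }
}

class WiringSketch {
    // Ask each side to adapt to the interface the engine declares, then describe the pairing.
    static String wire(SketchEngine engine, SketchSource source, SketchStorage storage) {
        Object in = source.adaptToBuildEngine(engine.getSourceInterface());
        Object out = storage.adaptToBuildEngine(engine.getStorageInterface());
        return ((SketchInput) in).describe() + " -> " + ((SketchOutput) out).describe();
    }
}
```

Calling WiringSketch.wire(new MRLikeEngine(), new HiveLikeSource(), new HBaseLikeStorage()) pairs the two adapters; a source asked for an interface it does not support fails fast with a RuntimeException, mirroring the "Cannot adapt to" behavior of the adapters in this article.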
Offline Cube Build call chain
- The REST API request /{cubeName}/rebuild calls CubeController.rebuild(), which in turn calls jobService.submitJob().
- Project-level permission check: aclEvaluate.checkProjectOperationPermission(cube);
Resolve the data source with ISource source = SourceManager.getSource(cube). The concrete ISource type is determined by the return value of CubeInstance.getSourceType():
public int getSourceType() {
    return getModel().getRootFactTable().getTableDesc().getSourceType();
}
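The dispatch on getSourceType() amounts to a lookup from an integer ID to a source implementation. A minimal sketch of that idea follows. SourceRegistrySketch and NamedSource are hypothetical names, and the IDs used in the usage note are chosen only for illustration; the real lookup inside SourceManager may differ.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for ISource; it only carries a name for the demo.
interface NamedSource { String name(); }

// Hypothetical registry mapping a source type ID to a factory for that source.
class SourceRegistrySketch {
    private final Map<Integer, Supplier<NamedSource>> providers = new HashMap<>();

    void register(int sourceType, Supplier<NamedSource> provider) {
        providers.put(sourceType, provider);
    }

    // Look up the provider for the given type and create the source, or fail clearly.
    NamedSource getSource(int sourceType) {
        Supplier<NamedSource> provider = providers.get(sourceType);
        if (provider == null) {
            throw new IllegalArgumentException("No source registered for type " + sourceType);
        }
        return provider.get();
    }
}
```

With registry.register(0, () -> () -> "hive-like"), a call to registry.getSource(0).name() yields "hive-like", while an unregistered ID raises an IllegalArgumentException instead of returning null.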
Allocate a new segment:
newSeg = getCubeManager().appendSegment(cube, src);
EngineFactory creates the IBatchCubingEngine object that matches the engine type defined on the Cube, then calls createBatchCubingJob() to create the job chain. MRBatchCubingEngine2 does this by instantiating BatchCubingJobBuilder2:
public BatchCubingJobBuilder2(CubeSegment newSegment, String submitter) {
    super(newSegment, submitter);
    this.inputSide = MRUtil.getBatchCubingInputSide(seg);
    this.outputSide = MRUtil.getBatchCubingOutputSide2(seg);
}
Adapt the input data source to the build engine:
SourceManager.createEngineAdapter(seg, IMRInput.class).getBatchCubingInputSide(flatDesc);

public static <T> T createEngineAdapter(ISourceAware table, Class<T> engineInterface) {
    return getSource(table).adaptToBuildEngine(engineInterface);
}

// HiveSource returns a HiveMRInput
public <I> I adaptToBuildEngine(Class<I> engineInterface) {
    if (engineInterface == IMRInput.class) {
        return (I) new HiveMRInput();
    } else {
        throw new RuntimeException("Cannot adapt to " + engineInterface);
    }
}
Adapt the storage engine to the build engine:
StorageFactory.createEngineAdapter(seg, IMROutput2.class).getBatchCubingOutputSide(seg);

public static <T> T createEngineAdapter(IStorageAware aware, Class<T> engineInterface) {
    return storage(aware).adaptToBuildEngine(engineInterface);
}

// HBaseStorage returns an HBaseMROutput2Transition
public <I> I adaptToBuildEngine(Class<I> engineInterface) {
    if (engineInterface == IMROutput2.class) {
        return (I) new HBaseMROutput2Transition();
    } else {
        throw new RuntimeException("Cannot adapt to " + engineInterface);
    }
}
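- Once adaptation succeeds, new BatchCubingJobBuilder2(newSegment, submitter).build() creates the concrete execution steps and assembles them into a workflow.
- The workflow is added to the execution manager, where it waits to be scheduled: getExecutableManager().addJob(job);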
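The last two steps, assembling a chain of steps and handing it to the execution manager, can be illustrated as follows. This is a simplified sketch under stated assumptions: ChainedJobSketch, JobBuilderSketch, and JobQueueSketch are hypothetical stand-ins for DefaultChainedExecutable, BatchCubingJobBuilder2, and the execution manager, and the step names are placeholders rather than the steps Kylin actually creates.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for DefaultChainedExecutable: an ordered list of named steps.
class ChainedJobSketch {
    private final List<String> steps = new ArrayList<>();

    void addTask(String stepName) { steps.add(stepName); }

    List<String> getSteps() { return Collections.unmodifiableList(steps); }
}

// Hypothetical stand-in for BatchCubingJobBuilder2.
class JobBuilderSketch {
    private final String segmentName;

    JobBuilderSketch(String segmentName) { this.segmentName = segmentName; }

    // Steps are appended in execution order: input side, engine, output side.
    ChainedJobSketch build() {
        ChainedJobSketch job = new ChainedJobSketch();
        job.addTask("input: prepare flat table for " + segmentName);
        job.addTask("engine: compute cuboids for " + segmentName);
        job.addTask("output: load results into storage for " + segmentName);
        return job;
    }
}

// Hypothetical stand-in for the execution manager: submitted jobs wait in FIFO order.
class JobQueueSketch {
    private final Queue<ChainedJobSketch> pending = new ArrayDeque<>();

    void addJob(ChainedJobSketch job) { pending.add(job); }

    // Returns the next waiting job, or null if none is pending.
    ChainedJobSketch nextJob() { return pending.poll(); }

    int pendingCount() { return pending.size(); }
}
```

Separating the builder from the queue mirrors the split described above: build() only decides which steps exist and in what order, while the queue decides when a submitted job actually runs.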