Cold Data Processing in HBase: Pivoting Multiple Rows into Columns

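For background and an introduction, see: 供應鏈冷熱數據處理實踐 (Hot and Cold Data Processing in Supply Chain Practice)
Cold data read/write scheme
The scheme has two parts, writing and reading, and writing comes first. PG stores data by row while HBase is column-oriented, so the synchronization also has to settle on a data structure that is convenient to read back. In addition, one master order corresponds to multiple detail records, and a query needs to fetch the detail records for a given order number. After repeated trials, we arrived at the following workable scheme.
Data conversion
When pivoting rows to columns, one PG row becomes multiple HBase cells. Because queries only need to support order_no as the condition, order_no is used directly as the rowkey. Because a query has to turn each detail record back into an object, two dimensions of matching are involved: which of the many cells belong to the same detail record, and which attribute of that record a given cell holds. Our approach is to use each PG column as an HBase column family, and the unique identifier of the PG detail record as the column qualifier (the column's "alias"), for example the asset code. Let's get started.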

  • Create the table
create 't_operate_order_inventory_list', {NAME => 'inventory_sn', VERSIONS => 1}, {NAME => 'factory_sn', VERSIONS => 1}, {NAME => 'package_sn', VERSIONS => 1}, {NAME => 'inventory_spu', VERSIONS => 1}, {NAME => 'inventory_sku', VERSIONS => 1}, {NAME => 'inventory_status', VERSIONS => 1}, {NAME => 'is_deleted', VERSIONS => 1}, {NAME => 'inventory_ext', VERSIONS => 1}
  • Write data
put 't_operate_order_inventory_list', 'sn_20191106509325024585940992', 'inventory_sn:8600021356', '8600021356'
put 't_operate_order_inventory_list', 'sn_20191106509325024585940992', 'factory_sn:8600021356', 'LS36071S25W4355'
put 't_operate_order_inventory_list', 'sn_20191106509325024585940992', 'package_sn:8600021356', 'ZA00000097'
put 't_operate_order_inventory_list', 'sn_20191106509325024585940992', 'inventory_spu:8600021356', 'BAT1011'
put 't_operate_order_inventory_list', 'sn_20191106509325024585940992', 'inventory_sku:8600021356', 'BAT101110'
put 't_operate_order_inventory_list', 'sn_20191106509325024585940992', 'inventory_status:8600021356', '0'
put 't_operate_order_inventory_list', 'sn_20191106509325024585940992', 'is_deleted:8600021356', '0'
put 't_operate_order_inventory_list', 'sn_20191106509325024585940992', 'inventory_ext:8600021356', ''
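
The mapping behind these put statements can be sketched in plain Java. This is only an illustration under assumptions: the class PivotSketch and the method toPuts are hypothetical names, not part of the original project, and the sketch prints HBase shell statements instead of calling the HBase client. It takes one PG detail row as a map of column name to value and emits one put per column, with the PG column as the column family and the detail's unique identifier as the qualifier.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not from the original project): turns one PG detail row
// into HBase shell `put` statements shaped like the ones above.
class PivotSketch {

    // One put per PG column:
    //   rowkey    = order number
    //   family    = PG column name
    //   qualifier = unique identifier of the detail record (e.g. the asset code)
    //   value     = the column's value (null is written as an empty string)
    // Note: values are not escaped, so this assumes they contain no single quotes.
    static List<String> toPuts(String table, String orderNo, String detailId,
                               Map<String, String> pgRow) {
        List<String> puts = new ArrayList<>();
        for (Map.Entry<String, String> column : pgRow.entrySet()) {
            String value = column.getValue() == null ? "" : column.getValue();
            puts.add(String.format("put '%s', '%s', '%s:%s', '%s'",
                    table, orderNo, column.getKey(), detailId, value));
        }
        return puts;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the PG column order in the output.
        Map<String, String> pgRow = new LinkedHashMap<>();
        pgRow.put("inventory_sn", "8600021356");
        pgRow.put("factory_sn", "LS36071S25W4355");
        pgRow.put("inventory_status", "0");
        toPuts("t_operate_order_inventory_list", "sn_20191106509325024585940992",
                "8600021356", pgRow).forEach(System.out::println);
    }
}
```

A second detail record of the same order would reuse the same rowkey and column families and differ only in the qualifier, which is what lets a single get return every detail of the order.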

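On read, the detail data is fetched by order_no, and the detail list is then assembled from the column families and column qualifiers. The implementation is as follows: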

public List<OperateOrderInventoryVO> queryInventory(InventoryQueryReq req) {
    String orderNo = req.getOrderNo();
    if (StringUtils.isEmpty(orderNo)) {
        throw new ServiceRuntimeException(Protos.createBadRequest("orderNo"));
    }

    // date check
    Date orderDate = DateUtil.formatDate(orderNo.substring(3, 11), "yyyyMMdd");
    if (DateUtil.calIntervalDay(System.currentTimeMillis(), orderDate.getTime()) > CommonConstant.DEFAULT_DATA_TIME_OPERATE_ORDER_INVENTORY_LIST) {
        // get from HBASE
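        // the rowkey is the order number
        Get get = new Get(Bytes.toBytes(orderNo));
        Result result = table.get(get);
        // rawCells() can be null when the row does not exist, so guard against an empty result
        Cell[] cellList = result.isEmpty() ? new Cell[0] : result.rawCells();
        // one VO per detail record, keyed by the column qualifier (the detail's unique identifier)
        Map<String, OperateOrderInventoryVO> inventoryListMap = new HashMap<>();
        for (Cell cell : cellList) {
            String id = Bytes.toString(CellUtil.cloneQualifier(cell));
            OperateOrderInventoryVO list = inventoryListMap.get(id);
            if (list == null) {
                list = new OperateOrderInventoryVO();
            }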
            switch (new String(CellUtil.cloneFamily(cell))) {
                case "factory_sn":
                    list.setFactorySn(new String(CellUtil.cloneValue(cell)));
                    break;
                case "inventory_sn":
                    list.setInventorySn(new String(CellUtil.cloneValue(cell)));
                    break;
                case "inventory_spu":
                    list.setInventorySpu(new String(CellUtil.cloneValue(cell)));
                    break;
                case "inventory_sku":
                    list.setInventorySku(new String(CellUtil.cloneValue(cell)));
                    break;
                case "package_sn":
                    list.setPackageSn(new String(CellUtil.cloneValue(cell)));
                    break;
                case "inventory_status":
                    list.setStatus(Integer.parseInt(new String(CellUtil.cloneValue(cell))));
                    break;
                case "is_deleted":
                    list.setIsDeleted(Integer.parseInt(new String(CellUtil.cloneValue(cell))));
                    break;
                case "inventory_ext":
                    list.setExt(new String(CellUtil.cloneValue(cell)));
                    break;
            }
            inventoryListMap.put(id, list);
        }
        // filter
        List<OperateOrderInventoryVO> data = new ArrayList<>(inventoryListMap.values());
        if (!CollectionUtils.isEmpty(data)) {
            if (!StringUtils.isEmpty(req.getInventorySpu())) {
                data = data.stream().filter(i -> req.getInventorySpu().equals(i.getInventorySpu())).collect(Collectors.toList());
            }
            if (!StringUtils.isEmpty(req.getInventorySku())) {
                data = data.stream().filter(i -> req.getInventorySku().equals(i.getInventorySku())).collect(Collectors.toList());
            }
            if (!StringUtils.isEmpty(req.getPackageSn())) {
                data = data.stream().filter(i -> req.getPackageSn().equals(i.getPackageSn())).collect(Collectors.toList());
            }
        }
        return data;
    }

    // get from PG
    OperateOrderInventoryList query = new OperateOrderInventoryList();
    query.setRefSn(orderNo);
    query.setIsDeleted(DeleteEnum.EXISTED.value());
    query.setInventorySpu(req.getInventorySpu());
    query.setInventorySku(req.getInventorySku());
    query.setPackageSn(req.getPackageSn());
    List<OperateOrderInventoryList> inventoryList = inventoryListMapper.query(query);
    List<OperateOrderInventoryVO> inventoryVOList = new ArrayList<>();
    if (!CollectionUtils.isEmpty(inventoryList)) {
        inventoryList.forEach(i -> {
            OperateOrderInventoryVO v = new OperateOrderInventoryVO();
            BeanUtils.copyProperties(i, v);
            v.setOrderNo(i.getRefSn());
            inventoryVOList.add(v);
        });
    }
    return inventoryVOList;
}
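
The regrouping step in the HBase branch above can be isolated as a small sketch. It is a simplification under assumptions: SimpleCell is a hypothetical stand-in for an HBase Cell, and each detail record is represented as a map of attribute to value rather than as OperateOrderInventoryVO. The qualifier decides which detail record a cell belongs to, and the column family decides which attribute it fills.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch (not from the original project) of regrouping the cells
// of one HBase row into detail records.
class RegroupSketch {

    // Simplified stand-in for an HBase cell: family, qualifier and value as strings.
    static final class SimpleCell {
        final String family;
        final String qualifier;
        final String value;

        SimpleCell(String family, String qualifier, String value) {
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    // Outer key: qualifier (unique identifier of the detail record).
    // Inner key: column family (the attribute name, i.e. the PG column).
    static Map<String, Map<String, String>> regroup(List<SimpleCell> cells) {
        Map<String, Map<String, String>> details = new LinkedHashMap<>();
        for (SimpleCell cell : cells) {
            details.computeIfAbsent(cell.qualifier, id -> new LinkedHashMap<>())
                   .put(cell.family, cell.value);
        }
        return details;
    }

    public static void main(String[] args) {
        // Two detail records of the same order, interleaved by column family.
        List<SimpleCell> cells = Arrays.asList(
                new SimpleCell("factory_sn", "8600021356", "LS36071S25W4355"),
                new SimpleCell("factory_sn", "8600021357", "LS36071S25W4356"),
                new SimpleCell("inventory_spu", "8600021356", "BAT1011"),
                new SimpleCell("inventory_spu", "8600021357", "BAT1011"));
        regroup(cells).forEach((id, attributes) -> System.out.println(id + " -> " + attributes));
    }
}
```

The second detail record (8600021357) is made-up sample data for illustration only.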
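  • Note
    This scheme assumes that cold data is accessed infrequently or that a single get returns a small amount of data. To make better use of HBase, order_no is used as the rowkey so that a get hits the data directly. However, when an order has many detail records, all of them are processed in memory, which is memory-intensive. Under heavy traffic, or when a single get returns a lot of data, an OOM may occur, so put appropriate safeguards in place (for example, rate limiting).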
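
One possible safeguard is to cap how many cold-data reads run at the same time. The sketch below is an assumption, not the original project's solution: ColdReadLimiter is a hypothetical class, and it limits concurrency with a java.util.concurrent.Semaphore rather than limiting requests per second. A read that cannot get a permit fails fast instead of queueing, so memory use stays bounded by the number of permits.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical guard (not from the original project): bounds the number of
// cold-data reads that may assemble detail lists in memory concurrently.
class ColdReadLimiter {

    private final Semaphore permits;

    ColdReadLimiter(int maxConcurrentReads) {
        this.permits = new Semaphore(maxConcurrentReads);
    }

    // Runs the read if a permit is free; otherwise rejects immediately.
    <T> T read(Supplier<T> coldRead) {
        if (!permits.tryAcquire()) {
            throw new IllegalStateException("too many concurrent cold-data reads");
        }
        try {
            return coldRead.get();
        } finally {
            // Always give the permit back, even if the read throws.
            permits.release();
        }
    }

    public static void main(String[] args) {
        ColdReadLimiter limiter = new ColdReadLimiter(2);
        // The supplier stands in for the HBase branch of queryInventory.
        String result = limiter.read(() -> "details of sn_20191106509325024585940992");
        System.out.println(result);
    }
}
```

The caller decides how to handle a rejection, for example by returning a "busy, retry later" response to the client.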