Handling Cold Data in HBase: Converting Multiple Rows to Columns

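For background and an introduction, see: Supply Chain Hot and Cold Data Handling in Practice (供应链冷热数据处理实践)
Cold data read/write scheme
The scheme covers both reads and writes, and writing comes first. PG stores data by row while HBase stores it by column, so the data sync has to pick a structure that also makes reads convenient. In addition, one master order maps to multiple detail records, and queries need to fetch the related details by order number. After repeated experiments we settled on the workable scheme below.
Data conversion
When converting rows to columns, one PG row becomes multiple HBase entries (cells). Because queries only need to support order_no as the condition, order_no is used directly as the rowkey. Because a query has to turn each detail record back into an object, two dimensions must be matched: which cells belong to the same detail record, and which attribute of that record each cell holds. Our approach is to use the PG column names as the HBase column families and the unique identifier of each PG detail record as the column qualifier, for example the asset code. Let's get started.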

  • Create the table
create 't_operate_order_inventory_list', {NAME => 'inventory_sn', VERSIONS => 1}, {NAME => 'factory_sn', VERSIONS => 1}, {NAME => 'package_sn', VERSIONS => 1}, {NAME => 'inventory_spu', VERSIONS => 1}, {NAME => 'inventory_sku', VERSIONS => 1}, {NAME => 'inventory_status', VERSIONS => 1}, {NAME => 'is_deleted', VERSIONS => 1}, {NAME => 'inventory_ext', VERSIONS => 1}
  • Write data
put 't_operate_order_inventory_list', 'sn_20191106509325024585940992', 'inventory_sn:8600021356', '8600021356'
put 't_operate_order_inventory_list', 'sn_20191106509325024585940992', 'factory_sn:8600021356', 'LS36071S25W4355'
put 't_operate_order_inventory_list', 'sn_20191106509325024585940992', 'package_sn:8600021356', 'ZA00000097'
put 't_operate_order_inventory_list', 'sn_20191106509325024585940992', 'inventory_spu:8600021356', 'BAT1011'
put 't_operate_order_inventory_list', 'sn_20191106509325024585940992', 'inventory_sku:8600021356', 'BAT101110'
put 't_operate_order_inventory_list', 'sn_20191106509325024585940992', 'inventory_status:8600021356', '0'
put 't_operate_order_inventory_list', 'sn_20191106509325024585940992', 'is_deleted:8600021356', '0'
put 't_operate_order_inventory_list', 'sn_20191106509325024585940992', 'inventory_ext:8600021356', ''
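
The mapping behind the put statements above can be sketched in plain Java. This is only an illustration, not the actual sync job: the class RowToCellSketch, the type CellSpec and the method toCells are hypothetical names, and the sketch deliberately avoids the HBase client so that it runs on its own. It assumes the PG row is available as a map from column name to value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: models the PG-row to HBase-cell mapping without the HBase client. */
class RowToCellSketch {

    /** One HBase cell: rowkey, column family, qualifier and value (hypothetical helper type). */
    static final class CellSpec {
        final String rowKey;
        final String family;
        final String qualifier;
        final String value;

        CellSpec(String rowKey, String family, String qualifier, String value) {
            this.rowKey = rowKey;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }

        /** Renders the cell as an HBase shell put statement. */
        String toShellPut(String table) {
            return String.format("put '%s', '%s', '%s:%s', '%s'", table, rowKey, family, qualifier, value);
        }
    }

    /**
     * Converts one PG detail row into one cell per column:
     * rowkey = order number, family = PG column name, qualifier = the detail's unique id.
     */
    static List<CellSpec> toCells(String orderNo, String detailId, Map<String, String> pgRow) {
        List<CellSpec> cells = new ArrayList<>();
        for (Map.Entry<String, String> column : pgRow.entrySet()) {
            // A NULL column is written as an empty string, like inventory_ext in the example
            String value = column.getValue() == null ? "" : column.getValue();
            cells.add(new CellSpec(orderNo, column.getKey(), detailId, value));
        }
        return cells;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("inventory_sn", "8600021356");
        row.put("factory_sn", "LS36071S25W4355");
        row.put("package_sn", "ZA00000097");
        // The asset code doubles as the qualifier that ties the cells of one detail together
        for (CellSpec cell : toCells("sn_20191106509325024585940992", row.get("inventory_sn"), row)) {
            System.out.println(cell.toShellPut("t_operate_order_inventory_list"));
        }
    }
}
```

Every detail of the same order lands in the same HBase row, and the qualifier is what keeps the cells of different details apart.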

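On read, fetch the detail data by order_no, then assemble the detail list from the column families and qualifiers. The implementation is as follows: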

public List<OperateOrderInventoryVO> queryInventory(InventoryQueryReq req) {
    String orderNo = req.getOrderNo();
    if (StringUtils.isEmpty(orderNo)) {
        throw new ServiceRuntimeException(Protos.createBadRequest("orderNo"));
    }

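    // Date check: the order date (yyyyMMdd) is embedded in the order number at indexes 3 to 10
    Date orderDate = DateUtil.formatDate(orderNo.substring(3, 11), "yyyyMMdd");
    if (DateUtil.calIntervalDay(System.currentTimeMillis(), orderDate.getTime()) > CommonConstant.DEFAULT_DATA_TIME_OPERATE_ORDER_INVENTORY_LIST) {
        // Cold data: a single Get by rowkey (the order number) returns all cells of the order
        Get get = new Get(orderNo.getBytes());
        Result result = table.get(get);
        if (result.isEmpty()) {
            // No such row; rawCells() may be null in this case
            return new ArrayList<>();
        }
        Cell[] cellList = result.rawCells();
        // Group cells by qualifier (the detail's unique id); each group becomes one VO
        Map<String, OperateOrderInventoryVO> inventoryListMap = new HashMap<>();
        for (Cell cell : cellList) {
            String id = new String(CellUtil.cloneQualifier(cell));
            // Reuse the VO already created for this detail, or create it on first sight
            OperateOrderInventoryVO list = inventoryListMap.computeIfAbsent(id, k -> new OperateOrderInventoryVO());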
            switch (new String(CellUtil.cloneFamily(cell))) {
                case "factory_sn":
                    list.setFactorySn(new String(CellUtil.cloneValue(cell)));
                    break;
                case "inventory_sn":
                    list.setInventorySn(new String(CellUtil.cloneValue(cell)));
                    break;
                case "inventory_spu":
                    list.setInventorySpu(new String(CellUtil.cloneValue(cell)));
                    break;
                case "inventory_sku":
                    list.setInventorySku(new String(CellUtil.cloneValue(cell)));
                    break;
                case "package_sn":
                    list.setPackageSn(new String(CellUtil.cloneValue(cell)));
                    break;
                case "inventory_status":
                    list.setStatus(Integer.parseInt(new String(CellUtil.cloneValue(cell))));
                    break;
                case "is_deleted":
                    list.setIsDeleted(Integer.parseInt(new String(CellUtil.cloneValue(cell))));
                    break;
                case "inventory_ext":
                    list.setExt(new String(CellUtil.cloneValue(cell)));
                    break;
            }
            inventoryListMap.put(id, list);
        }
        // filter
        List<OperateOrderInventoryVO> data = new ArrayList<>(inventoryListMap.values());
        if (!CollectionUtils.isEmpty(data)) {
            if (!StringUtils.isEmpty(req.getInventorySpu())) {
                data = data.stream().filter(i -> req.getInventorySpu().equals(i.getInventorySpu())).collect(Collectors.toList());
            }
            if (!StringUtils.isEmpty(req.getInventorySku())) {
                data = data.stream().filter(i -> req.getInventorySku().equals(i.getInventorySku())).collect(Collectors.toList());
            }
            if (!StringUtils.isEmpty(req.getPackageSn())) {
                data = data.stream().filter(i -> req.getPackageSn().equals(i.getPackageSn())).collect(Collectors.toList());
            }
        }
        return data;
    }

    // get from PG
    OperateOrderInventoryList query = new OperateOrderInventoryList();
    query.setRefSn(orderNo);
    query.setIsDeleted(DeleteEnum.EXISTED.value());
    query.setInventorySpu(req.getInventorySpu());
    query.setInventorySku(req.getInventorySku());
    query.setPackageSn(req.getPackageSn());
    List<OperateOrderInventoryList> inventoryList = inventoryListMapper.query(query);
    List<OperateOrderInventoryVO> inventoryVOList = new ArrayList<>();
    if (!CollectionUtils.isEmpty(inventoryList)) {
        inventoryList.forEach(i -> {
            OperateOrderInventoryVO v = new OperateOrderInventoryVO();
            BeanUtils.copyProperties(i, v);
            v.setOrderNo(i.getRefSn());
            inventoryVOList.add(v);
        });
    }
    return inventoryVOList;
}
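
The hot/cold routing decision at the top of queryInventory can be isolated and tested on its own. The following is a minimal sketch using java.time rather than the project's DateUtil, whose exact semantics are not shown in this post; the class ColdDataRouter, the method isCold and the hotDays parameter are hypothetical, and the real threshold lives in CommonConstant.DEFAULT_DATA_TIME_OPERATE_ORDER_INVENTORY_LIST.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

/** Illustrative only: decides whether an order should be read from HBase (cold) or PG (hot). */
class ColdDataRouter {

    // BASIC_ISO_DATE parses dates written as yyyyMMdd, e.g. 20191106
    private static final DateTimeFormatter ORDER_DATE = DateTimeFormatter.BASIC_ISO_DATE;

    /**
     * Returns true when the order is older than hotDays days.
     * Assumes the order number looks like "sn_" followed by yyyyMMdd and a sequence number.
     */
    static boolean isCold(String orderNo, LocalDate today, long hotDays) {
        if (orderNo == null || orderNo.length() < 11) {
            throw new IllegalArgumentException("order number too short to contain a date: " + orderNo);
        }
        LocalDate orderDate = LocalDate.parse(orderNo.substring(3, 11), ORDER_DATE);
        return ChronoUnit.DAYS.between(orderDate, today) > hotDays;
    }

    public static void main(String[] args) {
        String orderNo = "sn_20191106509325024585940992";
        // hotDays = 30 is an arbitrary value chosen for the example
        System.out.println(isCold(orderNo, LocalDate.now(), 30) ? "read from HBase" : "read from PG");
    }
}
```

Passing today in as a parameter keeps the check deterministic in unit tests.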
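  • Note
    This scheme assumes that cold data is accessed infrequently and that a single get returns a small amount of data. To make better use of HBase, we use order_no as the rowkey so that a lookup hits the data directly. However, all detail data is processed in memory, so an order with many details consumes a lot of memory. Under heavy traffic, or when a single get returns a lot of data, an OOM may occur, so put appropriate safeguards in place (for example, rate limiting).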
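
One simple form of the safeguard mentioned above is to cap how many cold reads run at the same time, so that only a bounded number of result sets sit in memory at once. The sketch below uses a java.util.concurrent.Semaphore; the class ColdReadLimiter and its parameters are hypothetical and not part of the original code, and a production setup might prefer a dedicated rate-limiting component instead.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Illustrative only: bounds the number of concurrent cold (HBase) reads. */
class ColdReadLimiter {

    private final Semaphore permits;
    private final long waitMillis;

    ColdReadLimiter(int maxConcurrentReads, long waitMillis) {
        this.permits = new Semaphore(maxConcurrentReads);
        this.waitMillis = waitMillis;
    }

    /** Runs the read if a permit is free within waitMillis, otherwise rejects the call. */
    <T> T call(Supplier<T> coldRead) {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a cold-read permit", e);
        }
        if (!acquired) {
            throw new IllegalStateException("too many concurrent cold reads, try again later");
        }
        try {
            return coldRead.get();
        } finally {
            // Always hand the permit back, even when the read fails
            permits.release();
        }
    }

    public static void main(String[] args) {
        // At most 4 cold reads at a time, waiting up to 200 ms for a free slot (example values)
        ColdReadLimiter limiter = new ColdReadLimiter(4, 200);
        String result = limiter.call(() -> "details loaded from HBase");
        System.out.println(result);
    }
}
```

In queryInventory, the HBase branch would be the part wrapped in limiter.call(...), while the PG branch stays unrestricted.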