Flink in Practice (11): Applying Flink Side Output in a Risk Alerting Scenario

Risk Alerting Scenario

Background

In a risk alerting scenario, when an alert message needs to reach a user, it is typically delivered through different channels depending on the alert level.

Alert level     Channels
Major risk      SMS, DingTalk
General risk    SMS, in-app message
Prompt risk     In-app message
Normal          -
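The table above is just a lookup from alert level to delivery channels. A minimal sketch in plain Java (the class name, level codes, and channel names here are illustrative, chosen to match the table):

```java
import java.util.List;
import java.util.Map;

public class AlertRouting {
    // alert level -> delivery channels, mirroring the table above
    static final Map<String, List<String>> CHANNELS = Map.of(
            "3", List.of("sms", "ding"),       // major risk
            "2", List.of("sms", "inner-msg"),  // general risk
            "1", List.of("inner-msg"),         // prompt risk
            "0", List.of()                     // normal: no notification
    );

    /** Channels an alert of the given level should be sent through. */
    public static List<String> channelsFor(String level) {
        return CHANNELS.getOrDefault(level, List.of());
    }

    public static void main(String[] args) {
        System.out.println(channelsFor("3")); // [sms, ding]
    }
}
```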

A common approach is to split the alert stream first and write each branch to a separate Kafka topic or database table, where a downstream sender program picks it up.

Send priority and message templates are out of scope here.

The overall flow is illustrated below.

[Figure: image-20200606114117403]

With Flink, splitting a stream like this is done with a feature called Side Output.

  • In addition to the main stream produced by a DataStream operation, you can emit any number of additional side output streams.

  • The data type in a side output does not have to match the main stream's type, and different side outputs can have different types.

  • This is useful when you would otherwise split a stream by replicating it and filtering the unwanted records out of each copy.
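Outside of Flink, the idea behind those three points is easy to picture in plain Java: a single pass over the input fills one main output plus any number of tagged side outputs (possibly of different types), instead of filtering N copies of the stream. All names below are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

public class SideOutputSketch {
    static final List<Integer> MAIN = new ArrayList<>();
    static final List<String> EVEN_SIDE = new ArrayList<>();
    static final List<String> BIG_SIDE = new ArrayList<>();

    static void process(List<Integer> input) {
        // one pass routes each element to the main output and any matching side outputs
        for (int v : input) {
            MAIN.add(v);                                 // regular output: every record
            if (v % 2 == 0) EVEN_SIDE.add("even-" + v);  // side output with a different type (String)
            if (v > 4) BIG_SIDE.add("big-" + v);         // a second, independent side output
        }
    }

    public static void main(String[] args) {
        process(List.of(1, 2, 3, 4, 5, 6));
        System.out.println(MAIN);      // [1, 2, 3, 4, 5, 6]
        System.out.println(EVEN_SIDE); // [even-2, even-4, even-6]
        System.out.println(BIG_SIDE);  // [big-5, big-6]
    }
}
```

Note that one element can land in several side outputs, and the main output keeps every record, which is exactly how the Flink example below behaves.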

A Single Side Output Example

Inside a ProcessFunction, you can use the Context to send data to the side output identified by an OutputTag:

ctx.output(outputTag, "sideout-" + String.valueOf(value));

Here is a complete example. Note that the OutputTag is created as an anonymous subclass (the trailing {}) so that Flink can capture its generic type parameter:

DataStream<Integer> input = ...;

final OutputTag<String> outputTag = new OutputTag<String>("side-output"){};

SingleOutputStreamOperator<Integer> mainDataStream = input
  .process(new ProcessFunction<Integer, Integer>() {

      @Override
      public void processElement(
          Integer value,
          Context ctx,
          Collector<Integer> out) throws Exception {
        // emit data to regular output
        out.collect(value);

        // emit data to side output
        ctx.output(outputTag, "sideout-" + String.valueOf(value));
      }
    });

To retrieve the side output stream, call getSideOutput(OutputTag) on the result of the DataStream operation. This returns a DataStream typed to the side output stream's elements:

final OutputTag<String> outputTag = new OutputTag<String>("side-output"){};

SingleOutputStreamOperator<Integer> mainDataStream = ...;

DataStream<String> sideOutputStream = mainDataStream.getSideOutput(outputTag);

Implementation


public class WarningSender {

    public static final String HIGH_RISK_LEVEL = "3";
    public static final String GENERAL_RISK_LEVEL = "2";
    public static final String PROMPT_RISK_LEVEL = "1";
    public static final String NO_RISK_LEVEL = "0";

    public static void main(String[] args) throws Exception {

        StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();

        // major-risk stream
        final OutputTag<WarningResult> highRiskTag = new OutputTag<WarningResult>("high-risk") {};
        // general-risk stream
        final OutputTag<WarningResult> generaRiskTag = new OutputTag<WarningResult>("general-risk") {};
        // prompt-risk stream
        final OutputTag<WarningResult> promptRiskTag = new OutputTag<WarningResult>("prompt-risk") {};

        // delivery-channel tags
        final OutputTag<WarningResult> smsChannel = new OutputTag<WarningResult>("sms-channel") {};
        final OutputTag<WarningResult> dingChannel = new OutputTag<WarningResult>("ding-channel") {};
        final OutputTag<WarningResult> innerMsgChannel = new OutputTag<WarningResult>("inner-msg-channel") {};

        // Producer
        FlinkKafkaProducer011<String> smsProducer =
                new FlinkKafkaProducer011<String>("topic_sender_sms", new SimpleStringSchema(), KafkaSourceUtils.getKafkaSourceProp());
        FlinkKafkaProducer011<String> dingProducer =
                new FlinkKafkaProducer011<String>("topic_sender_ding", new SimpleStringSchema(), KafkaSourceUtils.getKafkaSourceProp());
        FlinkKafkaProducer011<String> innerMsgProducer =
                new FlinkKafkaProducer011<String>("topic_sender_inner_msg", new SimpleStringSchema(), KafkaSourceUtils.getKafkaSourceProp());


        DataStream<String> warningResultDataStream = env.addSource(new FlinkKafkaConsumer011<>("warning_result", new SimpleStringSchema(), KafkaSourceUtils.getKafkaSourceProp()));
        SingleOutputStreamOperator<WarningResult> mainDataStream = warningResultDataStream
                .map(v -> JSON.parseObject(v, WarningResult.class))
                .keyBy((KeySelector<WarningResult, String>) WarningResult::getMainId)
                .process(new KeyedProcessFunction<String, WarningResult, WarningResult>() {
                    @Override
                    public void processElement(WarningResult value, Context ctx, Collector<WarningResult> out) throws Exception {
                        // emit data to regular output
                        out.collect(value);

                        // emit data to side output
                        if (HIGH_RISK_LEVEL.equals(value.getLevel())) {
                            ctx.output(highRiskTag, value);
                        } else if (GENERAL_RISK_LEVEL.equals(value.getLevel())) {
                            ctx.output(generaRiskTag, value);
                        } else if (PROMPT_RISK_LEVEL.equals(value.getLevel())) {
                            ctx.output(promptRiskTag, value);
                        }
                    }
                });

        // major-risk handling: deliver via SMS and DingTalk
        SingleOutputStreamOperator<WarningResult> highRiskTagStream = mainDataStream
                .getSideOutput(highRiskTag)
                .keyBy((KeySelector<WarningResult, String>) WarningResult::getMainId)
                .process(new KeyedProcessFunction<String, WarningResult, WarningResult>() {
                    @Override
                    public void processElement(WarningResult value, Context ctx, Collector<WarningResult> out) throws Exception {
                        out.collect(value);
                        // a type conversion to a channel-specific message could be done here
                        ctx.output(smsChannel, value);
                        ctx.output(dingChannel, value);
                    }
                });
        highRiskTagStream.getSideOutput(smsChannel).map(JSON::toJSONString).addSink(smsProducer);
        highRiskTagStream.getSideOutput(dingChannel).map(JSON::toJSONString).addSink(dingProducer);

        // general-risk handling: deliver via SMS and in-app message (per the table above)
        SingleOutputStreamOperator<WarningResult> generalRiskTagStream = mainDataStream
                .getSideOutput(generaRiskTag)
                .keyBy((KeySelector<WarningResult, String>) WarningResult::getMainId)
                .process(new KeyedProcessFunction<String, WarningResult, WarningResult>() {
                    @Override
                    public void processElement(WarningResult value, Context ctx, Collector<WarningResult> out) throws Exception {
                        out.collect(value);
                        ctx.output(smsChannel, value);
                        ctx.output(innerMsgChannel, value);
                    }
                });
        generalRiskTagStream.getSideOutput(smsChannel).map(JSON::toJSONString).addSink(smsProducer);
        generalRiskTagStream.getSideOutput(innerMsgChannel).map(JSON::toJSONString).addSink(innerMsgProducer);

        // prompt-risk handling: deliver via in-app message only
        SingleOutputStreamOperator<WarningResult> promptRiskTagStream = mainDataStream
                .getSideOutput(promptRiskTag)
                .keyBy((KeySelector<WarningResult, String>) WarningResult::getMainId)
                .process(new KeyedProcessFunction<String, WarningResult, WarningResult>() {
                    @Override
                    public void processElement(WarningResult value, Context ctx, Collector<WarningResult> out) throws Exception {
                        out.collect(value);
                        ctx.output(innerMsgChannel, value);
                    }
                });
        promptRiskTagStream.getSideOutput(innerMsgChannel).map(JSON::toJSONString).addSink(innerMsgProducer);


        System.out.println("<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<");
        System.out.println(env.getExecutionPlan());
        System.out.println("<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<");

        env.execute("job-warning-dispatcher");

    }


    /**
     * TODO: the per-level wiring above could be driven by configuration (e.g. maintained in a GUI):
     *
     * WarningConfig highConfig = new WarningConfig(HIGH_RISK_LEVEL, highRiskTag, Arrays.asList(smsChannel, dingChannel));
     * WarningConfig generalConfig = new WarningConfig(GENERAL_RISK_LEVEL, generaRiskTag, Arrays.asList(smsChannel, innerMsgChannel));
     * WarningConfig promptConfig = new WarningConfig(PROMPT_RISK_LEVEL, promptRiskTag, Arrays.asList(innerMsgChannel));
     */
    @AllArgsConstructor
    private static class WarningConfig {

        private String level;

        private OutputTag<WarningResult> outputTag;

        private List<OutputTag<WarningResult>> channels;

    }


    @Data
    public static class WarningResult {

        private String mainId;

        private String level;

        private String content;

        private Long ts;

    }

}
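Following the TODO in WarningConfig, the three hand-written branches above could be collapsed into a loop over the configs. A minimal, Flink-free sketch of the lookup this would be built on, with String channel names standing in for the OutputTags and the channel-to-topic mapping mirroring the producers in the job (everything here is illustrative, not part of the job code):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ConfigDrivenDispatch {
    // simplified stand-in for WarningConfig: one alert level and its channels
    record LevelConfig(String level, List<String> channels) {}

    static final List<LevelConfig> CONFIGS = List.of(
            new LevelConfig("3", List.of("sms", "ding")),
            new LevelConfig("2", List.of("sms", "inner-msg")),
            new LevelConfig("1", List.of("inner-msg")));

    // channel -> Kafka topic, mirroring the three producers above
    static final Map<String, String> TOPICS = Map.of(
            "sms", "topic_sender_sms",
            "ding", "topic_sender_ding",
            "inner-msg", "topic_sender_inner_msg");

    /** Topics an alert of the given level should be written to. */
    public static List<String> topicsFor(String level) {
        return CONFIGS.stream()
                .filter(c -> c.level().equals(level))
                .flatMap(c -> c.channels().stream())
                .map(TOPICS::get)
                .collect(Collectors.toList());
    }
}
```

In the Flink job this would mean a single loop over CONFIGS: one getSideOutput / process branch per level config, wiring each channel to its producer, so new levels or channels become a config change rather than a code change.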

Execution Plan

[Figure: image-20200606123739507]

Testing

Producer

 public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "flinkhadoop:9092");
        props.put("acks", "1");
        props.put("retries", 0);
        props.put("batch.size", 10);
        props.put("linger.ms", 10000);
        props.put("buffer.memory", 33554432);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        Producer<String, String> producer = new KafkaProducer<>(props);
        long start = System.currentTimeMillis();


        for (int i = 0; i < 100; i++) {

            WarningSender.WarningResult warningResult = new WarningSender.WarningResult();
            warningResult.setMainId("main" + i);
            warningResult.setContent("suspicious activity");
            warningResult.setLevel((i % 4) + "");
            warningResult.setTs(System.currentTimeMillis());

            String eventStr = JSON.toJSONString(warningResult);

            producer.send(new ProducerRecord<String, String>("warning_result", "wr" + i, eventStr), new Callback() {
                @Override
                public void onCompletion(RecordMetadata recordMetadata, Exception e) {
                    System.out.println(recordMetadata);
                    if (e != null) {
                        e.printStackTrace();
                    }
                }
            });
//            Thread.sleep(1000L);
        }
        long end = System.currentTimeMillis();
        System.out.println("send use time : [" + (end - start) + "]");
        Thread.sleep(40000L); // give async sends time to complete before closing
        producer.close();
    }

Consumer

public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        // Kafka broker address
        props.put("bootstrap.servers", "flinkhadoop:9092");
        // consumer group id
        props.put("group.id", "tes23t1");
        // commit offsets automatically on a timer
        props.put("enable.auto.commit", "true");
        props.put("auto.offset.reset", "earliest");
        // interval for the automatic commits above
        props.put("auto.commit.interval.ms", "1000");
        // session keep-alive: if no heartbeat arrives within this window, the session is considered dead
        props.put("session.timeout.ms", "30000");
        // key deserializer
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // value deserializer
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);

        consumer.subscribe(Arrays.asList("topic_sender_ding"));

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);
            //System.out.println("-----------------");
            records.forEach(record -> System.out.printf("partition = %d, offset = %d, key = %s, value = %s\n", record.partition(), record.offset(), record.key(), record.value()));
        }
    }